veyron: Properly avoid A12/A17 CPU errata in SkAnalyticEdge::setLine |
|||||||||||
Issue descriptionRecent addition of omit-frame-pointers to the default cflags for the build appeared to trigger this issue; the flag has temporarily been removed pending a toolchain fix. See issue 710131 for the detailed crasher bug and analysis.
,
Apr 14 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/bec44505ea17b6b8fbd366ab22ca724f3c448fe6 commit bec44505ea17b6b8fbd366ab22ca724f3c448fe6 Author: wez <wez@chromium.org> Date: Fri Apr 14 23:55:20 2017 Remove explicit -fomit_frame_pointer from ARM 32-bit builds. -fomit_frame_pointer causes the assembly generated from SkEdge::setLine to trigger a CPU errata in ARM A12/A17 devices, so we are temporarily removing the flag until the toolchain can be fixed to avoid that. Revert this when issue 711784 is resolved. BUG= 710131 , 706654 , 711784 Review-Url: https://codereview.chromium.org/2820803003 Cr-Commit-Position: refs/heads/master@{#464832} [modify] https://crrev.com/bec44505ea17b6b8fbd366ab22ca724f3c448fe6/build/config/compiler/BUILD.gn
,
Apr 17 2017
We have "fixed" the hangs we were seeing by reverting a compiler flags change. However: * We actually want that compiler flag change. We should save space/speed by avoiding frame pointers in Chrome and apparently folks think they provide no benefit. * The "fix" was really just dumb luck. Any other minor change in code (or any Android app) could trigger this errata again. To REALLY be sure that the problem is in SkAnalyticEdge::setLine, AKA <https://cs.chromium.org/chromium/src/third_party/skia/src/core/SkAnalyticEdge.h?type=cs&l=133>, I poked at that function a bit and could make the problem go away. See attached files where I show the disassembly of this function over several minor changes. Basically: 1. Changing Chrome to -fno-omit_frame_pointer "fixes" the problem. 2. Adding a "nop" or similar after the calculation of "x0" "fixes" the problem. 3. Adding "nop"s in other places doesn't fix the problem. --- I'll keep my system ready to reproduce this problem for when ARM has a chance to suggest some Chicken Bits to try. If there are no Chicken Bits to try and you want me to try some specific assembly changes, let me know. I can poke at the binary a bit with a hex editor, but it get a little tedious and error prone.
,
Apr 17 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3dcba7c1facb5bb49535476185c5a396669d52f0 commit 3dcba7c1facb5bb49535476185c5a396669d52f0 Author: Wez <wez@chromium.org> Date: Mon Apr 17 22:20:54 2017 Remove explicit -fomit_frame_pointer from ARM 32-bit builds. -fomit_frame_pointer causes the assembly generated from SkEdge::setLine to trigger a CPU errata in ARM A12/A17 devices, so we are temporarily removing the flag until the toolchain can be fixed to avoid that. Revert this when issue 711784 is resolved. BUG= 710131 , 706654 , 711784 Review-Url: https://codereview.chromium.org/2820803003 Cr-Commit-Position: refs/heads/master@{#464832} (cherry picked from commit bec44505ea17b6b8fbd366ab22ca724f3c448fe6) Review-Url: https://codereview.chromium.org/2819373002 . Cr-Commit-Position: refs/branch-heads/3071@{#26} Cr-Branched-From: a106f0abbf69dad349d4aaf4bcc4f5d376dd2377-refs/heads/master@{#464641} [modify] https://crrev.com/3dcba7c1facb5bb49535476185c5a396669d52f0/build/config/compiler/BUILD.gn
,
Apr 17 2017
,
Apr 19 2017
A few experiments based on an analysis of the code sequence by ARM. p15, 0, c15, c0, 1. Could try to set the following in addition to the one already set · bit 14 to 16, independently. These bit disable forwarding of Neon/Floating point results at different stages of the Issue of these instructions. · bit 27: this bit disables on renaming optimisation on back to back MOV optimisation. The following code sequence might trigger use of this optimisation: 0x01f6bd5e <+138>: mov.w r2, r2, lsl #10 0x01f6bd62 <+142>: mov.w r3, r2, asr #2 The feature register (p15, 0, c15, c0, 2) *shouldn't* have any effect seeing the code sequence. However, after the above, we could try setting bit 1, 2, 3, 5, 6 and 7 together.
,
Apr 19 2017
@6: I tried all those and none of them helped. :( With each of the below I could still reproduce the problem. --- bit 14: # for i in $(seq 0 3); do taskset -c $i cat /sys/kernel/debug/arm_coprocessor_debug; done CPU 0: diag register: (p15, 0, c15, c0, 1): 0x01005400 CPU 0: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 1: diag register: (p15, 0, c15, c0, 1): 0x01005400 CPU 1: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 2: diag register: (p15, 0, c15, c0, 1): 0x01005400 CPU 2: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 3: diag register: (p15, 0, c15, c0, 1): 0x01005400 CPU 3: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 No luck --- bit 15: CPU 0: diag register: (p15, 0, c15, c0, 1): 0x01009400 CPU 0: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 1: diag register: (p15, 0, c15, c0, 1): 0x01009400 CPU 1: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 2: diag register: (p15, 0, c15, c0, 1): 0x01009400 CPU 2: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 3: diag register: (p15, 0, c15, c0, 1): 0x01009400 CPU 3: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 No luck --- bit 16: CPU 0: diag register: (p15, 0, c15, c0, 1): 0x01011400 CPU 0: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 1: diag register: (p15, 0, c15, c0, 1): 0x01011400 CPU 1: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 2: diag register: (p15, 0, c15, c0, 1): 0x01011400 CPU 2: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 3: diag register: (p15, 0, c15, c0, 1): 0x01011400 CPU 3: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 --- bit 27: CPU 0: diag register: (p15, 0, c15, c0, 1): 0x09001400 CPU 0: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 1: diag register: (p15, 0, c15, c0, 1): 0x09001400 CPU 1: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 2: diag register: (p15, 0, c15, c0, 1): 0x09001400 CPU 2: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 CPU 3: diag register: (p15, 0, c15, c0, 1): 0x09001400 CPU 3: int feat reg: (p15, 0, c15, c0, 2): 0x00000002 No luck --- (p15, 0, c15, c0, 2), bit 1, 2, 3, 5, 6 and 7 together: CPU 0: diag register: (p15, 0, c15, c0, 1): 0x01001400 CPU 0: int feat reg: (p15, 0, c15, c0, 2): 0x000000ee CPU 1: diag register: (p15, 0, c15, c0, 1): 0x01001400 CPU 1: int feat reg: (p15, 0, c15, c0, 2): 0x000000ee CPU 2: diag register: (p15, 0, c15, c0, 1): 0x01001400 CPU 2: int feat reg: (p15, 0, c15, c0, 2): 0x000000ee CPU 3: diag register: (p15, 0, c15, c0, 1): 0x01001400 CPU 3: int feat reg: (p15, 0, c15, c0, 2): 0x000000ee No luck
,
Apr 19 2017
Discussing this offline with a few folks: - Currently ChromeOS builds w/ frame pointers, to support CWP. - Erik's patch switched off frame pointers in Chrome, so we broke CWP. - My patch only re-enables them for ARM/Thumb, so still leaves CWP mostly broken. Since we want CWP working, we should tweak the Chrome compiler.gni to accept an enable_frame_pointers argument, with the default setting based on the current enabled_frame_pointers logic, but have ChromeOS explicitly set that. dianders: This will mean we don't actually hit this errata, but I expect we still want to follow-up on it to make sure we don't hit issues in future? If so then I'll break the argument work out into Yet Another Bug.
,
Apr 19 2017
@wez: Let's try to keep this bug about dealing with the CPU errata. If we want to fix CWP can we make a new bug for that? ...as long as ARM/Thumb keeps frame pointers (for now) then we won't hit this bug. When we fix this bug then we can evaluate if we should turn frame pointers off for Thumb.
,
Apr 20 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/38f02400540606a7c418f9fa5148d3a07fd8b4fa commit 38f02400540606a7c418f9fa5148d3a07fd8b4fa Author: wez <wez@chromium.org> Date: Thu Apr 20 18:41:09 2017 Enable frame pointers explicitly under ARM32. This replaces the quick-fix in crrev.com/2820803003, to address issues with ARM32 builds when frame pointers are disabled. This CL explicitly enables frame pointers in ARM32 builds, and pulls out the ARM32 and ARM64 special-cases to be handling separately from Debug, profiling and sanitizer build configurations, for clarity. BUG= 710131 , 706654 , 711784 Review-Url: https://codereview.chromium.org/2829433003 Cr-Commit-Position: refs/heads/master@{#466080} [modify] https://crrev.com/38f02400540606a7c418f9fa5148d3a07fd8b4fa/build/config/compiler/BUILD.gn [modify] https://crrev.com/38f02400540606a7c418f9fa5148d3a07fd8b4fa/build/config/compiler/compiler.gni
,
Apr 20 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1e22c07d34fdb782f532155ae5fca2f048a5e2f3 commit 1e22c07d34fdb782f532155ae5fca2f048a5e2f3 Author: Wez <wez@chromium.org> Date: Thu Apr 20 20:46:57 2017 Enable frame pointers explicitly under ARM32. This replaces the quick-fix in crrev.com/2820803003, to address issues with ARM32 builds when frame pointers are disabled. This CL explicitly enables frame pointers in ARM32 builds, and pulls out the ARM32 and ARM64 special-cases to be handling separately from Debug, profiling and sanitizer build configurations, for clarity. BUG= 710131 , 706654 , 711784 Review-Url: https://codereview.chromium.org/2829433003 Cr-Commit-Position: refs/heads/master@{#466080} (cherry picked from commit 38f02400540606a7c418f9fa5148d3a07fd8b4fa) Review-Url: https://codereview.chromium.org/2831733005 . Cr-Commit-Position: refs/branch-heads/3071@{#98} Cr-Branched-From: a106f0abbf69dad349d4aaf4bcc4f5d376dd2377-refs/heads/master@{#464641} [modify] https://crrev.com/1e22c07d34fdb782f532155ae5fca2f048a5e2f3/build/config/compiler/BUILD.gn [modify] https://crrev.com/1e22c07d34fdb782f532155ae5fca2f048a5e2f3/build/config/compiler/compiler.gni
,
Apr 21 2017
FYI that it appears that we've regressed again, even with the frame pointers. Digging happening in bug #710131
,
Apr 24 2017
Here's a whole bunch of tests that I did today. A few notes: 1. I put 11 nops in in SkAnalyticEdge::setLine() at a certain place and problem still reproduced. 2. I then did some object code patching (see a, b, and c). Some of the instruction swaps I did "fixed" the problem. 3. I tried replacing one of the nops with a "dsb". This also fixed the problem (though it also changed alignment since dsb is 4 bytes and nop is 2). 4. I tried just a single dsb very early, then 2 and 3 nops early. For whatever reason these all fixed the crashes. 5. I tried one and two nops in the same place I had earlier put 11 nops. Problem reproduced. ...but problem was _fixed_ with a dsb(). === Possibly a dsb() instruction early in the function (like "dsb-early") might be the safest way to avoid the problem for now? I'll put that suggestion in bug #710131 .
,
Apr 25 2017
,
Apr 25 2017
Grace: probably just bug #710131 needs to be M-59. That bug is about mitigations. This bug is about solving the root cause.
,
Apr 26 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8f2e1bde3eb4c84fcfe13962a2f6b48223526f9b commit 8f2e1bde3eb4c84fcfe13962a2f6b48223526f9b Author: wez <wez@chromium.org> Date: Wed Apr 26 18:28:23 2017 Remove enable_frame_pointers and enable frame-pointers under ChromeOS. As suggested by thakis@, this avoids adding yet another build argument which can potentially be set inconsistently with other things. Since ChromeOS needs frame-pointers enabled in x64 builds, to support CWP, and temporarily in ARM/Thumb builds, to avoid CPU errata, we now always enable them under ChromeOS. The ChromeOS toolchain already set frame-pointers always on so this restores the pre-M59 behaviour. BUG=BUG= 710131 , 706654 , 711784 Review-Url: https://codereview.chromium.org/2845433002 Cr-Commit-Position: refs/heads/master@{#467386} [modify] https://crrev.com/8f2e1bde3eb4c84fcfe13962a2f6b48223526f9b/build/config/compiler/compiler.gni
,
Jun 20 2017
Though this bug has been quiet, we've been doing tons of work in the background and we're finally making progress.
Here's a summary of what was found and how it was found in case that helps for documenting the problem or tracking down future errata.
=============
ARM requested some traces and information about the state of the chip, but we couldn't get that. Unfortunately Rockchip rk3288 seems to have not implemented some of the stuff that ARM needed. ...in addition, it seems like maybe the JTAG connection that was available on production boards wasn't enough to get traces (??). Sigh.
I've been attempting to make progress through other means. The one that seemed to work best was object code patching. I started with build 9454 (one that was known to reproduce well) on veyron-minnie and started poking.
---
General themes used for debugging:
* Any small change to the assembly or alignment could potentially affect the errata. It's also hard to build the exact same Chrome binary as the builder. Thus the least intrusive way to change things was via object code patching. I'll post details on how I did the patching shortly, but generally I found that I could replace an instruction with a "branch" to a section of memory that was unused and include a copy of the original instruction plus my patch, then a branch back. Experimental data showed that we were lucky and this extra branching didn't seem to affect our ability to trip the errata.
* The one piece of information that we got from crashes was the DBGPCSR, which only gets updated on a taken branch / branch-link. By adjusting where branches happen and observing the resulting DBGPCSR we can get more information about the crash.
* Since it appeared that a "dsb" (data synchronization barrier) in the right place avoided the problem (found by educated guessing), we could attempt to trace codepaths by adding a "dsb" instruction in various places. If that "dsb" made the problem go away then it's likely that we were in an important code path of the errata.
* By luck, Jeffy at Rockchip found that chicken bit 11 seemed to fix the problem. As per ARM, this chicken bit affects how conditional instructions are executed. Thus we focused more on conditional instructions, AKA anywhere you see "it, itt, ittt, itttt, ite, itee, iteee" in the disassembly.
---
To find a bit of "unused" memory to use for object code patching, I decided that I could probably overwrite a string (especially if it looked like an error message that was added to a log or displayed to the user). Messing with this would be unlikely to cause problems. I found strings like this, then looked for ones that were nearby the SKIA functions we were focusing on.
# strings -tx r/opt/google/chrome/chrome | less
> 1f1e960 // Distance to the edge encoded in the z-component
>
> 1f1fc3c Unknown shader var type.
>
> 1f755ec Not all Y tile cases covered.
> 1f756bc Not all X tile cases covered.
I picked the last two.
=============
Test: which call site is important?
Using ideas above, I can show that the failing call to setLine is the one that happens at:
1f6c21a: f7ff fd5b bl 1f6bcd4 <_ZN14SkAnalyticEdge7setLineERK7SkPointS2_>
Specifically I changed that instruction to:
1f6c21a: f009 f9e7 bl 1f755ec <victimstring>
01f755ec <victimstring>:
1f755ec: e92d 4ff0 stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
1f755f0: b083 sub sp, #12
1f755f2: 4604 mov r4, r0
1f755f4: 2000 movs r0, #0
1f755f6: 60a0 str r0, [r4, #8]
1f755f8: f643 73ff movw r3, #16383 ; 0x3fff
1f755fc: eeb1 0a00 vmov.f32 s0, #16
1f75600: f7f6 bb72 b.w 1f6bce8 <finaldest>
AKA I copied the first few instructions of "setLine" to the victimstring location, then skipped them (but only from the 0x1f6c21a call site). For reference, here's the start of setLine:
1f6bcd4: e92d 4ff0 stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
1f6bcd8: b083 sub sp, #12
1f6bcda: 4604 mov r4, r0
1f6bcdc: 2000 movs r0, #0
1f6bcde: 60a0 str r0, [r4, #8]
1f6bce0: f643 73ff movw r3, #16383 ; 0x3fff
1f6bce4: eeb1 0a00 vmov.f32 s0, #16
1f6bce8: ed92 2a01 vldr s4, [r2, #4]
When I ran this experiment, my hang reported DBGPCSR as 0x3a80dce8 (last 3 nybbles match the branch from victimstring). That means that the last branch that happened before the hang was the one from "victimstring" and it means that was the call site.
Specifically note that the _other_ callsite to setLine() (for whatever reason) didn't trip the errata.
---
Test: Show which exact instruction we're hanging on
Through lots of trial and error, I was able to identify the exact failing instruction. Here are just the experiments needed to show where we were:
First, narrow down to two instructions
Original set of instructions (in setLine):
1f6bd7a: ebb0 2fa7 cmp.w r0, r7, asr #10
1f6bd7e: f000 80c2 beq.w 1f6bf06 <_ZN14SkAnalyticEdge7setLineERK7SkPointS2_+0x232>
1f6bd82: 45c1 cmp r9, r8
New instructions:
1f6bd7a: f009 bc37 b.w 1f755ec <victimstring>
01f755ec <victimstring>:
1f755ec: ebb0 2fa7 cmp.w r0, r7, asr #10
1f755f0: f436 ac89 beq.w 1f6bf06 <branchdest>
1f755f4: f7f6 bbc5 b.w 1f6bd82 <back_to_setline>
DBGPCSR at time of hang: 0x306ec5ec
Thus the branch to "victimstring" ran (last 3 nybbles of 5ec match 5ec from victimstring), but not the branch to "back_to_setline". That means we crashed on one of the two instructions.
Now, eliminate one of the two instructions
Original set of instructions (in setLine):
1f6bd7a: ebb0 2fa7 cmp.w r0, r7, asr #10
1f6bd7e: f000 80c2 beq.w 1f6bf06 <_ZN14SkAnalyticEdge7setLineERK7SkPointS2_+0x232>
1f6bd82: 45c1 cmp r9, r8
New instructions:
1f6bd7a: f009 bc37 b.w 1f755ec <victimstring>
01f755ec <victimstring>:
1f755ec: ebb0 2fa7 cmp.w r0, r7, asr #10
1f755f0: f7f6 bbc5 b.w 1f6bd7e <back_to_setline>
DBGPCSR at time of hang: 0x356bed7e
Thus the branch to "back_to_setline" ran (compare last 3 nybbles). That means the "cmp.w r0, r7, asr #10" didn't cause the crash.
---
Test: patch in "dsb" instructions in various places in the callsite to setLine() and see which ones help / don't help
The net result of this was to realize that setLine() was called in a loop and that "dsb" instructions that were placed _between_ the calls to setLine() were the ones that made the problem go away.
01f6c0e8 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib>:
1f6c0e8: e92d 4ff0 stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
1f6c0ec: af03 add r7, sp, #12
1f6c0ee: b081 sub sp, #4
1f6c0f0: ed2d 8b02 vpush {d8}
1f6c0f4: b0a8 sub sp, #160 ; 0xa0
1f6c0f6: 466c mov r4, sp
1f6c0f8: f36f 0403 bfc r4, #0, #4
1f6c0fc: 46a5 mov sp, r4
1f6c0fe: 4606 mov r6, r0
1f6c100: f8df 04b8 ldr.w r0, [pc, #1208] ; 1f6c5bc <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x4d4>
# dsb here doesn't help
1f6c104: 4691 mov r9, r2
1f6c106: 2201 movs r2, #1
1f6c108: 4478 add r0, pc
1f6c10a: 4698 mov r8, r3
1f6c10c: 6800 ldr r0, [r0, #0]
1f6c10e: 460d mov r5, r1
1f6c110: 6800 ldr r0, [r0, #0]
1f6c112: 9027 str r0, [sp, #156] ; 0x9c
1f6c114: a80e add r0, sp, #56 ; 0x38
1f6c116: f707 fe65 bl 1e73de4 <_ZN6SkPath4Iter7setPathERKS_b>
# dsb here doesn't help
1f6c11a: 4628 mov r0, r5
1f6c11c: f10f ef82 blx 207c024 <_ZNK2ui26OSExchangeDataProviderAura5CloneEv+0x1fd8>
1f6c120: 4605 mov r5, r0
1f6c122: f896 0034 ldrb.w r0, [r6, #52] ; 0x34
1f6c126: f1b9 0f00 cmp.w r9, #0
1f6c12a: f04f 011c mov.w r1, #28
1f6c12e: bf18 it ne
1f6c130: eb05 0545 addne.w r5, r5, r5, lsl #1
1f6c134: 2800 cmp r0, #0
1f6c136: ebc5 04c5 rsb r4, r5, r5, lsl #3
1f6c13a: bf18 it ne
1f6c13c: 2138 movne r1, #56 ; 0x38
1f6c13e: 9104 str r1, [sp, #16]
1f6c140: b140 cbz r0, 1f6c154 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x6c>
1f6c142: 00e1 lsls r1, r4, #3
1f6c144: 4630 mov r0, r6
1f6c146: 2204 movs r2, #4
1f6c148: f6e6 fa38 bl 1e525bc <_ZN12SkArenaAlloc11allocObjectEjj>
1f6c14c: 4683 mov fp, r0
1f6c14e: eb0b 00c4 add.w r0, fp, r4, lsl #3
1f6c152: e007 b.n 1f6c164 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x7c>
1f6c154: 00a1 lsls r1, r4, #2
1f6c156: 4630 mov r0, r6
1f6c158: 2204 movs r2, #4
1f6c15a: f6e6 fa2f bl 1e525bc <_ZN12SkArenaAlloc11allocObjectEjj>
1f6c15e: 4683 mov fp, r0
1f6c160: eb0b 0084 add.w r0, fp, r4, lsl #2
1f6c164: 6070 str r0, [r6, #4]
1f6c166: 00a9 lsls r1, r5, #2
1f6c168: 4630 mov r0, r6
1f6c16a: 2204 movs r2, #4
1f6c16c: f6e6 fa26 bl 1e525bc <_ZN12SkArenaAlloc11allocObjectEjj>
# dsb here doesn't help
1f6c170: eb00 0185 add.w r1, r0, r5, lsl #2
1f6c174: 4634 mov r4, r6
1f6c176: f1b9 0f00 cmp.w r9, #0
1f6c17a: 6071 str r1, [r6, #4]
1f6c17c: 4601 mov r1, r0
1f6c17e: 9106 str r1, [sp, #24]
1f6c180: 62f0 str r0, [r6, #44] ; 0x2c
1f6c182: 9407 str r4, [sp, #28]
1f6c184: f000 811a beq.w 1f6c3bc <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x2d4>
1f6c188: eea0 8b90 vdup.32 q8, r8
1f6c18c: f108 0006 add.w r0, r8, #6
1f6c190: fff9 03e0 vneg.s32 q8, q8
1f6c194: 2101 movs r1, #1
1f6c196: f969 2a8f vld1.32 {d18-d19}, [r9]
1f6c19a: fa01 f000 lsl.w r0, r1, r0
# dsb here doesn't help
1f6c19e: f10d 085c add.w r8, sp, #92 ; 0x5c
1f6c1a2: f8d7 a018 ldr.w sl, [r7, #24]
1f6c1a6: ef60 04e2 vshl.s32 q8, q9, q8
1f6c1aa: f108 0108 add.w r1, r8, #8
1f6c1ae: f10d 0920 add.w r9, sp, #32
1f6c1b2: 9102 str r1, [sp, #8]
1f6c1b4: fffb 0660 vcvt.f32.s32 q8, q8
1f6c1b8: f949 0aef vst1.64 {d16-d17}, [r9 :128]
# dsb here doesn't help
1f6c1bc: ee00 0a10 vmov s0, r0
1f6c1c0: eeb8 8ac0 vcvt.f32.s32 s16, s0
# dsb here doesn't help
1f6c1c4: e005 b.n 1f6c1d2 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0xea>
# dsb here--can't reproduce
1f6c1c6: 69bd ldr r5, [r7, #24]
1f6c1c8: ae08 add r6, sp, #32
1f6c1ca: 46b1 mov r9, r6
1f6c1cc: 46aa mov sl, r5
1f6c1ce: ad17 add r5, sp, #92 ; 0x5c
1f6c1d0: 46a8 mov r8, r5
# dsb here--can't reproduce
1f6c1d2: ae0e add r6, sp, #56 ; 0x38
1f6c1d4: ad1f add r5, sp, #124 ; 0x7c
1f6c1d6: 4630 mov r0, r6
1f6c1d8: 4629 mov r1, r5
# dsb here--can't reproduce
1f6c1da: f707 ff65 bl 1e740a8 <_ZN6SkPath4Iter6doNextEP7SkPoint>
1f6c1de: 2806 cmp r0, #6
1f6c1e0: f000 81d6 beq.w 1f6c590 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x4a8>
1f6c1e4: 2801 cmp r0, #1
1f6c1e6: d1f6 bne.n 1f6c1d6 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0xee>
1f6c1e8: 4628 mov r0, r5
1f6c1ea: 4649 mov r1, r9
1f6c1ec: 4642 mov r2, r8
1f6c1ee: 4653 mov r3, sl
# dsb here--can't reproduce
1f6c1f0: f01a fa3c bl 1f8666c <_ZN13SkLineClipper8ClipLineEPK7SkPointRK6SkRectPS0_b>
1f6c1f4: 4680 mov r8, r0
1f6c1f6: f1b8 0f01 cmp.w r8, #1
1f6c1fa: dbe4 blt.n 1f6c1c6 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0xde>
1f6c1fc: f8dd 9008 ldr.w r9, [sp, #8]
1f6c200: e003 b.n 1f6c20a <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x122>
1f6c202: 9806 ldr r0, [sp, #24]
1f6c204: 3804 subs r0, #4
1f6c206: 9006 str r0, [sp, #24]
1f6c208: e00c b.n 1f6c224 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x13c>
1f6c20a: f894 0034 ldrb.w r0, [r4, #52] ; 0x34
1f6c20e: 46da mov sl, fp
1f6c210: b150 cbz r0, 1f6c228 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x140>
1f6c212: f1a9 0108 sub.w r1, r9, #8
1f6c216: 4650 mov r0, sl
1f6c218: 464a mov r2, r9
1f6c21a: f7ff fd5b bl 1f6bcd4 <_ZN14SkAnalyticEdge7setLineERK7SkPointS2_>
## NOTE: 1f6c21a is the important call to "setLine"
1f6c21e: 2801 cmp r0, #1
1f6c220: f000 8092 beq.w 1f6c348 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x260>
...
...
# dsb here--can't reproduce
1f6c348: f894 0034 ldrb.w r0, [r4, #52] ; 0x34
1f6c34c: b190 cbz r0, 1f6c374 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x28c>
1f6c34e: f8da 0010 ldr.w r0, [sl, #16]
1f6c352: bb28 cbnz r0, 1f6c3a0 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x2b8>
1f6c354: f89a 0034 ldrb.w r0, [sl, #52] ; 0x34
1f6c358: bb10 cbnz r0, 1f6c3a0 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x2b8>
1f6c35a: 6ae0 ldr r0, [r4, #44] ; 0x2c
1f6c35c: 9906 ldr r1, [sp, #24]
1f6c35e: 4288 cmp r0, r1
1f6c360: d21e bcs.n 1f6c3a0 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x2b8>
1f6c362: 9806 ldr r0, [sp, #24]
1f6c364: 4651 mov r1, sl
1f6c366: f850 2c04 ldr.w r2, [r0, #-4]
1f6c36a: f7ff fb31 bl 1f6b9d0 <_ZN13SkEdgeBuilder15CombineVerticalEPK14SkAnalyticEdgePS0_>
# dsb here doesn't help
1f6c36e: 2802 cmp r0, #2
1f6c370: d113 bne.n 1f6c39a <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x2b2>
1f6c372: e746 b.n 1f6c202 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x11a>
1f6c374: f8da 000c ldr.w r0, [sl, #12]
1f6c378: b990 cbnz r0, 1f6c3a0 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x2b8>
1f6c37a: f89a 0018 ldrb.w r0, [sl, #24]
1f6c37e: b978 cbnz r0, 1f6c3a0 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x2b8>
1f6c380: 6ae0 ldr r0, [r4, #44] ; 0x2c
1f6c382: 9906 ldr r1, [sp, #24]
1f6c384: 4288 cmp r0, r1
1f6c386: d20b bcs.n 1f6c3a0 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x2b8>
1f6c388: 9806 ldr r0, [sp, #24]
1f6c38a: 4651 mov r1, sl
1f6c38c: f850 2c04 ldr.w r2, [r0, #-4]
# dsb here doesn't help
1f6c390: f7ff fad6 bl 1f6b940 <_ZN13SkEdgeBuilder15CombineVerticalEPK6SkEdgePS0_>
# dsb here doesn't help
1f6c394: 2802 cmp r0, #2
1f6c396: f43f af34 beq.w 1f6c202 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x11a>
1f6c39a: 2800 cmp r0, #0
# dsb here doesn't help
1f6c39c: f47f af42 bne.w 1f6c224 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x13c>
# dsb here--can't reproduce
1f6c3a0: 9804 ldr r0, [sp, #16]
1f6c3a2: eb0a 0b00 add.w fp, sl, r0
# dsb here--can't reproduce
1f6c3a6: 9806 ldr r0, [sp, #24]
1f6c3a8: f840 ab04 str.w sl, [r0], #4
1f6c3ac: 9006 str r0, [sp, #24]
1f6c3ae: f1b8 0801 subs.w r8, r8, #1
1f6c3b2: f109 0908 add.w r9, r9, #8
1f6c3b6: f47f af28 bne.w 1f6c20a <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0x122>
# dsb here--can't reproduce
1f6c3ba: e704 b.n 1f6c1c6 <_ZN13SkEdgeBuilder9buildPolyERK6SkPathPK7SkIRectib+0xde>
---
Test: look at the end of setLine for conditional instructions and don't use them
There end of setLine() looks like this:
1f6bee6: 4635 moveq r5, r6
1f6bee8: 2d00 cmp r5, #0
1f6beea: bf48 it mi
1f6beec: 426d negmi r5, r5
1f6beee: 45c1 cmp r9, r8
1f6bef0: f04f 0101 mov.w r1, #1
1f6bef4: 6265 str r5, [r4, #36] ; 0x24
1f6bef6: bfc8 it gt
1f6bef8: f04f 31ff movgt.w r1, #4294967295 ; 0xffffffff
1f6befc: 2001 movs r0, #1
1f6befe: f884 1037 strb.w r1, [r4, #55] ; 0x37
1f6bf02: 2100 movs r1, #0
1f6bf04: 86a1 strh r1, [r4, #52] ; 0x34
1f6bf06: b003 add sp, #12
1f6bf08: e8bd 8ff0 ldmia.w sp!, {r4, r5, r6, r7, r8, r9, sl, fp, pc}
By experimenting (and using the knowledge of chicken bit 11), I found that changing:
1f6bee8: 2d00 cmp r5, #0
1f6beea: bf48 it mi
1f6beec: 426d negmi r5, r5
to:
1f6bee8: 2d00 cmp r5, #0
1f6beea: d500 bpl.n 1f6beee
1f6beec: 426d negs r5, r5
...avoided the errata. From the information I received from ARM about chicken bit 11, that chicken bit definitely affects how the CPU processes conditional instructions like:
1f6beec: 426d negmi r5, r5
This experiment helps prove that the logic turned off by enabling chicken bit 11 is the logic that's somehow causing the deadlock.
=============
I'll post my raw notes and details on object code patching techniques shortly. I think I might be getting too big for one comment...
,
Jun 20 2017
General techniques for running tests:
===
0. Choose a patch to make, like
# Try adding a DSB here
1f6c348: f894 0034 ldrb.w r0, [r4, #52] ; 0x34
===
1. Make a file called something like "skiatest11-asm.S". It should contain this (and to answer your question, I have no idea why I need to subtract 1312 in the .org instructions--I just did it to make things work out):
.globl main
.thumb
.syntax unified
main:
bl totest
.org 0x1f6c348-1312
totest:
b victimstring
back:
nop
// 29 bytes
.org 0x1f755ec-1312
victimstring:
dsb
ldrb.w r0, [r4, #52]
b back
===
2. Compile and disassemble:
armv7a-cros-linux-gnueabi-gcc -march=armv7-a -mthumb -mfpu=neon \
/b/tip/tmp/skiatest11-asm.S\
-o /b/tip/tmp/skiatest11
armv7a-cros-linux-gnueabi-objdump -D \
/b/tip/tmp/skiatest11 \
| grep -B5 -A20 totest
===
3. Look at the output:
01f6c348 <totest>:
1f6c348: f009 b950 b.w 1f755ec <victimstring>
01f6c34c <back>:
1f6c34c: bf00 nop
...
01f755ec <victimstring>:
1f755ec: f3bf 8f4f dsb sy
1f755f0: f894 0034 ldrb.w r0, [r4, #52] ; 0x34
1f755f4: f7f6 beaa b.w 1f6c34c <back>
===
4. Write commands to do the object code patch and update to your host:
cp r/opt/google/chrome/chrome chrome-patched
/b/tip/scripts/set16.py chrome-patched 1f6c348 f009 b950
/b/tip/scripts/set16.py chrome-patched 1f755ec f3bf 8f4f
/b/tip/scripts/set16.py chrome-patched 1f755f0 f894 0034
/b/tip/scripts/set16.py chrome-patched 1f755f4 f7f6 beaa
ssh da3 'mount -oremount,rw /; stop ui' ; \
scp chrome-patched da3:/opt/google/chrome/chrome; \
ssh da3 'mount -oremount,ro /; start ui'
...note that "set16.py" is an ugly script I wrote to easily patch a binary (attached). The output of the above is:
0x1f6c348: 0xf894 => 0xf009
0x1f6c34a: 0x0034 => 0xb950
0x1f755ec: 0x6f4e => 0xf3bf
0x1f755ee: 0x2074 => 0x8f4f
0x1f755f0: 0x6c61 => 0xf894
0x1f755f2: 0x206c => 0x0034
0x1f755f4: 0x2059 => 0xf7f6
0x1f755f6: 0x6974 => 0xbeaa
===
5. Test things.
Originally I was doing this manually. I'm sure it can be fully automated, but I didn't. I did at least get partial automation. Here's what I did:
A) Login
B) Go to this youtube video: https://www.youtube.com/watch?v=NZlfxWMr7nc
C) Position cursor above the upper left corner of the video.
D) Login via serial console (ssh probably fine too).
E) Run these commands (get /var/click.txt from attachment here--I just recorded a click using evemu-record):
for dev in /dev/input/event*; do
if evemu-describe "${dev}" | grep -q 'Elan-Touchpad'; then
break;
fi
done
date
echo starting > /dev/kmsg
RANDOM=$$$(date +%s)
while true; do
evemu-play "${dev}" < /var/click.txt
sleep 0.$(($RANDOM % 10))
done
===
Usually a failure would happen within a minute or two (sometimes it would happen before youtube would even finish loading). If 5-10 minutes went by with no problem when we're in good shape.
NOTE: with the final solution, I let things run overnight and had no failures
===
I've attached disassembly of "setLine" and surrounding code/callers here, in case it's helpful to anyone.
===
I've also attached my raw notes here, which probably won't make much sense to anyone but me (mostly it's just commented out experiments in the "asm.S" files, but you never know. Some of the earlier notes were less well organized / scattered as I tried to figure out proper techniques and improved my patching tools...
,
Jun 20 2017
Here are patches to fix this. I'd like to give ARM a little more time to dig, but if for some reason we run into another instance of this errata we have these ready. remote: https://chromium-review.googlesource.com/541537 CHROMIUM: ARM: errata: add support for A12/A17 errata CR711784 remote: https://chromium-review.googlesource.com/541538 CHROMIUM: config: Turn on erratum CR711784 for rockchip CPUs --- Once we get those landed we can revert the "dsb" in skia and can remove the frame pointer workaround in Chrome.
,
Jul 10 2017
I'm marking the CLs ready. Latest update from ARM is: > There is still no direct evidence but your previous test results > (by removing the negmi instruction) do make us be inclined to > believe that this issue was caused by the optimization. If we happen to get extra information we can always revert and do something else, but right now I'm inclined to take this. According to ARM simulations we'll see a tiny bit of performance degradation, but correctness is more important. Details: * SPEC Int 2k was down by 0.71%. * Our emulation-based browser benchmark (very old Firefox cycling some canned pages) was down 0.56%. * Our newer Octane (d8) test was down 0.15%
,
Jul 10 2017
Oh, one more set of raw notes since ARM asked for some additional testing the other week.
,
Jul 11 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/08bfa67f9561019c7dba38642bee25f81f4da800 commit 08bfa67f9561019c7dba38642bee25f81f4da800 Author: Douglas Anderson <dianders@chromium.org> Date: Tue Jul 11 00:33:19 2017 CHROMIUM: ARM: errata: add support for A12/A17 errata CR711784 This adds a code for turning on chicken bit 11, which appears to avoid a potential CPU deadlock that could occur. The exact set of instruction needed to trigger this errata is not totaly known yet, but we have a relatively high level of confidence that the problem is fixed by setting chicken bit 11. All details are in http://crbug.com/711784 This errata does not yet have a name, so I have tagged it ERRATA_CR711784. It will be up to ARM to upstream this errata once they have fully investigated it. BUG= chromium:711784 TEST=youtube clicking, as described in the BUG TEST=for i in $(seq 0 3); do taskset -c $i cat /sys/*/*/*or_debug; done Change-Id: I0aeac31882c9b39407d8c96333a237797a9f7000 Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/541537 Reviewed-by: Sonny Rao <sonnyrao@chromium.org> [modify] https://crrev.com/08bfa67f9561019c7dba38642bee25f81f4da800/arch/arm/mach-rockchip/embedded/rk3288_resume.c [modify] https://crrev.com/08bfa67f9561019c7dba38642bee25f81f4da800/arch/arm/mm/proc-v7.S [modify] https://crrev.com/08bfa67f9561019c7dba38642bee25f81f4da800/arch/arm/Kconfig
,
Jul 11 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/993864715979bc98f88bb21230ee40f7b6ba36e0 commit 993864715979bc98f88bb21230ee40f7b6ba36e0 Author: Douglas Anderson <dianders@chromium.org> Date: Tue Jul 11 00:33:20 2017 CHROMIUM: config: Turn on erratum CR711784 for rockchip CPUs This erratum affects us. Turn it on. BUG= chromium:711784 TEST=youtube clicking, as described in the BUG TEST=for i in $(seq 0 3); do taskset -c $i cat /sys/*/*/*or_debug; done Change-Id: Icbd209b2a2527f905cfd3ac0aa098812d9684c18 Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/541538 Reviewed-by: Sonny Rao <sonnyrao@chromium.org> [modify] https://crrev.com/993864715979bc98f88bb21230ee40f7b6ba36e0/chromeos/config/armel/chromiumos-cygnus.flavour.config [modify] https://crrev.com/993864715979bc98f88bb21230ee40f7b6ba36e0/chromeos/config/armel/chromiumos-rockchip.flavour.config [modify] https://crrev.com/993864715979bc98f88bb21230ee40f7b6ba36e0/chromeos/config/armel/chromiumos-arm.flavour.config [modify] https://crrev.com/993864715979bc98f88bb21230ee40f7b6ba36e0/chromeos/config/armel/chromiumos-ipq806x.flavour.config
,
Jul 11 2017
I think we're as root caused and fixed as we're going to get. Crossing fingers that this is truly the right root cause (all evidence above makes us think that it is, but we have no standalone repro case). Filed bugs for future work: * bug #740805 : Undo skia "dsb" hack * bug #740806 : Revert any hacks to force "frame pointers" on for veyron-minnie
,
Jul 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5cddc222d7e76891b197e853d8d61a1f6843b24f commit 5cddc222d7e76891b197e853d8d61a1f6843b24f Author: Wez <wez@chromium.org> Date: Thu Jul 13 00:00:02 2017 Update the enable_frame_pointers logic for ChromeOS. Now that we have a patch for the ARM A12/A17 errata, it is safe to try building ARM Thumb without frame-pointers again, since we don't need them. We leave frame-pointers enabled in ChromeOS for 64-bit. The Windows case is tweaked to use current_cpu rather than target_cpu, and the Android case is explicitly gated on is_android. Bug: 740806 , 711784 Change-Id: I2fff46d1a3a8d78d12c389f95a20acf54013318d Reviewed-on: https://chromium-review.googlesource.com/567608 Commit-Queue: Wez <wez@chromium.org> Reviewed-by: Dirk Pranke <dpranke@chromium.org> Cr-Commit-Position: refs/heads/master@{#486165} [modify] https://crrev.com/5cddc222d7e76891b197e853d8d61a1f6843b24f/build/config/compiler/compiler.gni
,
Jan 22 2018
|
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by diand...@chromium.org
, Apr 14 2017