Issue metadata
Sign in to add a comment
|
"Renderer crash on login window" chrome crashes on CrOS devices |
||||||||||||||||||||||
Issue descriptionWe've got an experimental cros device tester up at https://luci-milo.appspot.com/p/chromium/builders/luci.chromium.ci/chromeos-kevin-rel-hw-tests?limit=200 It was passing the tast and sanity tests up until https://luci-milo.appspot.com/p/chromium/builders/luci.chromium.ci/chromeos-kevin-rel-hw-tests/7264 Something slipped in to start making everything fail. From the logs, it appears there's some crash occurring during login: https://isolateserver.appspot.com/browse?namespace=default-gzip&digest=446511bf0237ef5b2896614f5996e161c2f89bdc&as=ui.20181220-225253 There's nothing obviously culpable in the blamelist. But https://chromium-review.googlesource.com/c/chromiumos/chromite/+/1387171/ does show up. Maybe related? I'll see if I can't repro.
,
Dec 21
Could you please try applYing https://libassistant-internal-review.git.corp.google.com/c/standalone/src/+/72543 to your Simple CHrome checkout and see if that fixes the problem?
,
Dec 22
,
Dec 22
,
Dec 22
,
Dec 22
The bot in #0 builds a simplechrome version of chrome with a public chromium checkout. It doesn't have the sources of any repo in https://chrome-internal.googlesource.com/chrome/src-internal/+/master/DEPS, nor does it have any chromeos-side sources. (nothing was fetched w/ repo. it's all gclient) Consequently there's no libassistant sources to apply #2's patch into. As for reproing, the browser here was built by entering the cros-sdk for a public config of kevin: `cros chrome-sdk --nostart-goma --use-external-config --board=kevin` `autoninja -C out_kevin/Release chromiumos_preflight` Deploying it to the device and running a test fails as reported: `deploy_chrome --build-dir out_kevin/Release/` And on the device: `local_test_runner ui.ChromeLogin` Here's the contents of /etc/lsb-release for the devices in question: CHROMEOS_RELEASE_APPID={92A7272A-834A-47A3-9112-E8FD55831660} CHROMEOS_BOARD_APPID={92A7272A-834A-47A3-9112-E8FD55831660} CHROMEOS_CANARY_APPID={90F229CE-83E2-4FAF-8479-E368A34938B1} DEVICETYPE=CHROMEBOOK CHROMEOS_RELEASE_BOARD=kevin GOOGLE_RELEASE=11327.0.0-rc2 CHROMEOS_DEVSERVER=http://swarm-cros-885.c.chromeos-bot.internal:8080 CHROMEOS_RELEASE_BUILDER_PATH=kevin-full/R73-11327.0.0-rc2 CHROMEOS_RELEASE_BUILD_NUMBER=11327 CHROMEOS_RELEASE_BRANCH_NUMBER=0 CHROMEOS_RELEASE_CHROME_MILESTONE=73 CHROMEOS_RELEASE_PATCH_NUMBER=0-rc2 CHROMEOS_RELEASE_TRACK=testimage-channel CHROMEOS_RELEASE_DESCRIPTION=11327.0.0-rc2 (Continuous Builder - Builder: N/A) kevin CHROMEOS_RELEASE_BUILD_TYPE=Continuous Builder - Builder: N/A CHROMEOS_RELEASE_NAME=Chromium OS CHROMEOS_RELEASE_VERSION=11327.0.0-rc2 CHROMEOS_AUSERVER=http://swarm-cros-885.c.chromeos-bot.internal:8080/update
,
Dec 22
ok, this is pretty strange. We saw this problem a couple of weeks ago and we fixed by disabling the thinlto cache. However, simplechrome should not be using thinlto. So, I don't understand why this bug shows up here. Assigning to gbiv since cmtice will be on vacation. gbiv@ please take a look. In the meantime, we are reverting the change to force enable lld for simple chrome.
,
Dec 22
and gbiv is not available until Jan 2nd ..
,
Jan 5
Started to look into this earlier today. This is unrelated to the previous crash-on-login issue, since this configuration enables neither ThinLTO nor CFI. The repro I have shows that the crash is in libgcc's unwinder, which we're invoking (indirectly) from here: https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/frame/frame.cc?type=cs&q=blink::frame::frame&sq=package:chromium&g=0&l=254 We're trying to access memory at 0xfffffff0, which seems a biiit off for 32-bit userspace code. :) Overall, looks to me like Chrome's in an otherwise-good state, so my bet is that this is going to take figuring out what lld's placing somewhere that libgcc isn't expecting it to be placed. Stay tuned.
,
Jan 7
any idea why this only shows up without chrome_internal? (is that the case?)
,
Jan 8
I can only make this reproduce when is_chrome_branded = false. Importantly, that bit, when true, makes us exclude unwinding information. Given that this crash is in the unwinder, the presence of unwinding bits causing this makes sense. :) Looks like both gdb and libgcc take issue with this particular stack trace, though: it ends in a call to Builtin_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit, which is hand-rolled asm that (AIUI) acts as a bridge from JS code to native. gdb throws its hands up and says "Backtrace stopped: previous frame inner to this frame (corrupt stack?)". libgcc, like said, just crashes. May/may not be related, but while I was poking through the code, I found that CrOS disables frame pointers exclusively on ARM due to a long-fixed bug: https://bugs.llvm.org/show_bug.cgi?id=18505 . May/may not be relevant here, but I was unable to get this to repro on amd64-generic, so... In any case, it's strange that gold does well here. Will continue to dig.
,
Jan 8
I think I have the cause nailed down; will start trying to figure out a reasonable fix tomorrow.
First, I'd like to note that building with `is_chrome_branded = false` and `strip`ing out .ARM.exidx and .ARM.extab makes the bug go away. This is because the unwinder appears to refuse to do anything when .ARM.exidx isn't present.
Second, to explain what's going wrong, I need to go into how unwinding in ARM works.
.ARM.exidx is basically a massive, sorted array of `struct { signed int rel_addr; unsigned int flags; };`. `rel_addr` "points" to a function that the current array entry describes, and `flags` describes how to unwind the function at `rel_addr` (if doing so is even possible).
The relevant exidx entry for a function is looked up using binary search. There can apparently be a 1:many mapping between .ARM.exidx entries and logical functions (the spec notes that "a smart linker may be able to group such [exidx] entries (for smaller runtime table size).")
As it happens, the exidx entry that we're picking up on for Builtin_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit is the exidx entry for C++ function that gets laid out *before* Builtin_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit. Namely, _ZN2v88internal8Snapshot19DefaultSnapshotBlobEv.
I'd like to emphasize *C++ function* above. This is because we have a large blob of back-to-back Builtin_ functions in our binary, which are literally just asm embedded in a generated .cc file, e.g. (different builtin, same idea)
__asm__(V8_ASM_LABEL("Builtins_AdaptorWithExitFrame"));
__asm__(
".octa 0xe24dd004e1a04080e2800005e591700f\n"
".octa 0xee0eca10e51acfc3e52d3004e92d0012\n"
".octa 0xe1a00000ea029fcce1a01002ed8dea03\n"
".octa 0xcccccccccccccccc0000000300000000\n"
);
Looking at the raw bits of exidx, it appears that lld makes no attempt to generate exidx entries for these Builtin_ functions. Gold, however, generates an exidx entry just at the end of _ZN2v88internal8Snapshot19DefaultSnapshotBlobEv that has special "don't try to unwind me," flags set. This exidx entry applies for the entirety of our massive block of Builtin_ functions, so gold doesn't have these unwinding issues.
I don't know if there's something in the code that's prompting gold to do this, or if gold has to go out of its way to try and detect this. Will dig deeper on that tomorrow. But we have a plausible explanation now, at least.
,
Jan 8
Don't know if its just me but I can repro this regardless of `is_chrome_branded = false` or `is_chrome_branded = true`. sdk kevin R73-11525.0.0 Chrome @ 7eef64dcd503
,
Jan 8
Correction: Chrome OS @ R73-11546.0.0 sorry
,
Jan 8
Apologies -- are you able to repro it if you have both `is_chrome_branded = true` and `is_official_build = true`? (For some reason, I thought the former required the latter, but apparently not. So you need both to flip the gn bit that appears to matter here: https://cs.chromium.org/chromium/src/build/config/compiler/compiler.gni?type=cs&l=87 )
,
Jan 8
Setting both `is_chrome_branded = true` and `is_official_build = true` works! Thanks a ton! (TIL too that former doesnt imply the latter).
,
Jan 10
Filed upstream bug: https://bugs.llvm.org/show_bug.cgi?id=40277 After discussing with llozano, we'd like to turn lld back on for official+branded builds, and will revert to gold for the others. Working on that for simplechrome and chromeos-base/chromeos-chrome now.
,
Jan 11
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/ee0c2f1e1a3e344c8d07abbee288cc91f6f869c0 commit ee0c2f1e1a3e344c8d07abbee288cc91f6f869c0 Author: George Burgess IV <gbiv@google.com> Date: Fri Jan 11 17:48:00 2019 SimpleChrome: use lld in branded+official builds We originally landed lld for all builds (I3e596f088dbf69c4288906fe375a1db3ddce7555). Sadly, it appears that lld has a bug in the unwinding information that it emits. Because official, branded builds of Chrome opt out of unwind tables, it should be safe to turn lld back on for those. BUG=chromium:917504 TEST=`cros chrome-sdk --nostart-goma --board=kevin --internal` has use_lld=true `cros chrome-sdk --nostart-goma --board=kevin` has use_lld=false Change-Id: I28a9c4ec24132d83c435ee71dcce517230444e68 Reviewed-on: https://chromium-review.googlesource.com/1403882 Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com> Tested-by: George Burgess <gbiv@chromium.org> Reviewed-by: Luis Lozano <llozano@chromium.org> [modify] https://crrev.com/ee0c2f1e1a3e344c8d07abbee288cc91f6f869c0/cli/cros/cros_chrome_sdk.py
,
Jan 11
Issue 921209 has been merged into this issue.
,
Jan 12
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/ddd8ec4f14cde5669066cd3f4cee5ad44d4c8cbc commit ddd8ec4f14cde5669066cd3f4cee5ad44d4c8cbc Author: George Burgess IV <gbiv@google.com> Date: Sat Jan 12 22:04:59 2019 chromeos-chrome: disable lld in non-internal builds lld has a bug in how it emits unwinding info. Since Chrome sometimes tries to unwind itself (for `StackTrace`s), this causes Chrome to crash. chrome_internal builds explicitly disable the emission of said bugged unwinding info, so continuing to use lld for those should be safe. This is intended to be reverted when the relevant bug is fixed. BUG=chromium:917504 TEST=`emerge-kevin` with and without chrome_internal; args.gn looked reasonable, and the expected warning was emitted. emerge-kevin chromeos-base/chromeos-chrome succeeded with: - USE= - USE=chrome_internal - USE="gold -lld" Change-Id: I7958310330cd41ddd78b98da0c1f1f83e0af41bb Reviewed-on: https://chromium-review.googlesource.com/1405444 Commit-Ready: George Burgess <gbiv@chromium.org> Tested-by: George Burgess <gbiv@chromium.org> Reviewed-by: Manoj Gupta <manojgupta@chromium.org> [modify] https://crrev.com/ddd8ec4f14cde5669066cd3f4cee5ad44d4c8cbc/chromeos-base/chromeos-chrome/chromeos-chrome-9999.ebuild
,
Jan 16
(6 days ago)
This is still happening with R73-11580.0.0 and Chromium ToT
,
Jan 16
(6 days ago)
Looks like we're sourcing some set of base args from a chrome_internal USE'd environment (?) in some cases. So we need to go out of our way to turn use_lld off in non-`--internal` builds. https://chromium-review.googlesource.com/c/chromiumos/chromite/+/1415915/ Thanks for flagging that!
,
Jan 16
(6 days ago)
Chatted with llozano offline; he said that it can potentially take many days for the simplechrome environment file to reflect changes in Chrome's ebuild (like the fix in I7958310330cd41ddd78b98da0c1f1f83e0af41bb), which is probably why this buildbot is still trying to use lld. Rather than waiting for the environment file to pick up the ebuild change, we're going forward with the SimpleChrome fix. We plan to revert both when lld is working again anyway, so...
,
Jan 18
(5 days ago)
,
Jan 18
(5 days ago)
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3c031af060bbdfd54b16442b7891e7237768586a commit 3c031af060bbdfd54b16442b7891e7237768586a Author: Takuto Ikuta <tikuta@chromium.org> Date: Fri Jan 18 04:12:16 2019 build: Default use_lld to true on ChromeOS for x64 This is a partial reland of 12b1bbff0b65f333ea2fa270259412cacbe056c3 I confirmed current lld can link chrome with the same config of previously failed build. https://ci.chromium.org/buildbot/chromium.chrome/Google%20Chrome%20ChromeOS/59619 Original change's description: > build: Default use_lld to true on ChromeOS. > > As of https://chromium-review.googlesource.com/c/1372768 ChromeOS > chrome is always linked with lld. Because ChromeOS's build system > always overrides the value of use_lld this change should have no > effect on official ChromeOS builds, but it should make ChromeOS on > Linux more consistent with "real" ChromeOS. > > Bug: 701659 > Change-Id: I447e1b2dfb6cc970f2c2dd64d396b5ba58bc2ef2 > Reviewed-on: https://chromium-review.googlesource.com/c/1377139 > Reviewed-by: Nico Weber <thakis@chromium.org> > Commit-Queue: Peter Collingbourne <pcc@chromium.org> > Cr-Commit-Position: refs/heads/master@{#616485} Bug: 701659 , 923168, 917504 Change-Id: Iedb48cfe5b3b9c189dcf1b4c04bea998ee009d2c Reviewed-on: https://chromium-review.googlesource.com/c/1419558 Auto-Submit: Takuto Ikuta <tikuta@chromium.org> Reviewed-by: George Burgess <gbiv@chromium.org> Commit-Queue: George Burgess <gbiv@chromium.org> Commit-Queue: Takuto Ikuta <tikuta@chromium.org> Cr-Commit-Position: refs/heads/master@{#624005} [modify] https://crrev.com/3c031af060bbdfd54b16442b7891e7237768586a/build/config/compiler/compiler.gni
,
Jan 18
(4 days ago)
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/0c311f11aef52f43a4bbed55d258d2901cd3d403 commit 0c311f11aef52f43a4bbed55d258d2901cd3d403 Author: George Burgess IV <gbiv@google.com> Date: Fri Jan 18 08:45:58 2019 SimpleChrome: forcibly turn lld off in external builds I28a9c4ec24132d83c435ee71dcce517230444e68 was apparently incomplete, since the simplechrome builder is still failing BUG=chromium:917504 TEST=Three (former two have no mention of use_lld): - cros chrome-sdk --board=kevin --nocolor --log-level=debug --nostart-goma --use-external-config - cros chrome-sdk --board=kevin --nocolor --log-level=debug --nostart-goma - cros chrome-sdk --board=kevin --nocolor --log-level=debug --nostart-goma --internal Change-Id: Ia2f944729da5e7de117d6a098825fd4fc467b427 Reviewed-on: https://chromium-review.googlesource.com/1415915 Commit-Ready: George Burgess <gbiv@chromium.org> Tested-by: George Burgess <gbiv@chromium.org> Reviewed-by: Luis Lozano <llozano@chromium.org> [modify] https://crrev.com/0c311f11aef52f43a4bbed55d258d2901cd3d403/cli/cros/cros_chrome_sdk.py
,
Jan 18
(4 days ago)
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/21092a421cede9e75d24044c840b99c2924d0c5a commit 21092a421cede9e75d24044c840b99c2924d0c5a Author: Takashi Sakamoto <tasak@google.com> Date: Fri Jan 18 09:41:10 2019 Revert "build: Default use_lld to true on ChromeOS for x64" This reverts commit 3c031af060bbdfd54b16442b7891e7237768586a. Reason for revert: suspect compile failure on chromium.chrome/Google Chrome ChromeOS https://logs.chromium.org/logs/chromium/bb/chromium.chrome/Google_Chrome_ChromeOS/60740/+/recipes/steps/compile/0/stdout ld.lld: error: found local symbol '__bss_start' in global part of symbol table in file libassistant.so ld.lld: error: found local symbol '_end' in global part of symbol table in file libassistant.so ld.lld: error: found local symbol '_edata' in global part of symbol table in file libassistant.so ld.lld: error: found local symbol '__stop_malloc_hook' in global part of symbol table in file libassistant.so ld.lld: error: found local symbol '__start_malloc_hook' in global part of symbol table in file libassistant.so clang: error: linker command failed with exit code 1 (use -v to see invocation) Original change's description: > build: Default use_lld to true on ChromeOS for x64 > > This is a partial reland of 12b1bbff0b65f333ea2fa270259412cacbe056c3 > > I confirmed current lld can link chrome with the same config of previously failed build. > https://ci.chromium.org/buildbot/chromium.chrome/Google%20Chrome%20ChromeOS/59619 > > Original change's description: > > build: Default use_lld to true on ChromeOS. > > > > As of https://chromium-review.googlesource.com/c/1372768 ChromeOS > > chrome is always linked with lld. Because ChromeOS's build system > > always overrides the value of use_lld this change should have no > > effect on official ChromeOS builds, but it should make ChromeOS on > > Linux more consistent with "real" ChromeOS. > > > > Bug: 701659 > > Change-Id: I447e1b2dfb6cc970f2c2dd64d396b5ba58bc2ef2 > > Reviewed-on: https://chromium-review.googlesource.com/c/1377139 > > Reviewed-by: Nico Weber <thakis@chromium.org> > > Commit-Queue: Peter Collingbourne <pcc@chromium.org> > > Cr-Commit-Position: refs/heads/master@{#616485} > > Bug: 701659 , 923168, 917504 > Change-Id: Iedb48cfe5b3b9c189dcf1b4c04bea998ee009d2c > Reviewed-on: https://chromium-review.googlesource.com/c/1419558 > Auto-Submit: Takuto Ikuta <tikuta@chromium.org> > Reviewed-by: George Burgess <gbiv@chromium.org> > Commit-Queue: George Burgess <gbiv@chromium.org> > Commit-Queue: Takuto Ikuta <tikuta@chromium.org> > Cr-Commit-Position: refs/heads/master@{#624005} TBR=thakis@chromium.org,pcc@chromium.org,gbiv@chromium.org,tikuta@chromium.org Change-Id: Ieba8a5113b944ef03f954388ebb076ff546fdd08 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 701659 , 923168, 917504 Reviewed-on: https://chromium-review.googlesource.com/c/1420493 Reviewed-by: Takashi Sakamoto <tasak@google.com> Commit-Queue: Takashi Sakamoto <tasak@google.com> Cr-Commit-Position: refs/heads/master@{#624051} [modify] https://crrev.com/21092a421cede9e75d24044c840b99c2924d0c5a/build/config/compiler/compiler.gni
,
Jan 19
(4 days ago)
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/32a4d8d19437eff3748f84a61b98abb35f4b702b commit 32a4d8d19437eff3748f84a61b98abb35f4b702b Author: Manoj Gupta <manojgupta@google.com> Date: Sat Jan 19 04:05:24 2019 chromeos-chrome: Keep lld as linker for arm64. For arm64, gold can't link the chrome binary with thinlto+cfi. So keep lld as linker even for non-internal builds. BUG=chromium:917504 TEST=kevin64-full chrome builds. Change-Id: Ie82f7b3f20149194e1f19387461afc696e6a5da3 Reviewed-on: https://chromium-review.googlesource.com/1418550 Commit-Ready: Manoj Gupta <manojgupta@chromium.org> Tested-by: Manoj Gupta <manojgupta@chromium.org> Reviewed-by: George Burgess <gbiv@chromium.org> Reviewed-by: Mike Frysinger <vapier@chromium.org> [modify] https://crrev.com/32a4d8d19437eff3748f84a61b98abb35f4b702b/chromeos-base/chromeos-chrome/chromeos-chrome-9999.ebuild |
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by bpastene@chromium.org
, Dec 21