New issue
Advanced search Search tips

Issue 917504 link

Starred by 4 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug
Build-Toolchain

Blocking:
issue 923168



Sign in to add a comment

"Renderer crash on login window" chrome crashes on CrOS devices

Project Member Reported by bpastene@chromium.org, Dec 21

Issue description

We've got an experimental cros device tester up at https://luci-milo.appspot.com/p/chromium/builders/luci.chromium.ci/chromeos-kevin-rel-hw-tests?limit=200

It was passing the tast and sanity tests up until https://luci-milo.appspot.com/p/chromium/builders/luci.chromium.ci/chromeos-kevin-rel-hw-tests/7264

Something slipped in to start making everything fail. From the logs, it appears there's some crash occurring during login:
https://isolateserver.appspot.com/browse?namespace=default-gzip&digest=446511bf0237ef5b2896614f5996e161c2f89bdc&as=ui.20181220-225253

There's nothing obviously culpable in the blamelist. But https://chromium-review.googlesource.com/c/chromiumos/chromite/+/1387171/ does show up. Maybe related? I'll see if I can't repro.
 
Owner: cmt...@chromium.org
Confirmed that reverting https://chromium-review.googlesource.com/c/chromiumos/chromite/+/1387171/ locally avoids the crash. cmtice@, could you take a look? It appears that using lld causes login crashes when running a simplechrome-built version of chrome on devices.
Could you please try applYing https://libassistant-internal-review.git.corp.google.com/c/standalone/src/+/72543 to your Simple CHrome checkout and see if that fixes the problem?
Cc: llozano@chromium.org
Components: Tools>ChromeOS-Toolchain
The bot in #0 builds a simplechrome version of chrome with a public chromium checkout. It doesn't have the sources of any repo in https://chrome-internal.googlesource.com/chrome/src-internal/+/master/DEPS, nor does it have any chromeos-side sources. (nothing was fetched w/ repo. it's all gclient) Consequently there's no libassistant sources to apply #2's patch into.

As for reproing, the browser here was built by entering the cros-sdk for a public config of kevin:
`cros chrome-sdk --nostart-goma --use-external-config --board=kevin`
`autoninja -C out_kevin/Release chromiumos_preflight`

Deploying it to the device and running a test fails as reported:
`deploy_chrome --build-dir out_kevin/Release/`
And on the device:
`local_test_runner ui.ChromeLogin`

Here's the contents of /etc/lsb-release for the devices in question:
CHROMEOS_RELEASE_APPID={92A7272A-834A-47A3-9112-E8FD55831660}
CHROMEOS_BOARD_APPID={92A7272A-834A-47A3-9112-E8FD55831660}
CHROMEOS_CANARY_APPID={90F229CE-83E2-4FAF-8479-E368A34938B1}
DEVICETYPE=CHROMEBOOK
CHROMEOS_RELEASE_BOARD=kevin
GOOGLE_RELEASE=11327.0.0-rc2
CHROMEOS_DEVSERVER=http://swarm-cros-885.c.chromeos-bot.internal:8080
CHROMEOS_RELEASE_BUILDER_PATH=kevin-full/R73-11327.0.0-rc2
CHROMEOS_RELEASE_BUILD_NUMBER=11327
CHROMEOS_RELEASE_BRANCH_NUMBER=0
CHROMEOS_RELEASE_CHROME_MILESTONE=73
CHROMEOS_RELEASE_PATCH_NUMBER=0-rc2
CHROMEOS_RELEASE_TRACK=testimage-channel
CHROMEOS_RELEASE_DESCRIPTION=11327.0.0-rc2 (Continuous Builder - Builder: N/A) kevin
CHROMEOS_RELEASE_BUILD_TYPE=Continuous Builder - Builder: N/A
CHROMEOS_RELEASE_NAME=Chromium OS
CHROMEOS_RELEASE_VERSION=11327.0.0-rc2
CHROMEOS_AUSERVER=http://swarm-cros-885.c.chromeos-bot.internal:8080/update
Owner: g...@chromium.org
ok, this is pretty strange. 
We saw this problem a couple of weeks ago and we fixed by disabling the thinlto cache. 
However, simplechrome should not be using thinlto. So, I don't understand why this bug shows up here.
Assigning to gbiv since cmtice will be on vacation. 
gbiv@ please take a look. 
In the meantime, we are reverting the change to force enable lld for simple chrome. 
and gbiv is not available until Jan 2nd .. 
Started to look into this earlier today. This is unrelated to the previous crash-on-login issue, since this configuration enables neither ThinLTO nor CFI. The repro I have shows that the crash is in libgcc's unwinder, which we're invoking (indirectly) from here: https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/frame/frame.cc?type=cs&q=blink::frame::frame&sq=package:chromium&g=0&l=254

We're trying to access memory at 0xfffffff0, which seems a biiit off for 32-bit userspace code. :)

Overall, looks to me like Chrome's in an otherwise-good state, so my bet is that this is going to take figuring out what lld's placing somewhere that libgcc isn't expecting it to be placed. Stay tuned.
any idea why this only shows up without chrome_internal? (is that the case?)
I can only make this reproduce when is_chrome_branded = false. Importantly, that bit, when true, makes us exclude unwinding information. Given that this crash is in the unwinder, the presence of unwinding bits causing this makes sense. :)

Looks like both gdb and libgcc take issue with this particular stack trace, though: it ends in a call to Builtin_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit, which is hand-rolled asm that (AIUI) acts as a bridge from JS code to native. gdb throws its hands up and says "Backtrace stopped: previous frame inner to this frame (corrupt stack?)". libgcc, like said, just crashes.

May/may not be related, but while I was poking through the code, I found that CrOS disables frame pointers exclusively on ARM due to a long-fixed bug: https://bugs.llvm.org/show_bug.cgi?id=18505 . May/may not be relevant here, but I was unable to get this to repro on amd64-generic, so...

In any case, it's strange that gold does well here. Will continue to dig.
I think I have the cause nailed down; will start trying to figure out a reasonable fix tomorrow.

First, I'd like to note that building with `is_chrome_branded = false` and `strip`ing out .ARM.exidx and .ARM.extab makes the bug go away. This is because the unwinder appears to refuse to do anything when .ARM.exidx isn't present.

Second, to explain what's going wrong, I need to go into how unwinding in ARM works.

.ARM.exidx is basically a massive, sorted array of `struct { signed int rel_addr; unsigned int flags; };`. `rel_addr` "points" to a function that the current array entry describes, and `flags` describes how to unwind the function at `rel_addr` (if doing so is even possible).

The relevant exidx entry for a function is looked up using binary search. There can apparently be a 1:many mapping between .ARM.exidx entries and logical functions (the spec notes that "a smart linker may be able to group such [exidx] entries (for smaller runtime table size).")

As it happens, the exidx entry that we're picking up on for Builtin_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit is the exidx entry for C++ function that gets laid out *before* Builtin_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit. Namely, _ZN2v88internal8Snapshot19DefaultSnapshotBlobEv.

I'd like to emphasize *C++ function* above. This is because we have a large blob of back-to-back Builtin_ functions in our binary, which are literally just asm embedded in a generated .cc file, e.g. (different builtin, same idea)

  __asm__(V8_ASM_LABEL("Builtins_AdaptorWithExitFrame"));
  __asm__(
    ".octa 0xe24dd004e1a04080e2800005e591700f\n"
    ".octa 0xee0eca10e51acfc3e52d3004e92d0012\n"
    ".octa 0xe1a00000ea029fcce1a01002ed8dea03\n"
    ".octa 0xcccccccccccccccc0000000300000000\n"
  );

Looking at the raw bits of exidx, it appears that lld makes no attempt to generate exidx entries for these Builtin_ functions. Gold, however, generates an exidx entry just at the end of _ZN2v88internal8Snapshot19DefaultSnapshotBlobEv that has special "don't try to unwind me," flags set. This exidx entry applies for the entirety of our massive block of Builtin_ functions, so gold doesn't have these unwinding issues.

I don't know if there's something in the code that's prompting gold to do this, or if gold has to go out of its way to try and detect this. Will dig deeper on that tomorrow. But we have a plausible explanation now, at least.
Don't know if its just me but I can repro this regardless of `is_chrome_branded = false` or `is_chrome_branded = true`.

sdk kevin R73-11525.0.0
Chrome @ 7eef64dcd503

Correction: Chrome OS @ R73-11546.0.0 sorry
Apologies -- are you able to repro it if you have both `is_chrome_branded = true` and `is_official_build = true`? (For some reason, I thought the former required the latter, but apparently not. So you need both to flip the gn bit that appears to matter here: https://cs.chromium.org/chromium/src/build/config/compiler/compiler.gni?type=cs&l=87 )
Setting both `is_chrome_branded = true` and `is_official_build = true` works! Thanks a ton! (TIL too that former doesnt imply the latter).
Filed upstream bug: https://bugs.llvm.org/show_bug.cgi?id=40277

After discussing with llozano, we'd like to turn lld back on for official+branded builds, and will revert to gold for the others. Working on that for simplechrome and chromeos-base/chromeos-chrome now.
Project Member

Comment 18 by bugdroid1@chromium.org, Jan 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/ee0c2f1e1a3e344c8d07abbee288cc91f6f869c0

commit ee0c2f1e1a3e344c8d07abbee288cc91f6f869c0
Author: George Burgess IV <gbiv@google.com>
Date: Fri Jan 11 17:48:00 2019

SimpleChrome: use lld in branded+official builds

We originally landed lld for all builds
(I3e596f088dbf69c4288906fe375a1db3ddce7555). Sadly, it appears that lld
has a bug in the unwinding information that it emits.

Because official, branded builds of Chrome opt out of unwind tables, it
should be safe to turn lld back on for those.

BUG=chromium:917504
TEST=`cros chrome-sdk --nostart-goma --board=kevin --internal` has
      use_lld=true
     `cros chrome-sdk --nostart-goma --board=kevin` has use_lld=false

Change-Id: I28a9c4ec24132d83c435ee71dcce517230444e68
Reviewed-on: https://chromium-review.googlesource.com/1403882
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: George Burgess <gbiv@chromium.org>
Reviewed-by: Luis Lozano <llozano@chromium.org>

[modify] https://crrev.com/ee0c2f1e1a3e344c8d07abbee288cc91f6f869c0/cli/cros/cros_chrome_sdk.py

Cc: bpastene@chromium.org alemate@chromium.org goog...@chromium.org
 Issue 921209  has been merged into this issue.
Project Member

Comment 20 by bugdroid1@chromium.org, Jan 12

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/ddd8ec4f14cde5669066cd3f4cee5ad44d4c8cbc

commit ddd8ec4f14cde5669066cd3f4cee5ad44d4c8cbc
Author: George Burgess IV <gbiv@google.com>
Date: Sat Jan 12 22:04:59 2019

chromeos-chrome: disable lld in non-internal builds

lld has a bug in how it emits unwinding info. Since Chrome sometimes
tries to unwind itself (for `StackTrace`s), this causes Chrome to crash.
chrome_internal builds explicitly disable the emission of said bugged
unwinding info, so continuing to use lld for those should be safe.

This is intended to be reverted when the relevant bug is fixed.

BUG=chromium:917504
TEST=`emerge-kevin` with and without chrome_internal; args.gn looked
     reasonable, and the expected warning was emitted.
     emerge-kevin chromeos-base/chromeos-chrome succeeded with:
     	- USE=
	- USE=chrome_internal
	- USE="gold -lld"

Change-Id: I7958310330cd41ddd78b98da0c1f1f83e0af41bb
Reviewed-on: https://chromium-review.googlesource.com/1405444
Commit-Ready: George Burgess <gbiv@chromium.org>
Tested-by: George Burgess <gbiv@chromium.org>
Reviewed-by: Manoj Gupta <manojgupta@chromium.org>

[modify] https://crrev.com/ddd8ec4f14cde5669066cd3f4cee5ad44d4c8cbc/chromeos-base/chromeos-chrome/chromeos-chrome-9999.ebuild

Comment 21 by sahel@chromium.org, Jan 16 (6 days ago)

This is still happening with R73-11580.0.0 and Chromium ToT

Comment 22 by g...@chromium.org, Jan 16 (6 days ago)

Looks like we're sourcing some set of base args from a chrome_internal USE'd environment (?) in some cases. So we need to go out of our way to turn use_lld off in non-`--internal` builds.

https://chromium-review.googlesource.com/c/chromiumos/chromite/+/1415915/

Thanks for flagging that!

Comment 23 by g...@chromium.org, Jan 16 (6 days ago)

Chatted with llozano offline; he said that it can potentially take many days for the simplechrome environment file to reflect changes in Chrome's ebuild (like the fix in I7958310330cd41ddd78b98da0c1f1f83e0af41bb), which is probably why this buildbot is still trying to use lld.

Rather than waiting for the environment file to pick up the ebuild change, we're going forward with the SimpleChrome fix. We plan to revert both when lld is working again anyway, so...

Comment 24 by achuith@chromium.org, Jan 18 (5 days ago)

Blocking: 923168
Project Member

Comment 25 by bugdroid1@chromium.org, Jan 18 (5 days ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3c031af060bbdfd54b16442b7891e7237768586a

commit 3c031af060bbdfd54b16442b7891e7237768586a
Author: Takuto Ikuta <tikuta@chromium.org>
Date: Fri Jan 18 04:12:16 2019

build: Default use_lld to true on ChromeOS for x64

This is a partial reland of 12b1bbff0b65f333ea2fa270259412cacbe056c3

I confirmed current lld can link chrome with the same config of previously failed build.
https://ci.chromium.org/buildbot/chromium.chrome/Google%20Chrome%20ChromeOS/59619

Original change's description:
> build: Default use_lld to true on ChromeOS.
>
> As of https://chromium-review.googlesource.com/c/1372768 ChromeOS
> chrome is always linked with lld. Because ChromeOS's build system
> always overrides the value of use_lld this change should have no
> effect on official ChromeOS builds, but it should make ChromeOS on
> Linux more consistent with "real" ChromeOS.
>
> Bug:  701659 
> Change-Id: I447e1b2dfb6cc970f2c2dd64d396b5ba58bc2ef2
> Reviewed-on: https://chromium-review.googlesource.com/c/1377139
> Reviewed-by: Nico Weber <thakis@chromium.org>
> Commit-Queue: Peter Collingbourne <pcc@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#616485}

Bug:  701659 , 923168, 917504
Change-Id: Iedb48cfe5b3b9c189dcf1b4c04bea998ee009d2c
Reviewed-on: https://chromium-review.googlesource.com/c/1419558
Auto-Submit: Takuto Ikuta <tikuta@chromium.org>
Reviewed-by: George Burgess <gbiv@chromium.org>
Commit-Queue: George Burgess <gbiv@chromium.org>
Commit-Queue: Takuto Ikuta <tikuta@chromium.org>
Cr-Commit-Position: refs/heads/master@{#624005}
[modify] https://crrev.com/3c031af060bbdfd54b16442b7891e7237768586a/build/config/compiler/compiler.gni

Project Member

Comment 26 by bugdroid1@chromium.org, Jan 18 (4 days ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/0c311f11aef52f43a4bbed55d258d2901cd3d403

commit 0c311f11aef52f43a4bbed55d258d2901cd3d403
Author: George Burgess IV <gbiv@google.com>
Date: Fri Jan 18 08:45:58 2019

SimpleChrome: forcibly turn lld off in external builds

I28a9c4ec24132d83c435ee71dcce517230444e68 was apparently incomplete,
since the simplechrome builder is still failing

BUG=chromium:917504
TEST=Three (former two have no mention of use_lld):
     - cros chrome-sdk --board=kevin --nocolor
       --log-level=debug --nostart-goma --use-external-config
     - cros chrome-sdk --board=kevin --nocolor
       --log-level=debug --nostart-goma
     - cros chrome-sdk --board=kevin --nocolor
       --log-level=debug --nostart-goma --internal

Change-Id: Ia2f944729da5e7de117d6a098825fd4fc467b427
Reviewed-on: https://chromium-review.googlesource.com/1415915
Commit-Ready: George Burgess <gbiv@chromium.org>
Tested-by: George Burgess <gbiv@chromium.org>
Reviewed-by: Luis Lozano <llozano@chromium.org>

[modify] https://crrev.com/0c311f11aef52f43a4bbed55d258d2901cd3d403/cli/cros/cros_chrome_sdk.py

Project Member

Comment 27 by bugdroid1@chromium.org, Jan 18 (4 days ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/21092a421cede9e75d24044c840b99c2924d0c5a

commit 21092a421cede9e75d24044c840b99c2924d0c5a
Author: Takashi Sakamoto <tasak@google.com>
Date: Fri Jan 18 09:41:10 2019

Revert "build: Default use_lld to true on ChromeOS for x64"

This reverts commit 3c031af060bbdfd54b16442b7891e7237768586a.

Reason for revert: suspect compile failure on chromium.chrome/Google Chrome ChromeOS
https://logs.chromium.org/logs/chromium/bb/chromium.chrome/Google_Chrome_ChromeOS/60740/+/recipes/steps/compile/0/stdout

ld.lld: error: found local symbol '__bss_start' in global part of symbol table in file libassistant.so
ld.lld: error: found local symbol '_end' in global part of symbol table in file libassistant.so
ld.lld: error: found local symbol '_edata' in global part of symbol table in file libassistant.so
ld.lld: error: found local symbol '__stop_malloc_hook' in global part of symbol table in file libassistant.so
ld.lld: error: found local symbol '__start_malloc_hook' in global part of symbol table in file libassistant.so
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Original change's description:
> build: Default use_lld to true on ChromeOS for x64
> 
> This is a partial reland of 12b1bbff0b65f333ea2fa270259412cacbe056c3
> 
> I confirmed current lld can link chrome with the same config of previously failed build.
> https://ci.chromium.org/buildbot/chromium.chrome/Google%20Chrome%20ChromeOS/59619
> 
> Original change's description:
> > build: Default use_lld to true on ChromeOS.
> >
> > As of https://chromium-review.googlesource.com/c/1372768 ChromeOS
> > chrome is always linked with lld. Because ChromeOS's build system
> > always overrides the value of use_lld this change should have no
> > effect on official ChromeOS builds, but it should make ChromeOS on
> > Linux more consistent with "real" ChromeOS.
> >
> > Bug:  701659 
> > Change-Id: I447e1b2dfb6cc970f2c2dd64d396b5ba58bc2ef2
> > Reviewed-on: https://chromium-review.googlesource.com/c/1377139
> > Reviewed-by: Nico Weber <thakis@chromium.org>
> > Commit-Queue: Peter Collingbourne <pcc@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#616485}
> 
> Bug:  701659 , 923168, 917504
> Change-Id: Iedb48cfe5b3b9c189dcf1b4c04bea998ee009d2c
> Reviewed-on: https://chromium-review.googlesource.com/c/1419558
> Auto-Submit: Takuto Ikuta <tikuta@chromium.org>
> Reviewed-by: George Burgess <gbiv@chromium.org>
> Commit-Queue: George Burgess <gbiv@chromium.org>
> Commit-Queue: Takuto Ikuta <tikuta@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#624005}

TBR=thakis@chromium.org,pcc@chromium.org,gbiv@chromium.org,tikuta@chromium.org

Change-Id: Ieba8a5113b944ef03f954388ebb076ff546fdd08
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  701659 , 923168, 917504
Reviewed-on: https://chromium-review.googlesource.com/c/1420493
Reviewed-by: Takashi Sakamoto <tasak@google.com>
Commit-Queue: Takashi Sakamoto <tasak@google.com>
Cr-Commit-Position: refs/heads/master@{#624051}
[modify] https://crrev.com/21092a421cede9e75d24044c840b99c2924d0c5a/build/config/compiler/compiler.gni

Project Member

Comment 28 by bugdroid1@chromium.org, Jan 19 (4 days ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/32a4d8d19437eff3748f84a61b98abb35f4b702b

commit 32a4d8d19437eff3748f84a61b98abb35f4b702b
Author: Manoj Gupta <manojgupta@google.com>
Date: Sat Jan 19 04:05:24 2019

chromeos-chrome: Keep lld as linker for arm64.

For arm64, gold can't link the chrome binary with thinlto+cfi.
So keep lld as linker even for non-internal builds.

BUG=chromium:917504
TEST=kevin64-full chrome builds.

Change-Id: Ie82f7b3f20149194e1f19387461afc696e6a5da3
Reviewed-on: https://chromium-review.googlesource.com/1418550
Commit-Ready: Manoj Gupta <manojgupta@chromium.org>
Tested-by: Manoj Gupta <manojgupta@chromium.org>
Reviewed-by: George Burgess <gbiv@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/32a4d8d19437eff3748f84a61b98abb35f4b702b/chromeos-base/chromeos-chrome/chromeos-chrome-9999.ebuild

Sign in to add a comment