
Issue 715331


Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Jun 2017
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocked on:
issue 732961

Blocking:
issue 711461




binutils-2.27: daisy/elm fail to boot

Project Member Reported by rahulchaudhry@chromium.org, Apr 25 2017

Issue description

Emerged binutils-2.27 in the chroot (cross-armv7a-cros-linux-gnueabi/binutils and cross-arm-none-eabi/binutils).
Built daisy image. Build was successful.

Flashed the image on a daisy device using "cros flash ${IP} ${image}".
Cros flash reported that the image was flashed successfully.

The device is up: it responds to pings, and I can ssh into it.
However, the display is stuck at the Chrome logo; it never reaches the login screen.
After a few minutes, the display goes to sleep (while still at the Chrome logo).
It cannot be woken by keyboard or trackpad.
Ping and ssh keep working.


 
Summary: binutils-2.27: daisy/elm fail to boot (was: binutils-2.27: daisy fails to boot)
elm (arm64) shows the same symptoms as daisy (not surprisingly, since elm user-space is arm 32-bit, same as daisy).

Built an elm image with binutils-2.27. Build was successful.
Flashed the image on an elm device using "cros flash ${IP} ${image}".
Cros flash reported that the image was flashed successfully.

The device is up: it responds to pings, and I can ssh into it.
However, the display is stuck at the Chrome logo; it never reaches the login screen.
After a few minutes, the display goes to sleep (while still at the Chrome logo).
It cannot be woken by keyboard or trackpad.
Ping and ssh keep working.

From /var/log/chrome/chrome on the daisy device:

[4891:4891:0517/111624.962689:FATAL:login_display_host_impl.cc(957)] Renderer crash on login window
#0 0x0000b3710c04 <unknown>
#1 0x0000b372160c <unknown>
#2 0x0000b2ccce9c <unknown>
#3 0x0000b2a27458 <unknown>
#4 0x0000b2993454 <unknown>
#5 0x0000b3783d00 <unknown>
#6 0x0000b3725dcc <unknown>
#7 0x0000b3726086 <unknown>
#8 0x0000b3726338 <unknown>
#9 0x0000b3727574 <unknown>
#10 0x0000b374168e <unknown>
#11 0x0000b3511f4a <unknown>
#12 0x0000b28189fc <unknown>
#13 0x0000b281a808 <unknown>
#14 0x0000b28158f6 <unknown>
#15 0x0000b34f76b4 <unknown>
#16 0x0000b350c382 <unknown>
#17 0x0000b34f6c2c <unknown>
#18 0x0000b250bd2e <unknown>
#19 0x0000b19b98b8 __libc_start_main


/var/log/chrome/chrome on the elm device has a similar crash:
[5592:5592:0517/112630.188076:FATAL:login_display_host_impl.cc(957)] Renderer crash on login window
#0 0x0000ad6e896c <unknown>
#1 0x0000ad6f8cb8 <unknown>
#2 0x0000accbfdda <unknown>
#3 0x0000aca1bea4 <unknown>
#4 0x0000ac9884be <unknown>
#5 0x0000ad759008 <unknown>
#6 0x0000ad6fd36e <unknown>
#7 0x0000ad6fd626 <unknown>
#8 0x0000ad6fd8d8 <unknown>
#9 0x0000ad6feae8 <unknown>
#10 0x0000ad717e8a <unknown>
#11 0x0000ad4ec206 <unknown>
#12 0x0000ac7fcd5c <unknown>
#13 0x0000ac7feb50 <unknown>
#14 0x0000ac7f9c5a <unknown>
#15 0x0000ad4d3ff4 <unknown>
#16 0x0000ad4e66b0 <unknown>
#17 0x0000ad4d3570 <unknown>
#18 0x0000ac4f1a32 <unknown>
#19 0x0000f3f798b8 __libc_start_main
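The `<unknown>` frames above could in principle be resolved offline with binutils' addr2line, once the load base of the chrome binary is known (from /proc/&lt;pid&gt;/maps on the device). A minimal sketch of the arithmetic; the base address and binary path below are hypothetical, not taken from this report:

```shell
#!/bin/sh
# Sketch: turning one absolute frame address into a module-relative offset.
# ADDR is frame #0 from the daisy trace above; BASE is a HYPOTHETICAL load
# base -- the real one must be read from /proc/<pid>/maps on the device.
ADDR=0xb3710c04
BASE=0xb2800000
OFFSET=$(printf '0x%x' $((ADDR - BASE)))
echo "module-relative offset: $OFFSET"
# With an unstripped binary available, addr2line would then resolve it, e.g.:
#   addr2line -Cfe /build/daisy/opt/google/chrome/chrome "$OFFSET"
```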

With newer builds, /var/log/chrome/chrome has this instead:

[5054:5054:1231/170147.690129:FATAL:image_skia.cc(427)] Check failed: storage_.get(). 

A quick look into the source shows that storage_ is declared as:

  scoped_refptr<internal::ImageSkiaStorage> storage_;


Also, looking inside dmesg for terminated processes:

$ dmesg | grep -w terminated
[    7.062688] init: daisydog main process (589) terminated with status 1
[    7.432251] init: imageloader main process (468) terminated with status 1
[   18.614957] init: ui main process (933) terminated with status 2
[   19.062097] init: imageloader main process (1476) terminated with status 1
[   25.185333] init: ui main process (1567) terminated with status 2
[   25.522056] init: imageloader main process (2022) terminated with status 1
[   31.586891] init: ui main process (2115) terminated with status 2
[   31.907092] init: imageloader main process (2571) terminated with status 1
[   39.175141] init: ui main process (2663) terminated with status 2
[   39.717079] init: imageloader main process (3503) terminated with status 1
[   45.740529] init: ui main process (3636) terminated with status 2
[   46.067092] init: imageloader main process (4104) terminated with status 1
[   52.346183] init: ui main process (4196) terminated with status 2
[   52.672149] init: imageloader main process (4657) terminated with status 1
[   58.506347] init: ui main process (4751) terminated with status 2
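The respawn loop is easier to see when the terminated jobs are tallied per job name. A small awk pass over the log, shown here on an inlined excerpt of the output above so the sketch is self-contained (on the device you would pipe `dmesg | grep -w terminated` into it instead):

```shell
#!/bin/sh
# Count how many times each init job died, keyed by the word after "init:".
awk '{ for (i = 1; i <= NF; i++)
         if ($i == "init:") { count[$(i+1)]++; break } }
     END { for (job in count) print job, count[job] }' <<'EOF' | sort
[    7.062688] init: daisydog main process (589) terminated with status 1
[   18.614957] init: ui main process (933) terminated with status 2
[   25.185333] init: ui main process (1567) terminated with status 2
[   31.586891] init: ui main process (2115) terminated with status 2
EOF
```

On the full log above this shows `ui` dying with status 2 roughly every 6-7 seconds, i.e. the session keeps being respawned and keeps crashing.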

Is image_skia.cc compiled into a shared library?
It cannot be part of the chrome executable, since you tried rebuilding that and it did not help...


Regarding #2, can you try disabling ASLR and looking up the addresses manually? Just in case it is different from image_skia.cc ...

$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
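For reference, `randomize_va_space` controls ASLR kernel-wide: 0 disables it, 1 randomizes stack/mmap/VDSO, and 2 (the usual default) additionally randomizes the heap. A quick sketch for checking the current setting, with a fallback default of 2 if the file cannot be read:

```shell
#!/bin/sh
# Report the current kernel ASLR setting; assume the default (2) if
# /proc/sys/kernel/randomize_va_space is unreadable.
v=$(cat /proc/sys/kernel/randomize_va_space 2>/dev/null || echo 2)
case "$v" in
  0) echo "ASLR disabled" ;;
  1) echo "ASLR: stack/mmap/vdso randomized" ;;
  2) echo "ASLR: full (incl. heap)" ;;
  *) echo "ASLR: unknown setting $v" ;;
esac
```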


I don't think ASLR is causing any confusion here.

For one thing, the same location is printed every time: image_skia.cc(427)

And the FATAL message is coming from an assert in the code:

  CHECK(storage_.get());

storage_ is declared as a scoped_refptr<>, so it looks like it gets properly initialized in builds with binutils-2.25, but does not get initialized in builds with binutils-2.27 for some reason.

I was wondering if resolving the stack trace in #2 would bring some insights. We could probably learn how storage_ is initialized and compare the code (generated by 2.25 vs. 2.27) that initializes it.
Yes, symbolizing the stack trace from the crash will be helpful. However, note that the place of crash has changed from a few weeks ago.
In #2, it was login_display_host_impl.cc (consistently, across multiple boots).
In #3, it is in image_skia.cc (consistently, across multiple boots).
After another repo sync / chrome sync, the crash site might move somewhere else!!


I've done another experiment, where I did two builds in different chroots (both with the exact same repo state).
The first build was with current binutils-2.25, everything built locally with --nousepkg. The built image boots fine.
The second build was with new binutils-2.27, but no other change. Everything built locally with --nousepkg. The built image does not boot.

Then I swapped the /build/daisy/packages directories between the two chroots, i.e. the prebuilt packages in the first chroot were replaced with the ones built with binutils-2.27, and the prebuilt packages in the second chroot were replaced with the ones built with binutils-2.25.

I built two images again. Surprisingly, the image from the old chroot still boots fine, and the image from the new chroot still has this issue!!

This suggests that the root cause of this problem is not in the chrome package or any other package that gets emerged into the board root.
It's **somewhere else**.

One of the remaining possibilities is glibc, which is not emerged into the build root (it is copied instead). The binutils upgrade may be having some bad effect on how glibc or the dynamic linker itself is built, leading to the symptoms here.

I'm testing this theory right now.

I have confirmed that the issue is with glibc-2.23-r6.

If I swap the "/var/lib/portage/pkgs/cross-armv7a-cros-linux-gnueabi/glibc-2.23-r6.tbz2" package between the two chroots and build the images again, their fates are reversed:
- The chroot which built the working image initially (with binutils 2.25) now builds an image that is unbootable.
- The chroot which built the unbootable image initially (with binutils 2.27) now builds an image that boots fine.
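The swap itself is just moving one file through a temporary name. A generic sketch, demonstrated here on throwaway files in a temp directory; on the real setup the two paths would be the glibc-2.23-r6.tbz2 under /var/lib/portage/pkgs/cross-armv7a-cros-linux-gnueabi/ inside each chroot:

```shell
#!/bin/sh
# Swap two files via a temporary name, demonstrated on throwaway files.
set -e
dir=$(mktemp -d)
echo "built with binutils-2.25" > "$dir/a.tbz2"
echo "built with binutils-2.27" > "$dir/b.tbz2"

swap() {                              # swap files $1 and $2, temp dir in $3
  tmp=$(mktemp -u "$3/swap.XXXXXX")
  mv "$1" "$tmp"; mv "$2" "$1"; mv "$tmp" "$2"
}
swap "$dir/a.tbz2" "$dir/b.tbz2" "$dir"

cat "$dir/a.tbz2"                     # now the 2.27-built package
cat "$dir/b.tbz2"                     # now the 2.25-built package
rm -r "$dir"
```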

Blockedon: 732961
Status: Fixed (was: Assigned)
Fixed by https://android-review.googlesource.com/#/c/415207
Status: Verified (was: Fixed)
Closing. Please reopen if it's not fixed. Thanks!
