
Issue 617557

Starred by 3 users

Issue metadata

Status: Verified
Owner:
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




Celes keeps rebooting w/ latest v3.18 kernel as of Jun. 6th (tip commit: 860bb18c8ec3)

Reported by gs0...@gmail.com, Jun 6 2016

Issue description

UserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.63 Safari/537.36
Platform: Celes

Steps to reproduce the problem:
1. repo sync to latest source of Jun. 6th
2. Build kernel, e.g. emerge-celes sys-kernel/chromeos-kernel-3_18
3. Update kernel, e.g. ./update_kernel.sh --remote=10.5.232.32

What is the expected behavior?
Stable boot to login window

What went wrong?
Device keeps rebooting, potentially hitting a kernel panic

Did this work before? Yes, at commit ef45d91ecae6 (2016-06-02, Jaiganesh Narayanan (b1), "CHROMIUM: qcom: rng: enable rng hardware")

Chrome version: 49.0.2623.63  Channel: n/a
OS Version: 3.18.0
Flash Version: Shockwave Flash 21.0 r0

I did a naive bisection; it looks to me like the problematic commit landed on Jun. 3rd.

Bad at tip commit 860bb18c8ec3,
Good at Jun. 2nd commit ef45d91ecae6,
Bad at Jun. 4th commit 6be651d49e7a

Fails on Celes (BSW); I did not hit the reboot when I used a Skylake Chromebook.
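
The manual bisection described above can be sketched as a binary search over an ordered commit list. This is only an illustration: the function name `first_bad`, the `is_bad` callback (standing in for "build, flash, and see whether the device reboots"), and the placeholder hashes are assumptions, not part of the report. It also assumes that every commit after the first bad one is bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    history switches from good to bad exactly once.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Only ef45d91ecae6, 6be651d49e7a and 860bb18c8ec3 come from the report;
# the "placeholder-*" entries are hypothetical intermediate commits.
history = [
    "ef45d91ecae6",
    "placeholder-1",
    "placeholder-2",
    "6be651d49e7a",
    "placeholder-3",
    "860bb18c8ec3",
]

# For illustration only, pretend "placeholder-2" introduced the regression.
bad_from = history.index("placeholder-2")
culprit = first_bad(history, lambda c: history.index(c) >= bad_from)
```

In practice `is_bad` would wrap the build and update steps from the reproduction list, and `git bisect` automates the same search.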
 

Comment 1 by gs0...@gmail.com, Jun 6 2016

It looks like the same problem occurs if I build a whole image onto a USB key and boot Celes with it, i.e. ./build_image --board=${BOARD} --noenable_rootfs_verification test

 

Comment 2 by gs0...@gmail.com, Jun 6 2016

Rolled back to a safe commit, rebuilt with tty-enabling options, updated, and captured UART logs (attached). Over several rounds I observed what looks like a consistent crash pattern, as in the example below:

[    9.269364] BUG: unable to handle kernel paging request at ffffffffc06d4def
[    9.272681] IP: [<ffffffffba023bf4>] strcpy+0xc/0x18
[    9.281387] PGD 3aa1b067 PUD 3aa1d067 PMD 604b2067 PTE 800000005f20e161
[    9.288031] Oops: 0003 [#1] PREEMPT SMP 
[    9.288031] gsmi: Log Shutdown Reason 0x03
<snip>
[    9.341856] CPU: 1 PID: 132 Comm: udevd Tainted: G        W      3.18.0 #5
[    9.341856] Hardware name: GOOGLE Celes, BIOS Google_Celes.7287.92.34 05/24/2016
<snip>
[    9.488223] Call Trace:
[    9.488223]  [<ffffffffc06d3389>] init_module+0x2e0389/0x2e03e5 [snd_soc_sst_cht_bsw_rt5645]
[    9.488223]  [<ffffffffba1aa02a>] platform_drv_probe+0x4b/0x91
[    9.488223]  [<ffffffffba1a55fa>] ? devices_kset_move_last+0x60/0x64
[    9.488223]  [<ffffffffba1a86ec>] driver_probe_device+0x109/0x2b1
[    9.488223]  [<ffffffffba1a895a>] __driver_attach+0x5e/0x81
[    9.488223]  [<ffffffffba1a88fc>] ? __device_attach_driver+0x68/0x68
[    9.488223]  [<ffffffffba1a77bd>] bus_for_each_dev+0x8c/0xaf
[    9.488223]  [<ffffffffba1a817a>] driver_attach+0x1e/0x20
[    9.488223]  [<ffffffffba1a7e02>] bus_add_driver+0xeb/0x1e3
[    9.488223]  [<ffffffffba1a913d>] driver_register+0x8f/0xcc
[    9.488223]  [<ffffffffc03f3000>] ? 0xffffffffc03f3000
[    9.488223]  [<ffffffffba1a9fa4>] __platform_driver_register+0x4a/0x4c
[    9.488223]  [<ffffffffc03f3017>] init_module+0x17/0x1000 [snd_soc_sst_cht_bsw_rt5645]
[    9.488223]  [<ffffffffb9e003b5>] do_one_initcall+0x188/0x19d
[    9.488223]  [<ffffffffb9f091ae>] ? __vunmap+0xac/0xb7
[    9.488223]  [<ffffffffb9ea0481>] load_module+0x15e4/0x1bb4
[    9.488223]  [<ffffffffb9ea0bd2>] SyS_finit_module+0x86/0xab
[    9.488223]  [<ffffffffba421edc>] system_call_fastpath+0x1c/0x21
[    9.488223] Code: 7b ba 31 c0 e8 7a 84 3f 00 48 89 de 48 c7 c7 3a 76 79 ba 31 c0 e8 69 84 3f 00 5b 41 5c 5d c3 55 48 89 f8 31 d2 48 89 e5 8a 0c 16 <88> 0c 10 48 ff c2 84 c9 75 f3 5d c3 55 48 89 f8 31 c9 48 89 e5 
[    9.488223] RIP  [<ffffffffba023bf4>] strcpy+0xc/0x18
[    9.488223]  RSP <ffff8800788ffb98>
[    9.488223] CR2: ffffffffc06d4def
[    9.488223] ---[ end trace 0c090b7b0e56978c ]---
[    9.488223] Kernel panic - not syncing: Fatal exception
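
As a reading aid for the oops above, here is a minimal sketch that decodes the page-fault error code and the PTE flags from two of the logged lines. The bit meanings are the standard x86-64 ones; the helper names `decode_error_code` and `decode_pte` are made up for this example. The result (a kernel-mode write to a page that is present but not writable) is consistent with strcpy storing into read-only module memory, though the log alone does not prove which string was involved.

```python
import re
from typing import Dict, List

# x86 page-fault error code bits: (meaning if set, meaning if clear).
ERROR_BITS = {
    0: ("protection violation", "page not present"),
    1: ("write access", "read access"),
    2: ("user mode", "kernel mode"),
}


def decode_error_code(code: int) -> List[str]:
    """Describe bits 0-2 of an x86 page-fault error code."""
    return [
        set_txt if (code >> bit) & 1 else clear_txt
        for bit, (set_txt, clear_txt) in ERROR_BITS.items()
    ]


def decode_pte(pte: int) -> Dict[str, bool]:
    """Extract a few x86-64 page-table-entry flags."""
    return {
        "present": bool(pte & (1 << 0)),
        "writable": bool(pte & (1 << 1)),
        "user": bool(pte & (1 << 2)),
        "no_execute": bool(pte & (1 << 63)),
    }


# Lines copied from the log above.
oops_line = "[    9.288031] Oops: 0003 [#1] PREEMPT SMP"
pte_line = "[    9.281387] PGD 3aa1b067 PUD 3aa1d067 PMD 604b2067 PTE 800000005f20e161"

error_code = int(re.search(r"Oops: ([0-9a-f]+)", oops_line).group(1), 16)
pte_value = int(re.search(r"PTE ([0-9a-f]+)", pte_line).group(1), 16)

fault = decode_error_code(error_code)
flags = decode_pte(pte_value)
```

Here `fault` describes the access that faulted and `flags` describes the mapping at the faulting address (CR2).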

Attachment: minicom.hp.log (155 KB)
FYI, I tried to re-verify my original report.
I downloaded three Celes CPFE images onto a USB key and was still able to hit the reboot issue:

R53-8415.0.0  2016-06-05
R53-8418.0.0  2016-06-06
R53-8419.0.0  2016-06-06


Cc: sha...@chromium.org
Components: OS>Kernel
Labels: -Pri-2 Pri-1
This is also reproducible on Edgar, and possibly on all Braswell devices. It was probably working in 8406.0.0 and is failing in 8409.0.0.

https://crosland.corp.google.com/log/8406.0.0..8409.0.0
Cc: mshe...@chromium.org xixuan@chromium.org rohi...@chromium.org haddowk@chromium.org vpalatin@chromium.org
Issue 618020 has been merged into this issue.
Cc: rajatja@google.com
This CL is almost certainly the cause:

https://chromium.googlesource.com/chromiumos/third_party/kernel/+/9e0680b08d99a49af9ce5238cacfdefd4f57d80f

Rajat, can you please take a look?
Owner: rajatja@chromium.org
Status: Assigned (was: Unconfirmed)
Owner: bleung@chromium.org
Rajat is on paternity leave.  Benson, can you PTAL?
Status: Started (was: Assigned)
Sure. Since this is a critical issue across all Braswell systems, let's just revert the patch in question to get Braswell healthy, and then take a closer look at how to enable this more carefully to support the BYT systems coming to 3.18.

Revert here: https://chromium-review.googlesource.com/351113

Looking for a reviewer.

Comment 10 by gs0...@gmail.com, Jun 9 2016

I have verified that Benson's revert works on my Celes.

Meanwhile, using the call stack, I also located a potential fix.
Since I only have a Celes, perhaps chromium-dev could review and evaluate my cherry-picked patch on all BSW devices, and then re-submit Rajat's change?

https://chromium-review.googlesource.com/#/c/351150
Cc: bhthompson@chromium.org dgreid@chromium.org
I don't have any Braswell-based systems with me, but I'm adding Dylan and Bernie, who may be able to help verify on Thursday, Mountain View time.

The patch looks good to me though. We haven't landed the revert yet, and if this one looks good we can skip the revert and just land this fix instead.
I guess Cyan did not fail on this one because it uses a different audio codec. I can confirm that an Ultima fails locally on a canary build. At this point it looks like the CL is in the CQ, and comment #10 indicates this is fixed with the revert, so I think we are good to go; we can verify the build after this lands.
Status: Fixed (was: Started)
Didn't see a commit message here, but Harry's CL was merged:

https://chromium-review.googlesource.com/#/c/351150

Closing. Let me know if we see something like this again after 8434.0.0.
Status: Verified (was: Fixed)
Bulk verified
Cc: -mshe...@chromium.org
