
Issue 640649

Starred by 16 users

Issue metadata

Status: Verified
Owner:
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




One very kernel panic happy Acer Chromebook 14 (Edgar)

Reported by willg...@gmail.com, Aug 24 2016

Issue description

UserAgent: Mozilla/5.0 (X11; CrOS x86_64 8731.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2831.0 Safari/537.36
Platform: Platform 8731.0.0 (Official Build) canary-channel edgar

Steps to reproduce the problem:
Does not appear to have repro steps

What is the expected behavior?
That OS is stable

What went wrong?
Crashes to black screen and reboots

Did this work before? N/A 

Chrome version: 54.0.2831.0  Channel: n/a
OS Version: 8731.0.0
Flash Version: Shockwave Flash 23.0 r0

Happens in version 54.0.2831.0-54.0.2824.5

See attached text file for the recorded kernel crashes.
 
kernel-crashes.txt (255 KB)

Comment 1 by willg...@gmail.com, Aug 30 2016

Still happens in:

Version 55.0.2844.0 canary (64-bit)
Platform 8755.0.0 (Official Build) canary-channel edgar
Firmware Google_Edgar.7287.167.17

Added my event log data. It seems to happen shortly after (but not always) waking up from suspend?
event_log.txt (15.3 KB)
Components: OS>Kernel

Comment 3 Deleted

Comment 4 by willg...@gmail.com, Sep 12 2016

Still happening and this time a bit different. I closed the lid to sleep and it rebooted when I opened the lid instead of resuming from suspend.
latest_kernel_panic (255 KB)
latestest_event_log (14.7 KB)

Comment 5 by willg...@gmail.com, Sep 16 2016

This issue has now made it into the Beta channel :( 

Not adding logs because they're the same as before.

Comment 6 by willg...@gmail.com, Oct 13 2016

This is an extremely severe issue. Is this going to be addressed anytime soon?
Cc: puneetster@chromium.org snanda@chromium.org
Labels: -Pri-2 M-55 ReleaseBlock-Stable Pri-1
Is this only affecting a single device? 

Comment 8 by willg...@gmail.com, Oct 13 2016

Minnie just fell asleep and rebooted too a moment ago on 55.0.2882.0

Nothing showed up in the event log, kernel_crashes or chrome://crashes.

The Edgar has done it repeatedly since 54 was in Canary. Only the current stable (53) is unaffected.

Comment 9 by willg...@gmail.com, Oct 13 2016

Here's a current log on:

Version 55.0.2883.7 dev (64-bit)
Platform 8872.6.2 (Official Build) dev-channel edgar
ARC Version 3337798
Firmware Google_Edgar.7287.167.36



event_log.txt (14.7 KB)

Comment 10 Deleted

Components: -OS>Kernel OS>Kernel>Graphics
Owner: marc...@chromium.org
+marcheu since the crashes look gfx related.

<1>[ 8422.280081] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
<1>[ 8422.280089] IP: [<ffffffff829aa84f>] intel_fbdev_set_suspend+0x9a/0xcf
<4>[ 8422.280091] PGD 0
<4>[ 8422.280094] Oops: 0000 [#1] PREEMPT SMP
<0>[ 8422.283607] gsmi: Log Shutdown Reason 0x03
<4>[ 8422.283633] Modules linked in: ccm evdi uinput rfcomm snd_soc_sst_cht_bsw_rt5645 snd_intel_sst_acpi snd_soc_sst_acpi snd_intel_sst_core uvcvideo videobuf2_vmalloc snd_hda_codec_hdmi videobuf2_memops videobuf2_core memconsole_x86_legacy memconsole i2c_dev snd_hda_intel snd_hda_codec snd_soc_sst_mfld_platform snd_hwdep snd_hda_core zram snd_soc_rt5645 snd_soc_rl6231 fuse ip6table_filter iwlmvm iwlwifi iwl7000_mac80211 cfg80211 btusb btrtl btbcm btintel bluetooth joydev snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device ppp_async ppp_generic slhc tun
<4>[ 8422.283636] CPU: 2 PID: 15709 Comm: kworker/u8:23 Tainted: G        W      3.18.0-12910-g1f90d95 #1
<4>[ 8422.283638] Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.17 05/21/2016
<4>[ 8422.283644] Workqueue: events_unbound async_run_entry_fn
<4>[ 8422.283645] task: ffff88017a33a480 ti: ffff880044274000 task.ti: ffff880044274000
<4>[ 8422.283650] RIP: 0010:[<ffffffff829aa84f>]  [<ffffffff829aa84f>] intel_fbdev_set_suspend+0x9a/0xcf
<4>[ 8422.283651] RSP: 0000:ffff880044277c98  EFLAGS: 00010246
<4>[ 8422.283652] RAX: 0000000000000000 RBX: ffff880078840000 RCX: 000000000000cbca
<4>[ 8422.283653] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffffffff832ee030
<4>[ 8422.283655] RBP: ffff880044277cb8 R08: 0000000000000001 R09: ffffffff8292ffc2
<4>[ 8422.283656] R10: ffff88017b2b2430 R11: 0000000000000005 R12: 0000000000000000
<4>[ 8422.283657] R13: ffff88017a1dbc00 R14: ffff88017a159000 R15: 0000000000000010
<4>[ 8422.283659] FS:  0000000000000000(0000) GS:ffff88017fd00000(0000) knlGS:0000000000000000
<4>[ 8422.283660] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[ 8422.283661] CR2: 00000000000000e0 CR3: 00000000782a6000 CR4: 00000000001007e0
<4>[ 8422.283662] Stack:
<4>[ 8422.283665]  ffff88017b2b2000 ffff880078840000 ffff880078848ac0 ffff88007884a4d0
<4>[ 8422.283668]  ffff880044277ce8 ffffffff82935f61 ffff88017a8f2800 ffff88017a8f2898
<4>[ 8422.283670]  ffffffff82e6d000 000007a901a3b631 ffff880044277cf8 ffffffff82935fc9
<4>[ 8422.283671] Call Trace:
<4>[ 8422.283678]  [<ffffffff82935f61>] i915_drm_resume+0x128/0x169
<4>[ 8422.283680]  [<ffffffff82935fc9>] i915_pm_resume+0x27/0x29
<4>[ 8422.283684]  [<ffffffff82884057>] pci_pm_resume+0xc7/0xf1
<4>[ 8422.283688]  [<ffffffff82690afd>] ? ktime_get+0x41/0x52
<4>[ 8422.283691]  [<ffffffff82883f90>] ? store_new_id+0x198/0x198
<4>[ 8422.283695]  [<ffffffff829e3327>] dpm_run_callback+0x4a/0xae
<4>[ 8422.283698]  [<ffffffff829e3891>] device_resume+0x184/0x1c6
<4>[ 8422.283701]  [<ffffffff829e38f1>] async_resume+0x1e/0x45
<4>[ 8422.283703]  [<ffffffff8265bab0>] async_run_entry_fn+0x38/0xcf
<4>[ 8422.283707]  [<ffffffff826546a6>] process_one_work+0x176/0x2d4
<4>[ 8422.283710]  [<ffffffff8265502b>] worker_thread+0x1ec/0x2bf
<4>[ 8422.283712]  [<ffffffff82654e3f>] ? rescuer_thread+0x2db/0x2db
<4>[ 8422.283715]  [<ffffffff826590f5>] kthread+0x10e/0x116
<4>[ 8422.283719]  [<ffffffff82658fe7>] ? __kthread_parkme+0x67/0x67
<4>[ 8422.283724]  [<ffffffff82c6afec>] ret_from_fork+0x7c/0xb0
<4>[ 8422.283726]  [<ffffffff82658fe7>] ? __kthread_parkme+0x67/0x67
<4>[ 8422.283754] Code: 35 ef 78 94 00 48 8d 93 90 9c 00 00 bf 04 00 00 00 e8 80 8e ca ff eb 3f 45 85 e4 75 2a 49 8b 86 a0 00 00 00 48 8b 80 a8 00 00 00 <48> 83 b8 e0 00 00 00 00 74 12 49 8b 8d 90 03 00 00 49 8b bd 88
<1>[ 8422.283757] RIP  [<ffffffff829aa84f>] intel_fbdev_set_suspend+0x9a/0xcf
<4>[ 8422.283758]  RSP <ffff880044277c98>
<4>[ 8422.283759] CR2: 00000000000000e0
<4>[ 8422.283761] ---[ end trace 156fbe4b36f3bbd5 ]---
<6>[ 8422.292602] iwlwifi 0000:02:00.0: L1 Enabled - LTR Disabled
<6>[ 8422.292924] iwlwifi 0000:02:00.0: L1 Enabled - LTR Disabled
<0>[ 8422.301939] Kernel panic - not syncing: Fatal exception
<0>[ 8422.301949] Kernel Offset: 0x1600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
<0>[ 8422.302116] gsmi: Log Shutdown Reason 0x02
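
For anyone skimming the oops above: the fields that matter for triage are the faulting address, the symbol+offset on the IP line, RAX, and the Kernel Offset. The bytes marked `<48> 83 b8 e0 00 00 00 00` on the Code: line decode to a compare of the quadword at RAX+0xe0 against zero, and RAX is 0, which matches CR2 = 0xe0. Below is a minimal Python sketch for pulling those fields out of a kcrash text. `summarize_oops` and its field names are made up for illustration and are not part of any Chrome OS tool.

```python
import re

# A few lines copied from the oops above; the helper below is illustrative only.
OOPS = """\
<1>[ 8422.280081] BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
<1>[ 8422.280089] IP: [<ffffffff829aa84f>] intel_fbdev_set_suspend+0x9a/0xcf
<4>[ 8422.283652] RAX: 0000000000000000 RBX: ffff880078840000 RCX: 000000000000cbca
<0>[ 8422.301949] Kernel Offset: 0x1600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
"""

FAULT_RE = re.compile(
    r"BUG: unable to handle kernel "
    r"(?:NULL pointer dereference|paging request) at ([0-9a-f]+)"
)
IP_RE = re.compile(r"IP: \[<([0-9a-f]+)>\] ([\w.]+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)")
RAX_RE = re.compile(r"RAX: ([0-9a-f]+)")
KASLR_RE = re.compile(r"Kernel Offset: (0x[0-9a-f]+)")


def summarize_oops(text):
    """Return the fault address, faulting symbol and unrelocated IP of an oops."""
    fault = FAULT_RE.search(text)
    ip = IP_RE.search(text)
    if fault is None or ip is None:
        raise ValueError("no recognizable oops header in text")
    rax = RAX_RE.search(text)
    kaslr = KASLR_RE.search(text)
    ip_addr = int(ip.group(1), 16)
    kaslr_offset = int(kaslr.group(1), 16) if kaslr else 0
    return {
        "fault_addr": int(fault.group(1), 16),
        "symbol": ip.group(2),
        "offset": int(ip.group(3), 16),
        "size": int(ip.group(4), 16),
        "rax": int(rax.group(1), 16) if rax else None,
        # Address as it would appear in the unrelocated kernel image.
        "unrelocated_ip": ip_addr - kaslr_offset,
    }


info = summarize_oops(OOPS)
print(info["symbol"], hex(info["fault_addr"]), hex(info["unrelocated_ip"]))
```

Subtracting the reported Kernel Offset from the IP should give the address to look up in the symbols for that build.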

Comment 12 by dtor@chromium.org, Nov 1 2016

Seeing quite a few reports from Celes devices from the field.
Just a quick FYI, I found stability in version 56. Been running fine for weeks now.
Labels: M-54
Status: Assigned (was: Unconfirmed)
The platform team is not able to reproduce the kernel crash on Edgar and Celes devices recovered with a USB stick in 8743.83.0 / 54.0.2840.93.
Labels: -ReleaseBlock-Stable
I couldn't tell you if it still happens, since I've moved on up and probably won't be using Beta and Stable for a while.
The same pattern is found in a partner sighting; it affects R54/55 in user mode through the recovery image.

Comment 19 by willg...@gmail.com, Nov 10 2016

Still happening: https://redd.it/5boqtr


Since the post-freon cleanup disabled the legacy FBDEV interfaces, perhaps we should evaluate the latest R55 again with respect to CL:411501; it landed today, while the ToT (current R56) patch landed on Oct. 28th.

Readers could refer to:
https://bugs.chromium.org/p/chromium/issues/detail?id=655820
https://chromium-review.googlesource.com/#/c/411501/

I built the latest Chrome OS base image and tested it in normal mode; it looks promising.

Labels: -M-54
Status: Fixed (was: Assigned)
Yes, we had a bunch of fbdev-related crashes, and fbdev is now disabled, so this should go away. I backported the fbdev disabling change to 55 as well for kernel 3.18. We're not changing 54 since that's too old.

I'll mark this fixed now, to let verification happen.

Comment 22 by son...@google.com, Nov 18 2016

Status: Verified (was: Fixed)
verified on build 8872.54.0
Did not notice any crashes.

Comment 23 by djeche@google.com, Dec 1 2016

Cc: djeche@chromium.org
Labels: Hotlist-Enterprise

Comment 24 Deleted

Comment 25 Deleted

Could you grant me access to issue 676281? Or attach the crash dump here in case there is a concern.
Issue 670263 has been merged into this issue.
Labels: M-58
Status: Assigned (was: Verified)
Crash still seen on M58 - Reopening the bug

Crashes to black screen and reboots

https://crash.corp.google.com/browse?stbtiq=c30a631660000000
Status: Fixed (was: Assigned)
the crash you're pointing at is in zram, so it's a different bug. Please open a separate bug.
Issue 780841 has been merged into this issue.

Comment 31 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)
Labels: -Pri-1 Pri-2
Status: Assigned (was: Archived)
Reopening since this appears to be the top cause of NULL pointer dereferences in the kernel for terra (also braswell - same as edgar).

===== feedback-TERRA-kernel-9901.77.0-20180129.txt =====
Total Kernel crash reports found:  16480
...
NULL_pointer : 2932
     89 mutex_lock+0x21/0x3f
    124 intel_chv_clip_cursor.isra.79+0x1c9/0x2b2
    160 intel_mmio_flip_work_func+0x25f/0x315
    298 [xpad]
    898 intel_fbdev_set_suspend+0x9a/0xcf
...

I should have looked at a newer release: the issue is still present but not as frequent:
===== feedback-TERRA-kernel-10032.86.0-20180129.txt =====
Total Kernel crash reports found:  5001
...
NULL_pointer : 556
     22 mutex_lock+0x21/0x3f
     34 intel_mmio_flip_work_func+0x25f/0x315
     43 intel_chv_clip_cursor.isra.79+0x1c9/0x2b2
     95 [xpad]
    163 intel_fbdev_set_suspend+0x9a/0xcf
...
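
To put the two summaries above side by side, here is a small Python sketch that turns the raw counts into percentages. The numbers are copied from the listings; the dictionary layout and the `share` helper are just for illustration. Report volume also depends on how many devices run each release, so treat the ratios as a rough indicator only.

```python
# Counts copied from the two feedback summaries above.
reports = {
    "9901.77.0": {"total": 16480, "null_pointer": 2932, "fbdev": 898},
    "10032.86.0": {"total": 5001, "null_pointer": 556, "fbdev": 163},
}


def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


for release, counts in reports.items():
    of_null = share(counts["fbdev"], counts["null_pointer"])
    of_total = share(counts["fbdev"], counts["total"])
    print(f"{release}: {of_null}% of NULL_pointer reports, {of_total}% of all reports")
```

By this measure intel_fbdev_set_suspend stays at roughly 30% of the NULL_pointer bucket in both releases, while its share of all kernel crash reports drops from about 5.4% to about 3.3%.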
Grant, do you have a link to a crash with fbdev? I don't see any.
Stephane, How many do you want?
It's easy for me to send you a tarball with dmesg output from the 163 I have for 10032.86.0 release.

Here are some pseudo-randomly selected report IDs you can use to grab stack traces from crash.corp:
0eeb3c223c67cfdb
1afda96fbf15473e
2923a61987b52d82
44747de320d5433f
5c1f35a15beb4724
65484809d68056c2
7417101539e09869
bbe74fc36f7bbde6
cc8687b9d40908f2
fc4d5dbc4b226fde

Use:
 ReportID='5c5da633a7c3f738'

as the query in crash.corp.


You might also be interested in two more which have a different symptom but similar stack trace:

d8c7e9f16fe1d5fb which reported:
<6>[ 6615.093577] call hdaudioC0D2+ returned 0 after 4106 usecs
<4>[ 6615.111637] general protection fault: 0000 [#1] PREEMPT SMP 
<0>[ 6615.115202] gsmi: Log Shutdown Reason 0x03
...
<4>[ 6615.115249] RIP: 0010:[<ffffffff889cefaf>]  [<ffffffff889cefaf>] intel_fbdev_set_suspend+0x9a/0xcf
...
<4>[ 6615.115269] Call Trace:
<4>[ 6615.115275]  [<ffffffff8895ab0c>] i915_drm_resume+0x128/0x169
<4>[ 6615.115278]  [<ffffffff8895ab74>] i915_pm_resume+0x27/0x29
<4>[ 6615.115281]  [<ffffffff888a8bd7>] pci_pm_resume+0xc7/0xf1
<4>[ 6615.115285]  [<ffffffff886b3d8d>] ? ktime_get+0x41/0x52
<4>[ 6615.115287]  [<ffffffff888a8b10>] ? store_new_id+0x198/0x198
<4>[ 6615.115290]  [<ffffffff88a07a84>] dpm_run_callback+0x4a/0xae
<4>[ 6615.115293]  [<ffffffff88a07fee>] device_resume+0x184/0x1c6
<4>[ 6615.115295]  [<ffffffff88a0804e>] async_resume+0x1e/0x45
<4>[ 6615.115298]  [<ffffffff8867ed40>] async_run_entry_fn+0x38/0xcf
<4>[ 6615.115301]  [<ffffffff88677936>] process_one_work+0x176/0x2d4
<4>[ 6615.115304]  [<ffffffff886782bb>] worker_thread+0x1ec/0x2bf
<4>[ 6615.115307]  [<ffffffff886780cf>] ? rescuer_thread+0x2db/0x2db
<4>[ 6615.115309]  [<ffffffff8867c385>] kthread+0x10e/0x116
<4>[ 6615.115312]  [<ffffffff8867c277>] ? __kthread_parkme+0x67/0x67
<4>[ 6615.115316]  [<ffffffff88c900ac>] ret_from_fork+0x7c/0xb0
<4>[ 6615.115319]  [<ffffffff8867c277>] ? __kthread_parkme+0x67/0x67


ea1e6144a7b20e0b which reported:
<1>[18383.485200] BUG: unable to handle kernel paging request at 00001000000000e0
<1>[18383.485208] IP: [<ffffffff9c3cefaf>] intel_fbdev_set_suspend+0x9a/0xcf
<4>[18383.485210] PGD 0 
<4>[18383.485213] Oops: 0000 [#1] PREEMPT SMP 
<0>[18383.488764] gsmi: Log Shutdown Reason 0x03

[ Note the extra "00001" in the faulting address ]
...
<4>[18383.488805] RIP: 0010:[<ffffffff9c3cefaf>]  [<ffffffff9c3cefaf>] intel_fbdev_set_suspend+0x9a/0xcf
...
<4>[18383.488824] Call Trace:
<4>[18383.488830]  [<ffffffff9c35ab0c>] i915_drm_resume+0x128/0x169
<4>[18383.488833]  [<ffffffff9c35ab74>] i915_pm_resume+0x27/0x29
<4>[18383.488836]  [<ffffffff9c2a8bd7>] pci_pm_resume+0xc7/0xf1
<4>[18383.488840]  [<ffffffff9c0b3d8d>] ? ktime_get+0x41/0x52
<4>[18383.488842]  [<ffffffff9c2a8b10>] ? store_new_id+0x198/0x198
<4>[18383.488845]  [<ffffffff9c407a84>] dpm_run_callback+0x4a/0xae
<4>[18383.488848]  [<ffffffff9c407fee>] device_resume+0x184/0x1c6
<4>[18383.488850]  [<ffffffff9c40804e>] async_resume+0x1e/0x45
<4>[18383.488853]  [<ffffffff9c07ed40>] async_run_entry_fn+0x38/0xcf
<4>[18383.488857]  [<ffffffff9c077936>] process_one_work+0x176/0x2d4
<4>[18383.488859]  [<ffffffff9c0782bb>] worker_thread+0x1ec/0x2bf
<4>[18383.488862]  [<ffffffff9c0780cf>] ? rescuer_thread+0x2db/0x2db
<4>[18383.488864]  [<ffffffff9c07c385>] kthread+0x10e/0x116
<4>[18383.488867]  [<ffffffff9c07c277>] ? __kthread_parkme+0x67/0x67
<4>[18383.488871]  [<ffffffff9c6900ac>] ret_from_fork+0x7c/0xb0
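
On the extra "00001" noted above: comparing that faulting address with the 0xe0 seen in the NULL pointer reports shows the two differ in exactly one bit. A quick Python check follows; the variable names are illustrative. This only describes the two addresses and does not by itself say where the stray bit came from.

```python
# Faulting addresses copied from the two kinds of report above.
null_fault = 0x00000000000000E0    # "NULL pointer dereference at ..."
paging_fault = 0x00001000000000E0  # "paging request at ..."

# XOR leaves a 1 in every bit position where the two addresses differ.
diff = null_fault ^ paging_fault
flipped_bits = [bit for bit in range(64) if (diff >> bit) & 1]

print(hex(diff), flipped_bits)
```

The low bits (the 0xe0 field offset) are identical in both reports; only bit 44 of the address differs.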


[ Hrm...digression... interesting that there are two different versions of BIOS reporting this:
fgrep -l intel_fbdev_set_suspend */*.kcrash | xargs fgrep -h "Hardware name:" | cut -d ":" -f 2 | sort | uniq -c 
     16  GOOGLE Terra, BIOS Google_Terra.7287.154.102 08/19/2017
   1176  GOOGLE Terra, BIOS Google_Terra.7287.154.56 10/30/2016
]
Status: Fixed (was: Assigned)

Crash 0eeb3c223c67cfdb:
[16368.644323] CPU: 0 PID: 16294 Comm: Chrome_ChildIOT Tainted: G        W      3.18.0-13101-g57e8190 #1


Crash 1afda96fbf15473e:
[ 2792.763314] CPU: 1 PID: 2518 Comm: kworker/u4:4 Tainted: G        W      3.18.0-13101-g57e8190 #1


Crash 2923a61987b52d82:
[10131.132309] CPU: 1 PID: 13912 Comm: kworker/u4:4 Tainted: G        W      3.18.0-13101-g57e8190 #1


Crash 44747de320d5433f:
[31120.973899] CPU: 1 PID: 5872 Comm: kworker/u4:1 Tainted: G        W      3.18.0-13101-g57e8190 #1


Crash 5c1f35a15beb4724:
(somehow can't open it)


Crash 65484809d68056c2:
[ 2117.321173] CPU: 0 PID: 25557 Comm: kworker/u4:3 Tainted: G        W      3.18.0-13101-g57e8190 #1


Crash 7417101539e09869:
[ 9732.577301] CPU: 1 PID: 14636 Comm: kworker/u4:1 Tainted: G        W      3.18.0-13101-g57e8190 #1


Crash bbe74fc36f7bbde6:
[19364.987231] CPU: 1 PID: 8345 Comm: kworker/u4:13 Tainted: G        W      3.18.0-13101-g57e8190 #1


Crash cc8687b9d40908f2:
[ 3957.137370] CPU: 1 PID: 429 Comm: kworker/u4:2 Tainted: G        W      3.18.0-13101-g57e8190 #1


Crash fc4d5dbc4b226fde:
[ 4178.830564] CPU: 0 PID: 29805 Comm: kworker/u4:6 Tainted: G        W      3.18.0-13101-g57e8190 #1


To answer your question "how many do I want", I want to see at least one with a kernel that's not antiquated. fbdev has been disabled for ages, and if your kernel is recent, it's impossible for these crashes to happen.

Right now I think the crash reports are corrupted, and I stand by my opinion that this bug was indeed fixed, so I am closing it.


Oh and the crux of the matter here is obviously that 57e8190, which is the kernel reporting those crashes, which I found in every single crash, is a commit from 2016-11-10.

I think you are chasing ghosts and I'm not interested in doing that.
Status: Verified (was: Fixed)
Yup - agreed. Sorry for the noise. I'm certainly not interested in chasing ghosts either.

Two kernel versions are listed in the crashes for 10032.86.0 query:
grundler <2092>fgrep -m 1 -R "Linux version" | sed 's/.*Linux version //' | cut -d " " -f 1 | sort | uniq -c | sort -n
    127 3.18.0-13101-g57e8190
   1654 3.18.0-16288-g64d05cf80004

And none of the crash reports with 64d05cf80004 identifier include any mention of intel_fbdev_set_suspend:

fgrep -Rl "Linux version 3.18.0-16288-g64d05cf80004" | xargs fgrep -l intel_fbdev | wc -l
0

To show the current full kernel version string (with time stamp):
NULL_pointer/upload_file_kcrash-edf64d23a1df65ae.kcrash:<5>[    0.000000] Linux version 3.18.0-16288-g64d05cf80004 (chrome-bot@cros-beefy264-c2) (gcc version 4.9.x 20150123 (prerelease) (4.9.2_cos_gg_4.9.2-r166-0c5a656a1322e137fa4a251f2ccc6c4022918c0a_4.9.2-r166) ) #1 SMP PREEMPT Mon Jan 8 23:22:23 PST 2018
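
For reference, the fgrep/sort/uniq pipelines above can be sketched in Python as below. The idea is to tally crash reports per kernel version and count how many of each mention a given symbol. The report contents here are shortened, made-up stand-ins (the real input is the directory of .kcrash files), and `kernel_version`, `tally` and `commit_of` are hypothetical helper names.

```python
import re
from collections import Counter

# Made-up, shortened stand-ins for .kcrash file contents (file name -> text).
SAMPLE_REPORTS = {
    "a.kcrash": "Linux version 3.18.0-13101-g57e8190 (...)\n"
                "IP: [<...>] intel_fbdev_set_suspend+0x9a/0xcf\n",
    "b.kcrash": "Linux version 3.18.0-16288-g64d05cf80004 (...)\n"
                "IP: [<...>] mutex_lock+0x21/0x3f\n",
    "c.kcrash": "Linux version 3.18.0-16288-g64d05cf80004 (...)\n"
                "IP: [<...>] intel_mmio_flip_work_func+0x25f/0x315\n",
}

VERSION_RE = re.compile(r"Linux version (\S+)")


def kernel_version(text):
    """Return the version token after 'Linux version', or None if absent."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None


def tally(reports, symbol):
    """Count reports per kernel version, and those that mention `symbol`."""
    totals, with_symbol = Counter(), Counter()
    for text in reports.values():
        version = kernel_version(text)
        if version is None:
            continue
        totals[version] += 1
        if symbol in text:
            with_symbol[version] += 1
    return totals, with_symbol


def commit_of(version):
    """Extract the abbreviated commit hash from a '-g<hash>' version suffix."""
    match = re.search(r"-g([0-9a-f]+)$", version)
    return match.group(1) if match else None


totals, with_symbol = tally(SAMPLE_REPORTS, "intel_fbdev_set_suspend")
for version, count in sorted(totals.items()):
    print(count, with_symbol[version], version, commit_of(version))
```

Grouping by the `-g<hash>` suffix makes it easy to see whether a given symbol only shows up in reports from one kernel build, which is the point made above about 57e8190.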
