
Issue 672714

Starred by 1 user

Issue metadata

Status: Duplicate
Merged: issue 673349
Owner: ----
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug




elm: Kernel panic stack-protector in futex_wait_queue_me

Project Member Reported by djkurtz@chromium.org, Dec 9 2016

Issue description

Chrome Version: 55.0.2883.82
Chrome OS Version: 8872.67.0
Chrome OS Platform: Elm

Feedback: https://feedback.corp.google.com/#/Report/50024054363

Description:
My Chromebook restarted twice in a row. Each time it ran for a few minutes (about 5) and then rebooted to the login screen. uptime shows that it was a reboot, not a logout.

Crash: https://crash.corp.google.com/browse?q=ReportID=39ab49bf00000000

Event Log shows a boot:

101 | 2016-12-09 00:04:19 | System boot | 0
102 | 2016-12-09 00:04:19 | Chrome OS Developer Mode

kcrash / console-ramoops shows the backtrace:

<4>[  193.765453] audit_printk_skb: 27 callbacks suppressed
<5>[  193.765463] audit: type=1400 audit(1481260061.952:1670): avc:  denied  { ioctl } for  pid=3502 comm="netfilter-queue" path="socket:[75971]" dev="sockfs" ino=75971 ioctlcmd=8910 scontext=u:r:chromeos:s0 tcontext=u:r:chromeos:s0 tclass=unix_dgram_socket permissive=1
<5>[  193.765765] audit: type=1400 audit(1481260061.952:1671): avc:  denied  { ioctl } for  pid=3502 comm="netfilter-queue" path="socket:[75973]" dev="sockfs" ino=75973 ioctlcmd=8910 scontext=u:r:chromeos:s0 tcontext=u:r:chromeos:s0 tclass=unix_dgram_socket permissive=1
<5>[  193.772462] audit: type=1400 audit(1481260061.956:1672): avc:  denied  { ioctl } for  pid=3502 comm="netfilter-queue" path="socket:[75975]" dev="sockfs" ino=75975 ioctlcmd=8910 scontext=u:r:chromeos:s0 tcontext=u:r:chromeos:s0 tclass=unix_dgram_socket permissive=1
<7>[ 239.875483] SELinux: initialized (dev proc, type proc), uses genfs_contexts
<7>[ 241.455734] SELinux: initialized (dev proc, type proc), uses genfs_contexts
<7>[ 241.502731] SELinux: initialized (dev proc, type proc), uses genfs_contexts
<7>[ 243.014002] SELinux: initialized (dev proc, type proc), uses genfs_contexts
<5>[ 247.340477] audit: type=1400 audit(1481260115.524:1673): avc: denied { rmdir } for pid=1391 comm="BrowserBlocking" name="index-dir" dev="ecryptfs" ino=262919 scontext=u:r:chromeos:s0 tcontext=u:object_r:unlabeled:s0 tclass=dir permissive=1
<0>[ 248.229596] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc000292c7c
<0>[ 248.229596]
<4>[ 248.229611] CPU: 2 PID: 16950 Comm: Compositor Not tainted 3.18.0-13434-gc3678fb #1
<4>[ 248.229615] Hardware name: Mediatek Elm rev3 board (DT)
<0>[ 248.229619] Call trace:
<4>[ 248.229633] [<ffffffc000208fe0>] dump_backtrace+0x0/0x160
<4>[ 248.229639] [<ffffffc00020915c>] show_stack+0x1c/0x28
<4>[ 248.229645] [<ffffffc0008dc32c>] dump_stack+0x80/0xc4
<4>[ 248.229649] [<ffffffc0008daf90>] panic+0xfc/0x240
<4>[ 248.229655] [<ffffffc000222888>] __stack_chk_fail+0x20/0x24
<4>[ 248.229665] [<ffffffc000292c78>] futex_wait_queue_me+0x178/0x190
<4>[ 248.229670] [<ffffffc00029343c>] futex_wait+0x100/0x238
<4>[ 248.229674] [<ffffffc000294df4>] do_futex+0xec/0x894
<4>[ 248.229678] [<ffffffc000295b6c>] compat_SyS_futex+0xe0/0x168

Event Log shows the reboot was about 4 minutes later, which matches kcrash uptime timestamps:
103 | 2016-12-09 00:08:37 | System boot | 0
104 | 2016-12-09 00:08:37 | Chrome OS Developer Mode
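
The timing comparison above can be reproduced with a few lines of Python. This is only a sketch: the timestamps are copied from event log entries 101 and 103 and the uptime from the panic line in console-ramoops; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Boot times copied from event log entries 101 and 103.
first_boot = datetime.strptime("2016-12-09 00:04:19", FMT)
second_boot = datetime.strptime("2016-12-09 00:08:37", FMT)

# Kernel uptime in seconds, copied from the "Kernel panic" line.
panic_uptime_s = 248.229596

between_boots_s = (second_boot - first_boot).total_seconds()
gap_s = between_boots_s - panic_uptime_s

print(f"time between boots: {between_boots_s:.0f} s")
print(f"panic uptime:       {panic_uptime_s:.1f} s")
print(f"difference:         {gap_s:.1f} s")
```

The two boots are 258 s apart and the panic happened at about 248 s of kernel uptime. The remaining difference of roughly 10 s is plausibly time spent outside the kernel clock (firmware, reboot), although the report does not break that down.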

Attachment: upload_file_kcrash-39ab49bf00000000.kcrash (135 KB)

It's not that rare, I see a few more on 8872.67.0:
https://feedback.corp.google.com/#/Report/bf3fc96300000000
https://feedback.corp.google.com/#/Report/fb4e916300000000
https://feedback.corp.google.com/#/Report/5f425b3f00000000

https://feedback.corp.google.com/#/Report/f4ebfd3f00000000 is also a stack-protector but with a different backtrace:

<0>[ 776.123448] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc0007584c0
<0>[ 776.123448]
<4>[ 776.123469] CPU: 1 PID: 8927 Comm: Binder_E Not tainted 3.18.0-13434-gc3678fb #1
<4>[ 776.123475] Hardware name: Mediatek Elm rev3 board (DT)
<0>[ 776.123480] Call trace:
<4>[ 776.123494] [<ffffffc000208fe0>] dump_backtrace+0x0/0x160
<4>[ 776.123501] [<ffffffc00020915c>] show_stack+0x1c/0x28
<4>[ 776.123510] [<ffffffc0008dc32c>] dump_stack+0x80/0xc4
<4>[ 776.123517] [<ffffffc0008daf90>] panic+0xfc/0x240
<4>[ 776.123526] [<ffffffc000222888>] __stack_chk_fail+0x20/0x24
<4>[ 776.123536] [<ffffffc0007584bc>] binder_ioctl_write_read+0x2e8/0x328
<4>[ 776.123543] [<ffffffc0007587b4>] binder_ioctl+0x2b8/0x6d8
<4>[ 776.123553] [<ffffffc00038b014>] compat_SyS_ioctl+0x130/0x14d0
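
A small check on how to read these panics: in both traces so far, the address printed after "Kernel stack is corrupted in:" sits 4 bytes past the frame listed directly below __stack_chk_fail. That would fit the panic line printing a return address while the backtrace prints the call site, but this is an assumption about this 3.18 arm64 kernel, not something stated in the report. The addresses below are copied from the two traces; the variable names are illustrative.

```python
# (panic address, address of the frame after __stack_chk_fail, function name)
samples = [
    ("ffffffc000292c7c", "ffffffc000292c78", "futex_wait_queue_me"),
    ("ffffffc0007584c0", "ffffffc0007584bc", "binder_ioctl_write_read"),
]

deltas = []
for panic_addr, frame_addr, func in samples:
    # Both values are hexadecimal kernel virtual addresses.
    delta = int(panic_addr, 16) - int(frame_addr, 16)
    deltas.append(delta)
    print(f"{func}: panic address = frame address + {delta}")
```

Under that reading, the function named in the frame after __stack_chk_fail is the one whose stack canary check failed.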

Cc: cernekee@chromium.org
I see 14 stack_protector crashes in total on R55 Elm (8872.*.0).

Besides the 5 reports above, the other 9 are:

futex:
1d68fea300000000 / 8872.65.0
5192c64f00000000 / 8872.65.0
dcc1853f00000000 / 8872.65.0
808e63a300000000 / 8872.65.0

And a few other unique backtraces:

ccf8c17700000000 / 8872.54.0

<0>[ 5793.678689] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc0007b772c
<0>[ 5793.678689]
<4>[ 5793.678708] CPU: 2 PID: 1347 Comm: AudioThread Not tainted 3.18.0-13409-g77f7066 #1
<4>[ 5793.678712] Hardware name: Mediatek Elm rev3 board (DT)
<0>[ 5793.678717] Call trace:
<4>[ 5793.678733] [<ffffffc000208fe0>] dump_backtrace+0x0/0x160
<4>[ 5793.678739] [<ffffffc00020915c>] show_stack+0x1c/0x28
<4>[ 5793.678745] [<ffffffc0008dd328>] dump_stack+0x80/0xc4
<4>[ 5793.678750] [<ffffffc0008dbf8c>] panic+0xfc/0x240
<4>[ 5793.678756] [<ffffffc000222888>] __stack_chk_fail+0x20/0x24
<4>[ 5793.678763] [<ffffffc0007b7728>] compat_sock_ioctl+0xd10/0xd30
<4>[ 5793.678769] [<ffffffc00038c370>] compat_SyS_ioctl+0x147c/0x14d0

e99072df00000000 / 8872.65.0

<6>[ 417.745243] mtk-afe-pcm 11220000.audio-controller: mtk_afe_dais_trigger DL1 cmd=0
<7>[ 421.008373] SELinux: initialized (dev proc, type proc), uses genfs_contexts
<0>[ 429.432843] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc000528ccc
<0>[ 429.432843]
<4>[ 429.432858] CPU: 2 PID: 9562 Comm: DrmThread Not tainted 3.18.0-13434-gc3678fb #1
<4>[ 429.432863] Hardware name: Mediatek Elm rev3 board (DT)
<0>[ 429.432868] Call trace:
<4>[ 429.432881] [<ffffffc000208fe0>] dump_backtrace+0x0/0x160
<4>[ 429.432887] [<ffffffc00020915c>] show_stack+0x1c/0x28
<4>[ 429.432893] [<ffffffc0008dc32c>] dump_stack+0x80/0xc4
<4>[ 429.432898] [<ffffffc0008daf90>] panic+0xfc/0x240
<4>[ 429.432903] [<ffffffc000222888>] __stack_chk_fail+0x20/0x24
<4>[ 429.432909] [<ffffffc000528cc8>] drm_ioctl+0x28c/0x444
<4>[ 429.432914] [<ffffffc0005453b0>] drm_compat_ioctl+0x38/0x70


3682723f00000000 / 8872.65.0
<0>[ 1380.339070] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc0008dd580
<0>[ 1380.339070]
<4>[ 1380.339085] CPU: 3 PID: 13931 Comm: chrome Not tainted 3.18.0-13434-gc3678fb #1
<4>[ 1380.339090] Hardware name: Mediatek Elm rev3 board (DT)
<0>[ 1380.339094] Call trace:
<4>[ 1380.339108] [<ffffffc000208fe0>] dump_backtrace+0x0/0x160
<4>[ 1380.339114] [<ffffffc00020915c>] show_stack+0x1c/0x28
<4>[ 1380.339121] [<ffffffc0008dc32c>] dump_stack+0x80/0xc4
<4>[ 1380.339126] [<ffffffc0008daf90>] panic+0xfc/0x240
<4>[ 1380.339132] [<ffffffc000222888>] __stack_chk_fail+0x20/0x24
<4>[ 1380.339136] [<ffffffc0008dd57c>] __schedule+0x6e4/0x704
<4>[ 1380.339141] [<ffffffc0008dd610>] schedule+0x74/0x80
<4>[ 1380.339145] [<ffffffc0008e075c>] do_nanosleep+0xac/0x15c
<4>[ 1380.339152] [<ffffffc00028333c>] hrtimer_nanosleep+0xac/0x13c
<4>[ 1380.339160] [<ffffffc00029ea28>] compat_SyS_nanosleep+0x90/0xfc

83796fa300000000 / 8872.65.0
<0>[14925.304819] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc0008dd580
<0>[14925.304819]
<4>[14925.304834] CPU: 3 PID: 7 Comm: rcu_preempt Not tainted 3.18.0-13434-gc3678fb #1
<4>[14925.304839] Hardware name: Mediatek Elm rev3 board (DT)
<0>[14925.304843] Call trace:
<4>[14925.304857] [<ffffffc000208fe0>] dump_backtrace+0x0/0x160
<4>[14925.304862] [<ffffffc00020915c>] show_stack+0x1c/0x28
<4>[14925.304869] [<ffffffc0008dc32c>] dump_stack+0x80/0xc4
<4>[14925.304873] [<ffffffc0008daf90>] panic+0xfc/0x240
<4>[14925.304880] [<ffffffc000222888>] __stack_chk_fail+0x20/0x24
<4>[14925.304885] [<ffffffc0008dd57c>] __schedule+0x6e4/0x704
<4>[14925.304889] [<ffffffc0008dd610>] schedule+0x74/0x80
<4>[14925.304894] [<ffffffc0008e0598>] schedule_timeout+0x210/0x268
<4>[14925.304899] [<ffffffc00027a2c4>] rcu_gp_kthread+0x378/0x5bc
<4>[14925.304905] [<ffffffc000240d08>] kthread+0xf4/0x100

c4207fdf00000000 / 8872.65.0
<0>[ 32.929524] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc00034bc5c
<0>[ 32.929524]
<4>[ 32.929538] CPU: 3 PID: 7949 Comm: sdcard Not tainted 3.18.0-13434-gc3678fb #1
<4>[ 32.929542] Hardware name: Mediatek Elm rev3 board (DT)
<0>[ 32.929546] Call trace:
<4>[ 32.929560] [<ffffffc000208fe0>] dump_backtrace+0x0/0x160
<4>[ 32.929565] [<ffffffc00020915c>] show_stack+0x1c/0x28
<4>[ 32.929571] [<ffffffc0008dc32c>] dump_stack+0x80/0xc4
<4>[ 32.929576] [<ffffffc0008daf90>] panic+0xfc/0x240
<4>[ 32.929581] [<ffffffc000222888>] __stack_chk_fail+0x20/0x24
<4>[ 32.929587] [<ffffffc00034bc58>] user_path_at_empty+0xac/0xc4
<4>[ 32.929592] [<ffffffc00034bca8>] user_path_at+0x38/0x48


Labels: Proj-Containers
Prior to 8872.* there are only 2 stack-protector crashes on Elm, which strongly suggests this is related to enabling the ARC++ container:

https://feedback.corp.google.com/#/Report/1955db4f00000000 / 8743.85.0

For this one, the system was already not doing well:

<7>[ 320.830979] Valid eCryptfs headers not found in file header region or xattr region, inode 1311054
<7>[ 320.833148] Valid eCryptfs headers not found in file header region or xattr region, inode 1312171
...
<4>[ 322.217777] __do_user_fault: 15 callbacks suppressed
<6>[ 322.217789] chrome[14074]: unhandled level 2 translation fault (11) at 0xf8000060, esr 0x92000006
<1>[ 322.217794] pgd = ffffffc089486000
<1>[ 322.217801] [f8000060] *pgd=00000000c9487003, *pud=00000000c9487003, *pmd=0000000000000000
<4>[ 322.217810]
<4>[ 322.217816] CPU: 2 PID: 14074 Comm: chrome Not tainted 3.18.0-13101-g57e8190 #1
<4>[ 322.217820] Hardware name: Mediatek Elm rev3 board (DT)
<4>[ 322.217824] task: ffffffc0b69924c0 ti: ffffffc0b58d4000 task.ti: ffffffc0b58d4000
<4>[ 322.217840] PC is at 0xf2862538
<4>[ 322.217844] LR is at 0xf2866f71
<4>[ 322.217848] pc : [<00000000f2862538>] lr : [<00000000f2866f71>] pstate: 200f0030
<4>[ 322.217852] sp : 00000000ffb99b44
<4>[ 322.217855] x12: 0000000000000002
<4>[ 322.217860] x11: 00000000ffb99bec x10: 00000000f870c098
<4>[ 322.217867] x9 : 00000000f8807ad0 x8 : 0000000000000080
<4>[ 322.217873] x7 : 00000000ffb99b48 x6 : 0000000000000000
<4>[ 322.217880] x5 : 00000000f91d956c x4 : 00000000f91d9398
<4>[ 322.217886] x3 : 000000007beddf00 x2 : 0000000000000001
<4>[ 322.217892] x1 : 0000000000000001 x0 : 00000000f8000058
...
<7>[ 340.473336] Valid eCryptfs headers not found in file header region or xattr region, inode 1313144
<7>[ 340.475664] Valid eCryptfs headers not found in file header region or xattr region, inode 1312977
...
<6>[ 727.940421] chrome[1753]: unhandled level 1 translation fault (11) at 0x6d61710e, esr 0x92000005
<1>[ 727.940460] pgd = ffffffc0b5ac2000
<1>[ 727.940469] [6d61710e] *pgd=0000000000000000, *pud=0000000000000000
<4>[ 727.940514]
<4>[ 727.940527] CPU: 1 PID: 1753 Comm: chrome Not tainted 3.18.0-13101-g57e8190 #1
<4>[ 727.940535] Hardware name: Mediatek Elm rev3 board (DT)
<4>[ 727.940569] task: ffffffc0fa973100 ti: ffffffc0b1b64000 task.ti: ffffffc0b1b64000
<4>[ 727.940595] PC is at 0xf0d66bba
<4>[ 727.940603] LR is at 0xf0d6814b
<4>[ 727.940611] pc : [<00000000f0d66bba>] lr : [<00000000f0d6814b>] pstate: a0000030
<4>[ 727.940618] sp : 00000000ffa2ea38
<4>[ 727.940640] x12: 00000000f0b6df1c
<4>[ 727.940650] x11: 00000000f7c4c66c x10: 00000000f7c4c644
<4>[ 727.940756] x9 : 0000000000000001 x8 : 0000000000000001
<4>[ 727.940772] x7 : 00000000ffa2ea38 x6 : 00000000f7c4c648
<4>[ 727.940809] x5 : 00000000f7d5e000 x4 : 000000006d616e22
<4>[ 727.940821] x3 : 00000000f80ad400 x2 : 0000000000000001
<4>[ 727.940856] x1 : 000000006d616e22 x0 : 00000000f7d5e000
<4>[ 727.940867]
<6>[ 728.234854] [MTK_V4L2] level=0 fops_vcodec_open(),158: decoder capability 6ca20004
<6>[ 728.234864] [MTK_V4L2] level=0 fops_vcodec_open(),166: 16000000.vcodec decoder [7]
<6>[ 728.234985] [MTK_V4L2] level=0 fops_vcodec_release(),188: [7] decoder
<6>[ 728.235084] [MTK_V4L2] level=0 fops_vcodec_open(),158: decoder capability 6ca20004
<6>[ 728.235090] [MTK_V4L2] level=0 fops_vcodec_open(),166: 16000000.vcodec decoder [8]
<6>[ 728.235133] [MTK_V4L2] level=0 fops_vcodec_release(),188: [8] decoder
<6>[ 728.235218] [MTK_V4L2] level=0 fops_vcodec_open(),185: encoder capability 10000000
<6>[ 728.235224] [MTK_V4L2] level=0 fops_vcodec_open(),196: 18002000.vcodec encoder [5]
<6>[ 728.973115] [drm] PS8640 PAGE1.0x6B = 0x4
<0>[ 733.022006] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc0008df018
<0>[ 733.022006]
<4>[ 733.022028] CPU: 2 PID: 14819 Comm: kworker/2:0 Not tainted 3.18.0-13101-g57e8190 #1
<4>[ 733.022037] Hardware name: Mediatek Elm rev3 board (DT)
<0>[ 733.022049] Call trace:
<4>[ 733.022069] [<ffffffc000208fe0>] dump_backtrace+0x0/0x160
<4>[ 733.022080] [<ffffffc00020915c>] show_stack+0x1c/0x28
<4>[ 733.022091] [<ffffffc0008dddc4>] dump_stack+0x80/0xc4
<4>[ 733.022100] [<ffffffc0008dca28>] panic+0xfc/0x240
<4>[ 733.022111] [<ffffffc000222868>] __stack_chk_fail+0x20/0x24
<4>[ 733.022119] [<ffffffc0008df014>] __schedule+0x6e4/0x704
<4>[ 733.022128] [<ffffffc0008df0a8>] schedule+0x74/0x80
<4>[ 733.022139] [<ffffffc00023bd38>] worker_thread+0x420/0x450
<4>[ 733.022148] [<ffffffc000240ce8>] kthread+0xf4/0x100
<2>[ 733.022159] CPU3: stopping


425f2c1f00000000 / 8350.77.0

This one died very soon after boot, much earlier than the others.

<6>[ 10.073099] Bluetooth: FW download over, size 802164 bytes
<3>[ 10.383273] sdio platform data not available
<3>[ 10.400129] mwifiex_sdio: sdio platform data not available<5>[ 10.402317] mwifiex: rx work enabled, cpus 4
<5>[ 10.518542] mwifiex_sdio mmc2:0001:1: WLAN is not the winner! Skip FW dnld
<6>[ 10.830590] mwifiex_sdio mmc2:0001:1: WLAN FW is active
<6>[ 10.898682] Bluetooth: RFCOMM socket layer initialized
<6>[ 10.898716] Bluetooth: RFCOMM ver 1.11
<6>[ 10.941276] mwifiex_sdio mmc2:0001:1: info: MWIFIEX VERSION: mwifiex 1.0 (15.68.7.p77)
<6>[ 10.941287] mwifiex_sdio mmc2:0001:1: driver_version = mwifiex 1.0 (15.68.7.p77)
<6>[ 10.981146] IPv6: ADDRCONF(NETDEV_UP): mlan0: link is not ready
<6>[ 15.180290] mwifiex_sdio mmc2:0001:1: info: trying to associate	to 'ZyXEL' bssid 00:00:00:00:00:01
<6>[ 15.210564] mwifiex_sdio mmc2:0001:1: info: associated to bssid 00:00:00:00:00:01 successfully
<6>[ 15.211158] IPv6: ADDRCONF(NETDEV_CHANGE): mlan0: link becomes ready
<1>[ 25.480122] Unhandled debug exception: aarch32 BKPT (0xe0000070) at 0x00000000416c7c00
<1>[ 25.480140] Unhandled debug exception: aarch32 BKPT (0xe000007f) at 0x00000000416c7c00
<0>[ 25.993523] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffc0008b28fc
<0>[ 25.993523]
<4>[ 25.993537] CPU: 2 PID: 117 Comm: mmcqd/0 Not tainted 3.18.0-12323-g72862e8 #1
<4>[ 25.993542] Hardware name: Mediatek Elm board (DT)
<0>[ 25.993547] Call trace:
<4>[ 25.993561] [<ffffffc000209030>] dump_backtrace+0x0/0x160
<4>[ 25.993568] [<ffffffc0002091ac>] show_stack+0x1c/0x28
<4>[ 25.993574] [<ffffffc0008b1294>] dump_stack+0x80/0xc4
<4>[ 25.993579] [<ffffffc0008af8bc>] panic+0x100/0x248
<4>[ 25.993586] [<ffffffc0002218cc>] __stack_chk_fail+0x20/0x24
<4>[ 25.993591] [<ffffffc0008b28f8>] __schedule+0x830/0x850
<4>[ 25.993595] [<ffffffc0008b298c>] schedule+0x74/0x80


Labels: -Proj-Containers
As for trends:

On 8743.* (M54 / stable channel) stack_protector crashes occur mostly on {Parrot, Snow, Blaze and Butterfly}, none of which has ARC++ enabled.

On 8872.* (M55 / beta channel) stack_protector crashes occur mostly on Elm and Snow.
  Elm: probably a chromeos-3.18 / platform-specific kernel bug that is triggered by the sizable population of ARC++ users on the beta channel.
  Snow: probably because "stack_protector" is a generic kernel crash bucket and there are simply so many Snow devices.

Mergedinto: 673349
Status: Duplicate (was: Available)
These stack-protector crashes are another manifestation of the root cause of issue 673349.
