Kernel hung in ext4 code during cros deploy |
||||||
Issue descriptionI was testing a self-built ToT (R58) on Kevin, emerging some new packages with 'cros deploy' when I hit a hung task warning -> panic. console-ramoops attached. [ 1800.116621] INFO: task ebuild.sh:5569 blocked for more than 120 seconds. [ 1800.123420] Not tainted 4.4.35 #893 [ 1800.127553] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1800.135437] ebuild.sh D ffffffc000204fd8 0 5569 5475 0x00400000 [ 1800.142693] Call trace: [ 1800.145262] [<ffffffc000204fd8>] __switch_to+0x9c/0xa8 [ 1800.150507] [<ffffffc0009215bc>] __schedule+0x478/0x748 [ 1800.155801] [<ffffffc000921920>] schedule+0x94/0xb4 [ 1800.160720] [<ffffffc000924a98>] schedule_timeout+0x44/0x27c [ 1800.166446] [<ffffffc000921104>] io_schedule_timeout+0x7c/0xbc [ 1800.172353] [<ffffffc000922278>] bit_wait_io+0x1c/0x6c [ 1800.177537] [<ffffffc000921e10>] __wait_on_bit+0x74/0xc0 [ 1800.182887] [<ffffffc000302720>] wait_on_page_bit+0xa4/0xc8 [ 1800.188577] [<ffffffc00031294c>] truncate_inode_pages_range+0x504/0x570 [ 1800.195396] [<ffffffc0003129e4>] truncate_inode_pages+0x2c/0x38 [ 1800.201347] [<ffffffc000312aa4>] truncate_pagecache+0x50/0x74 [ 1800.207108] [<ffffffc0003d7630>] ext4_setattr+0x448/0x694 [ 1800.212532] [<ffffffc00036fddc>] notify_change+0x20c/0x328 [ 1800.218029] [<ffffffc000352050>] do_truncate+0xac/0xe8 [ 1800.223167] [<ffffffc000362a08>] path_openat+0xbb4/0xd9c [ 1800.228492] [<ffffffc000363c74>] do_filp_open+0x58/0xbc [ 1800.233725] [<ffffffc0003532d8>] do_sys_open+0x184/0x2d8 [ 1800.239052] [<ffffffc0003a493c>] compat_SyS_open+0x38/0x48 [ 1800.244536] [<ffffffc000203e34>] el0_svc_naked+0x24/0x28 I was booted from an SD card image, FWIW. But from the looks of this, it's either generic VFS or EXT4 code that caused the lockup.
,
Sep 8 2017
,
Sep 8 2017
,
Sep 8 2017
Booted Kevin from sdcard using build 9901.7.0, 62.0.3202.9 dev. Did not notice any device hang. Attached device logs for reference.
,
Sep 8 2017
So this is an old bug and I'll point out that I use SD Card to boot up my kevin 100% of the time when I install a new kernel. I very, very rarely ever see any errors like the above. ...but somehow we do have some SD Card problems. Specifically: -- 1. There's a feedback report <https://listnr.corp.google.com/report/72651271882> that looks like someone is seeing a hang in accessing their SD card. The traceback from there looks like: [ 360.121372] INFO: task mount.exfat-fus:3707 blocked for more than 120 seconds. [ 360.121416] Not tainted 4.4.70-11295-gf64189f58926 #1 [ 360.121438] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 360.121461] mount.exfat-fus D ffffffc000204ff0 0 3707 2024 0x00400000 [ 360.121504] Call trace: [ 360.121545] [<ffffffc000204ff0>] __switch_to+0x9c/0xa8 [ 360.121578] [<ffffffc00093ab88>] __schedule+0x504/0x740 [ 360.121612] [<ffffffc00093ae4c>] schedule+0x88/0xa8 [ 360.121651] [<ffffffc00093eaa0>] schedule_timeout+0x44/0x27c [ 360.121683] [<ffffffc00093a644>] io_schedule_timeout+0x7c/0xbc [ 360.121718] [<ffffffc00093b7a4>] bit_wait_io+0x1c/0x6c [ 360.121750] [<ffffffc00093b33c>] __wait_on_bit+0x74/0xc0 [ 360.121779] [<ffffffc000302d18>] wait_on_page_bit+0xa4/0xc8 [ 360.121810] [<ffffffc000302e78>] __filemap_fdatawait_range+0xdc/0x158 [ 360.121841] [<ffffffc000302f24>] filemap_fdatawait_range+0x30/0x50 [ 360.121970] [<ffffffc000304d04>] filemap_write_and_wait_range+0x5c/0x88 [ 360.121999] [<ffffffc00039465c>] blkdev_fsync+0x34/0x60 [ 360.122057] [<ffffffc00038d104>] vfs_fsync_range+0xa4/0xb4 [ 360.122115] [<ffffffc00038d148>] vfs_fsync+0x34/0x40 [ 360.122165] [<ffffffc00038d418>] SyS_fsync+0x40/0x64 [ 360.122223] [<ffffffc000203e34>] el0_svc_naked+0x24/0x28 -- 2. We've seen lots of feedback reports and crash reports with SD card problems. Most are during tuning. That's captured by <https://buganizer.corp.google.com/issues/35580203>. ...in fact, if you look at crash reports for kevin a fairly significant percentage of them appear to be crashing during SD card tuning. It's unclear that the problems that we see at tuning time and the problems here are the same, but it's plausible.
,
Sep 9 2017
+sontis found this I have noticed "mount_exfat_fuse" process crash on 9765.53.0 Steps to repo this crash: 1. Start copying big file (~2GB) from device to Micro SD card. 2. Unplug Micro SD card in the middle of the copy operation. Result: "mount_exfat_fuse" process crash crash link: https://crash.corp.google.com/browse?q=reportid=%27d7db2b65bbb9bf1a%27 0xf193ab86 (libc-2.23.so + 0x00016b86 ) __libc_do_syscall 0xf1949de3 (libc-2.23.so -raise.c:54 ) raise 0xf194ad55 (libc-2.23.so -abort.c:89 ) abort 0xab76c605 (mount.exfat-fuse -log.c:50 ) exfat_bug 0xab76dcad (mount.exfat-fuse -node.c:601 ) reset_cache 0xab76dbe5 (mount.exfat-fuse -node.c:587 ) reset_cache 0xab76d56b (mount.exfat-fuse -mount.c:365 ) exfat_unmount 0xf1a34ed5 (libfuse.so.2.8.6 -fuse.c:1962 ) fuse_fs_destroy 0xf1a36321 (libfuse.so.2.8.6 -fuse.c:1975 ) fuse_lib_destroy 0xf1a40b6b (libfuse.so.2.8.6 -fuse_lowlevel.c:1260 ) do_destroy 0xf1a3eb5b (libfuse.so.2.8.6 -fuse_lowlevel.c:1514 ) fuse_ll_process 0xf1a3ce45 (libfuse.so.2.8.6 -fuse_loop.c:33 ) fuse_session_loop 0xab76a5eb (mount.exfat-fuse -main.c:627 ) main 0xf193a8b7 (libc-2.23.so -libc-start.c:289 ) __libc_start_main 0xab76a1e3 (mount.exfat-fuse + 0x000011e3 ) _start 0xab77126b (mount.exfat-fuse -elf-init.c:87 ) __libc_csu_init 0xf1a7af3f (ld-2.23.so + 0x0000bf3f ) _dl_sort_fini
,
Oct 16 2017
Can you please try to reproduce on 9901.46.0? Thanks! :) It's plausible this was fixed by the same thing that fixed b/35580203
,
Jun 15 2018
Did not notice this issue on ToT. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by diand...@chromium.org
, Feb 1 2017210 KB
210 KB View Download