New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 855151 link

Starred by 9 users

Issue metadata

Status: Duplicate
Owner:
Closed: Jul 7
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocked on:
issue 851138
issue 855145
issue 860017



Sign in to add a comment

CleanupChroot hanging.

Project Member Reported by dgarr...@chromium.org, Jun 21 2018

Issue description

I'm seeing some examples of the cbuildbot_launch stage CleanupChroot hanging or failing from time to time.

An example:

https://ci.chromium.org/p/chromeos/builders/luci.chromeos.general/Prod/b8943096145619074112

This is causing otherwise good builds to be reported as failures.
 
Cc: llozano@chromium.org manojgupta@chromium.org
Labels: OS-Chrome
There are plenty of reds that are most likely caused by this issue in https://cros-goldeneye.corp.google.com/chromeos/legoland/builderHistory?buildConfig=chromiumos-sdk&buildBranch=master
Summary: CleanupChroot hanging. (was: CleaningUpChroot hanging.)
Owner: athilenius@chromium.org
Status: Assigned (was: Untriaged)
Cc: bmgordon@chromium.org
Owner: dgarr...@chromium.org
I'll be tomorrow's Bobby, so.... will probably need to start working this.
Status: Started (was: Assigned)
These are kernel panics.

From the builder than ran this build:

https://ci.chromium.org/p/chromeos/builders/luci.chromeos.general/Prod/b8943092094950437792


[56529.786894] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.583503] INFO: task jbd2/dm-9-8:12760 blocked for more than 120 seconds.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.590640]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.596422] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604389] jbd2/dm-9-8     D ffff88336f373b00     0 12760      2 0x00000000
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604394]  ffff882286b11b28 0000000000000046 ffff883303586000 0000000000013b00
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604398]  ffff882286b11fd8 0000000000013b00 ffff883303586000 ffff88336f374398
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604401]  ffff88343ffa70e8 0000000000000002 ffffffff81155d00 ffff882286b11ba0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604404] Call Trace:
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604412]  [<ffffffff81155d00>] ? wait_on_page_read+0x60/0x60
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604422]  [<ffffffff8173bcbd>] io_schedule+0x9d/0x130
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604427]  [<ffffffff81155d0e>] sleep_on_page+0xe/0x20
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604430]  [<ffffffff8173c134>] __wait_on_bit+0x64/0x90
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604433]  [<ffffffff81155acf>] wait_on_page_bit+0x7f/0x90
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604437]  [<ffffffff810b0d00>] ? autoremove_wake_function+0x40/0x40
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604440]  [<ffffffff81155bd9>] filemap_fdatawait_range+0xf9/0x190
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604442]  [<ffffffff81155c97>] filemap_fdatawait+0x27/0x30
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604447]  [<ffffffff812949ef>] jbd2_journal_commit_transaction+0xb0f/0x1ba0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604452]  [<ffffffff81079ecf>] ? try_to_del_timer_sync+0x4f/0x70
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604455]  [<ffffffff8129934d>] kjournald2+0xbd/0x240
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604463]  [<ffffffff810b0cc0>] ? prepare_to_wait_event+0x100/0x100
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604468]  [<ffffffff81299290>] ? commit_timeout+0x10/0x10
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604471]  [<ffffffff81090c9b>] kthread+0xcb/0xf0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604476]  [<ffffffff81090bd0>] ? kthread_create_on_node+0x1c0/0x1c0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604480]  [<ffffffff8174850e>] ret_from_fork+0x6e/0xa0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604482]  [<ffffffff81090bd0>] ? kthread_create_on_node+0x1c0/0x1c0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.604523] INFO: task rm:23386 blocked for more than 120 seconds.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.610835]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.616613] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624578] rm              D ffff88336f493b00     0 23386  23368 0x00000000
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624583]  ffff88016aea5c68 0000000000000086 ffff8819f1c91800 0000000000013b00
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624586]  ffff88016aea5fd8 0000000000013b00 ffff8819f1c91800 ffff88336f494398
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624590]  ffff88343ffb0ce8 0000000000000002 ffffffff81155d00 ffff88016aea5ce0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624593] Call Trace:
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624600]  [<ffffffff81155d00>] ? wait_on_page_read+0x60/0x60
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624612]  [<ffffffff8173bcbd>] io_schedule+0x9d/0x130
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624616]  [<ffffffff81155d0e>] sleep_on_page+0xe/0x20
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624618]  [<ffffffff8173c134>] __wait_on_bit+0x64/0x90
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624621]  [<ffffffff81155acf>] wait_on_page_bit+0x7f/0x90
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624625]  [<ffffffff810b0d00>] ? autoremove_wake_function+0x40/0x40
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624630]  [<ffffffff8116582e>] truncate_inode_pages_range+0x2fe/0x5a0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624632]  [<ffffffff81165ae5>] truncate_inode_pages+0x15/0x20
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624637]  [<ffffffff8124f091>] ext4_evict_inode+0x131/0x510
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624640]  [<ffffffff811e0d03>] evict+0xb3/0x1b0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624649]  [<ffffffff811e1518>] iput+0xf8/0x190
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624653]  [<ffffffff811d5e3e>] do_unlinkat+0x18e/0x2b0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624658]  [<ffffffff811c71ae>] ? ____fput+0xe/0x10
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624662]  [<ffffffff8108d9c4>] ? task_work_run+0xb4/0xd0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624667]  [<ffffffff81014ed7>] ? do_notify_resume+0x97/0xb0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624670]  [<ffffffff811d6d6b>] SyS_unlinkat+0x1b/0x40
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624673]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.624682] INFO: task x86_64-cros-lin:24543 blocked for more than 120 seconds.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.632129]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.637919] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645884] x86_64-cros-lin D ffffffff81155d00     0 24543  24042 0x00000000
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645890]  ffff880122de1b38 0000000000000082 ffff88043f8a4800 0000000000013b00
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645896]  ffff880122de1fd8 0000000000013b00 ffff88043f8a4800 ffff88336f554398
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645900]  ffff88343ffb12e8 0000000000000002 ffffffff81155d00 ffff880122de1bb0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645904] Call Trace:
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645911]  [<ffffffff81155d00>] ? wait_on_page_read+0x60/0x60
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645917]  [<ffffffff8173bcbd>] io_schedule+0x9d/0x130
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645924]  [<ffffffff81155d0e>] sleep_on_page+0xe/0x20
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645926]  [<ffffffff8173c134>] __wait_on_bit+0x64/0x90
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645930]  [<ffffffff81155acf>] wait_on_page_bit+0x7f/0x90
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645936]  [<ffffffff810b0d00>] ? autoremove_wake_function+0x40/0x40
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645940]  [<ffffffff8116582e>] truncate_inode_pages_range+0x2fe/0x5a0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645943]  [<ffffffff81165ae5>] truncate_inode_pages+0x15/0x20
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645947]  [<ffffffff8124f091>] ext4_evict_inode+0x131/0x510
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645950]  [<ffffffff811e0d03>] evict+0xb3/0x1b0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645953]  [<ffffffff811e1518>] iput+0xf8/0x190
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645957]  [<ffffffff811dc3c8>] __dentry_kill+0x198/0x1f0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645960]  [<ffffffff811dc4cd>] dput+0xad/0x190
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645962]  [<ffffffff811d61f4>] SYSC_renameat+0x294/0x420
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645967]  [<ffffffff8118772c>] ? vm_munmap+0x4c/0x60
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645969]  [<ffffffff811d74eb>] SyS_rename+0x1b/0x20
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645973]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.645976] INFO: task x86_64-cros-lin:24567 blocked for more than 120 seconds.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.653427]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.659329] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667279] x86_64-cros-lin D ffff88336f253b00     0 24567  24173 0x00000000
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667284]  ffff88331b919c80 0000000000000082 ffff88013c059800 0000000000013b00
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667288]  ffff88331b919fd8 0000000000013b00 ffff88013c059800 ffff88010d1cb728
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667291]  ffff88010d1cb72c ffff88013c059800 00000000ffffffff ffff88010d1cb730
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667294] Call Trace:
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667301]  [<ffffffff8173bea9>] schedule_preempt_disabled+0x29/0x70
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667310]  [<ffffffff8173dcf5>] __mutex_lock_slowpath+0x135/0x1b0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667314]  [<ffffffff8173dd8f>] mutex_lock+0x1f/0x2f
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667319]  [<ffffffff811d40c0>] do_last+0x2c0/0x1370
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667324]  [<ffffffff812e251c>] ? security_file_alloc+0x1c/0x20
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667326]  [<ffffffff811d522b>] path_openat+0xbb/0x650
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667329]  [<ffffffff811d662a>] do_filp_open+0x3a/0x90
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667336]  [<ffffffff811e3517>] ? __alloc_fd+0xa7/0x130
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667341]  [<ffffffff811c4739>] do_sys_open+0x129/0x2a0
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667344]  [<ffffffff811c48ce>] SyS_open+0x1e/0x20
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667349]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.667352] INFO: task x86_64-cros-lin:24576 blocked for more than 120 seconds.
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.674783]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:33 swarm-cros-5 kernel: [56529.680563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688522] x86_64-cros-lin D ffff88336f393b00     0 24576  23991 0x00000000
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688526]  ffff8801529a3cf0 0000000000000086 ffff88013436c800 0000000000013b00
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688530]  ffff8801529a3fd8 0000000000013b00 ffff88013436c800 ffff88010d1c8158
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688533]  ffff88010d1c815c ffff88013436c800 00000000ffffffff ffff88010d1c8160
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688536] Call Trace:
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688543]  [<ffffffff8173bea9>] schedule_preempt_disabled+0x29/0x70
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688552]  [<ffffffff8173dcf5>] __mutex_lock_slowpath+0x135/0x1b0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688559]  [<ffffffff811c7a69>] ? __sb_start_write+0x49/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688561]  [<ffffffff8173dd8f>] mutex_lock+0x1f/0x2f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688564]  [<ffffffff811d0460>] lock_rename+0xc0/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688567]  [<ffffffff811d610f>] SYSC_renameat+0x1af/0x420
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688572]  [<ffffffff8118772c>] ? vm_munmap+0x4c/0x60
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688575]  [<ffffffff811d74eb>] SyS_rename+0x1b/0x20
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688578]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.688585] INFO: task x86_64-cros-lin:24578 blocked for more than 120 seconds.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.696015]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.701803] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709769] x86_64-cros-lin D ffff88336f3d3b00     0 24578  24232 0x00000000
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709774]  ffff883304d8bb38 0000000000000086 ffff8801393e3000 0000000000013b00
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709778]  ffff883304d8bfd8 0000000000013b00 ffff8801393e3000 ffff88336f3d4398
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709781]  ffff88343ffa4ce8 0000000000000002 ffffffff81155d00 ffff883304d8bbb0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709784] Call Trace:
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709792]  [<ffffffff81155d00>] ? wait_on_page_read+0x60/0x60
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709797]  [<ffffffff8173bcbd>] io_schedule+0x9d/0x130
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709805]  [<ffffffff81155d0e>] sleep_on_page+0xe/0x20
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709811]  [<ffffffff8173c134>] __wait_on_bit+0x64/0x90
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709813]  [<ffffffff81155acf>] wait_on_page_bit+0x7f/0x90
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709818]  [<ffffffff810b0d00>] ? autoremove_wake_function+0x40/0x40
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709822]  [<ffffffff8116582e>] truncate_inode_pages_range+0x2fe/0x5a0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709824]  [<ffffffff81165ae5>] truncate_inode_pages+0x15/0x20
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709829]  [<ffffffff8124f091>] ext4_evict_inode+0x131/0x510
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709832]  [<ffffffff811e0d03>] evict+0xb3/0x1b0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709840]  [<ffffffff811e1518>] iput+0xf8/0x190
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709846]  [<ffffffff811dc3c8>] __dentry_kill+0x198/0x1f0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709848]  [<ffffffff811dc4cd>] dput+0xad/0x190
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709853]  [<ffffffff811d61f4>] SYSC_renameat+0x294/0x420
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709858]  [<ffffffff8118772c>] ? vm_munmap+0x4c/0x60
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709860]  [<ffffffff811d74eb>] SyS_rename+0x1b/0x20
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709864]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.709866] INFO: task x86_64-cros-lin:24608 blocked for more than 120 seconds.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.717303]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.723086] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731047] x86_64-cros-lin D ffff88336f373b00     0 24608  24154 0x00000000
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731052]  ffff883306e99cf0 0000000000000082 ffff8810d52b6000 0000000000013b00
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731056]  ffff883306e99fd8 0000000000013b00 ffff8810d52b6000 ffff88010d1c8158
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731059]  ffff88010d1c815c ffff8810d52b6000 00000000ffffffff ffff88010d1c8160
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731062] Call Trace:
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731069]  [<ffffffff8173bea9>] schedule_preempt_disabled+0x29/0x70
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731072]  [<ffffffff8173dcf5>] __mutex_lock_slowpath+0x135/0x1b0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731077]  [<ffffffff811c7a69>] ? __sb_start_write+0x49/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731079]  [<ffffffff8173dd8f>] mutex_lock+0x1f/0x2f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731082]  [<ffffffff811d0460>] lock_rename+0xc0/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731085]  [<ffffffff811d610f>] SYSC_renameat+0x1af/0x420
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731090]  [<ffffffff8118772c>] ? vm_munmap+0x4c/0x60
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731092]  [<ffffffff811d74eb>] SyS_rename+0x1b/0x20
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731096]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.731099] INFO: task x86_64-cros-lin:24609 blocked for more than 120 seconds.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.738557]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.744343] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752324] x86_64-cros-lin D ffff88336f233b00     0 24609  24243 0x00000000
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752329]  ffff880013141cf0 0000000000000086 ffff882e1b678000 0000000000013b00
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752333]  ffff880013141fd8 0000000000013b00 ffff882e1b678000 ffff88010d1c8158
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752336]  ffff88010d1c815c ffff882e1b678000 00000000ffffffff ffff88010d1c8160
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752339] Call Trace:
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752347]  [<ffffffff8173bea9>] schedule_preempt_disabled+0x29/0x70
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752355]  [<ffffffff8173dcf5>] __mutex_lock_slowpath+0x135/0x1b0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752363]  [<ffffffff811c7a69>] ? __sb_start_write+0x49/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752366]  [<ffffffff8173dd8f>] mutex_lock+0x1f/0x2f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752369]  [<ffffffff811d0460>] lock_rename+0xc0/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752372]  [<ffffffff811d610f>] SYSC_renameat+0x1af/0x420
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752376]  [<ffffffff8118772c>] ? vm_munmap+0x4c/0x60
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752379]  [<ffffffff811d74eb>] SyS_rename+0x1b/0x20
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752383]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.752390] INFO: task x86_64-cros-lin:24616 blocked for more than 120 seconds.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.759826]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.765626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773585] x86_64-cros-lin D ffff88336f2d3b00     0 24616  24256 0x00000000
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773590]  ffff88003632bcf0 0000000000000086 ffff8830ff65b000 0000000000013b00
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773594]  ffff88003632bfd8 0000000000013b00 ffff8830ff65b000 ffff88010d1c8158
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773597]  ffff88010d1c815c ffff8830ff65b000 00000000ffffffff ffff88010d1c8160
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773600] Call Trace:
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773607]  [<ffffffff8173bea9>] schedule_preempt_disabled+0x29/0x70
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773614]  [<ffffffff8173dcf5>] __mutex_lock_slowpath+0x135/0x1b0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773620]  [<ffffffff811c7a69>] ? __sb_start_write+0x49/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773622]  [<ffffffff8173dd8f>] mutex_lock+0x1f/0x2f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773625]  [<ffffffff811d0460>] lock_rename+0xc0/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773628]  [<ffffffff811d610f>] SYSC_renameat+0x1af/0x420
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773633]  [<ffffffff8118772c>] ? vm_munmap+0x4c/0x60
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773636]  [<ffffffff811d74eb>] SyS_rename+0x1b/0x20
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773643]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.773648] INFO: task x86_64-cros-lin:24618 blocked for more than 120 seconds.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.781094]       Not tainted 3.13.0-147-generic #196-Ubuntu
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.786894] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794892] x86_64-cros-lin D ffff88336f253b00     0 24618  24228 0x00000000
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794896]  ffff880036089cf0 0000000000000082 ffff880194499800 0000000000013b00
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794900]  ffff880036089fd8 0000000000013b00 ffff880194499800 ffff88010d1c8158
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794903]  ffff88010d1c815c ffff880194499800 00000000ffffffff ffff88010d1c8160
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794907] Call Trace:
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794914]  [<ffffffff8173bea9>] schedule_preempt_disabled+0x29/0x70
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794918]  [<ffffffff8173dcf5>] __mutex_lock_slowpath+0x135/0x1b0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794928]  [<ffffffff811c7a69>] ? __sb_start_write+0x49/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794933]  [<ffffffff8173dd8f>] mutex_lock+0x1f/0x2f
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794936]  [<ffffffff811d0460>] lock_rename+0xc0/0xe0
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794939]  [<ffffffff811d610f>] SYSC_renameat+0x1af/0x420
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794943]  [<ffffffff8118772c>] ? vm_munmap+0x4c/0x60
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794946]  [<ffffffff811d74eb>] SyS_rename+0x1b/0x20
Jun 21 23:46:34 swarm-cros-5 kernel: [56529.794950]  [<ffffffff817486f0>] system_call_fastpath+0x1a/0x1f
Jun 21 23:47:00 swarm-cros-5 puppet-agent[32316]: Applying configuration version '1529649697'
Jun 21 23:49:25 swarm-cros-5 puppet-agent[32316]: Finished catalog run in 145.24 seconds
Jun 22 00:45:12 swarm-cros-5 puppet-agent[28311]: Retrieving pluginfacts
Jun 22 00:45:12 swarm-cros-5 puppet-agent[28311]: Retrieving plugin
Jun 22 00:45:13 swarm-cros-5 puppet-agent[28311]: Loading facts
Jun 22 00:46:15 swarm-cros-5 puppet-agent[28311]: Caching catalog for swarm-cros-5.c.chromeos-bot.internal
Jun 22 00:46:16 swarm-cros-5 puppet-agent[28311]: Applying configuration version '1529652687'
Jun 22 00:46:20 swarm-cros-5 puppet-agent[28311]: Finished catalog run in 3.78 seconds
Jun 22 01:45:10 swarm-cros-5 puppet-agent[2222]: Retrieving pluginfacts
Jun 22 01:45:10 swarm-cros-5 puppet-agent[2222]: Retrieving plugin
Jun 22 01:45:11 swarm-cros-5 puppet-agent[2222]: Loading facts
Jun 22 01:45:36 swarm-cros-5 puppet-agent[2222]: Caching catalog for swarm-cros-5.c.chromeos-bot.internal
Jun 22 01:45:37 swarm-cros-5 puppet-agent[2222]: Applying configuration version '1529656411'
Jun 22 01:45:40 swarm-cros-5 puppet-agent[2222]: Finished catalog run in 3.92 seconds
Jun 22 02:45:16 swarm-cros-5 puppet-agent[12426]: Retrieving pluginfacts
Jun 22 02:45:16 swarm-cros-5 puppet-agent[12426]: Retrieving plugin
Jun 22 02:45:17 swarm-cros-5 puppet-agent[12426]: Loading facts
Jun 22 02:46:02 swarm-cros-5 puppet-agent[12426]: Caching catalog for swarm-cros-5.c.chromeos-bot.internal
Jun 22 02:46:03 swarm-cros-5 puppet-agent[12426]: Applying configuration version '1529659880'
Jun 22 02:46:08 swarm-cros-5 puppet-agent[12426]: Finished catalog run in 5.24 seconds
Jun 22 03:25:58 swarm-cros-5 dhclient: DHCPREQUEST of 10.0.2.126 on eth0 to 169.254.169.254 port 67 (xid=0x4be5daae)
Jun 22 03:25:58 swarm-cros-5 dhclient: DHCPACK of 10.0.2.126 from 169.254.169.254
Jun 22 03:25:58 swarm-cros-5 dhclient: bound to 10.0.2.126 -- renewal in 33439 seconds.
Jun 22 03:45:39 swarm-cros-5 puppet-agent[2376]: Retrieving pluginfacts
Jun 22 03:45:39 swarm-cros-5 puppet-agent[2376]: Retrieving plugin
Jun 22 03:45:41 swarm-cros-5 puppet-agent[2376]: Loading facts
Jun 22 03:46:15 swarm-cros-5 puppet-agent[2376]: Caching catalog for swarm-cros-5.c.chromeos-bot.internal
Jun 22 03:46:51 swarm-cros-5 puppet-agent[2376]: Applying configuration version '1529663485'
Jun 22 03:47:18 swarm-cros-5 puppet-agent[2376]: Finished catalog run in 26.90 seconds
Fri Jun 22 11:25:55 UTC 2018: shutting down for reboot [up 73284s].
Project Member

Comment 8 by bugdroid1@chromium.org, Jun 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/880ab82f2f2063bc9d7fb1b954eb2f6a540ae759

commit 880ab82f2f2063bc9d7fb1b954eb2f6a540ae759
Author: Don Garrett <dgarrett@google.com>
Date: Tue Jun 26 23:59:31 2018

cros_sdk_lib: Add timeout for CleanupChrootMount.

We are seeing kernel bugs that cause CleanupChrootMount to hang
forever, which can tie up a build machine for up to 24 hours. I'm not
certain if this will unhang the process or not, but it doesn't seem
like it can hurt.

BUG= chromium:855151 
TEST=run_tests

Change-Id: If5aa195dcbc539203af871c912069648ddee8131
Reviewed-on: https://chromium-review.googlesource.com/1112712
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/880ab82f2f2063bc9d7fb1b954eb2f6a540ae759/lib/cros_sdk_lib.py

Project Member

Comment 9 by bugdroid1@chromium.org, Jun 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/91a8cfc0b7666077c68c531b17619ad8a6b28b7c

commit 91a8cfc0b7666077c68c531b17619ad8a6b28b7c
Author: Don Garrett <dgarrett@google.com>
Date: Tue Jun 26 23:59:31 2018

cbuildbot_launch: Add metrics for CleanupChroot stage.

Capture metrics for the CleanupChroot stage so we can see how often it
fails.

BUG= chromium:855151 
TEST=run_tests
     cros tryjob --local success-build

Change-Id: Ia02a6e7a4ef9e0828bfa19477b47f7b7e0fd87c9
Reviewed-on: https://chromium-review.googlesource.com/1112713
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Mike Nichols <mikenichols@chromium.org>

[modify] https://crrev.com/91a8cfc0b7666077c68c531b17619ad8a6b28b7c/scripts/cbuildbot_launch.py

In theory, it might, maybe, be fixed.

If this doesn't work, the next step is much more drastic. We can disable to cleanup and depend on reboots, or we can stop using chroot.img.

I have started testing an update to the image used by ChromeOS bots (to see if we can pull in a kernel update), but the newest images are having unrelated issues.
Blockedon: 855145
Since updating the kernel may be our best bet, I'm blocking on the bug to fix the builder that generates our updated bot images.
Project Member

Comment 12 by bugdroid1@chromium.org, Jun 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e3d2bf14a3ded568e72df70feac71980d34da10c

commit e3d2bf14a3ded568e72df70feac71980d34da10c
Author: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Wed Jun 27 02:57:50 2018

Roll src/third_party/chromite 1b45f0dc972b..47bb72284a37 (3 commits)

https://chromium.googlesource.com/chromiumos/chromite.git/+log/1b45f0dc972b..47bb72284a37


git log 1b45f0dc972b..47bb72284a37 --date=short --no-merges --format='%ad %ae %s'
2018-06-26 dgarrett@google.com lint: change type() checks to use isinstance() instead
2018-06-26 dgarrett@google.com cbuildbot_launch: Add metrics for CleanupChroot stage.
2018-06-26 dgarrett@google.com cros_sdk_lib: Add timeout for CleanupChrootMount.


Created with:
  gclient setdep -r src/third_party/chromite@47bb72284a37

The AutoRoll server is located here: https://chromite-chromium-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.



BUG=chromium:None,chromium:855151,chromium:855151
TBR=chrome-os-gardeners@chromium.org

Change-Id: I7a982eceeabd1233818cc7545bc7cdcf2c04cefd
Reviewed-on: https://chromium-review.googlesource.com/1116082
Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#570647}
[modify] https://crrev.com/e3d2bf14a3ded568e72df70feac71980d34da10c/DEPS

I upgraded the grunt-paladin builder to the 4.4 HWE kernel from Ubuntu a few weeks ago.  If that machine hasn't seen any more hangs (it hadn't before I went on leave), that might be worth pushing to the rest.  I hadn't seen any other bad side effects of the kernel upgrade.
Working on being able to generate new GCE images again (to fix an unrelated bug). After that's fixed, I think we should update the kernel on the image, and update all of our builders.
If that ends up taking longer than planned, you could probably also push out the equivalent of this with puppet to get the kernel everywhere:

apt-get install --install-recommends linux-generic-lts-xenial
Blockedon: 851138
Cc: wzang@chromium.org akes...@chromium.org gmeinke@chromium.org gu...@chromium.org jclinton@chromium.org
 Issue 859766  has been merged into this issue.
Looks like this is still happening but at least we are killing it after 4 minutes instead of 24 hours.

We might have to accelerate the kernel rollout.

We are also seeing problems with chroot cleanup during the start of the build, AND I'm having trouble getting an updated builder image that works.

I'm thinking the best thing is to disable chroot.img until this can be sorted out again.
Sounds good to me. This just became a higher priority, because it's now making a bunch of the non-informational Chrome PFQ builders also fail.
I created this bug last night, which similar, except that is can also commonly happen at the start of the build:  https://crbug.com/859984 

It had a totally different cause, and I've chumped in a hopeful fix.
Blockedon: 860017
Cc: nvaccaro@chromium.org djkurtz@chromium.org dgarr...@chromium.org shuqianz@chromium.org bhthompson@chromium.org xzhou@chromium.org mgild@chromium.org itspeter@chromium.org hywu@chromium.org sjg@chromium.org ihf@chromium.org philipchen@chromium.org aaboagye@chromium.org josa...@chromium.org cra...@chromium.org gwendal@chromium.org
 Issue 818874  has been merged into this issue.
issue 860017 is requesting an upgrade of the kernel version which would much quicker than rev'ing the OS image.
Cc: cmt...@chromium.org laszio@chromium.org
 Issue 860813  has been merged into this issue.
Mergedinto: 860508
Status: Duplicate (was: Started)
Based on analysis in issue 860508, I believe that the original kernel issue is solved and we are now facing a Swarming bug that just so happens to affect the exit status recorded on CleanupChroot because it happens to be the last one.
Thanks, that explanation seems likely to me too: when I've seen apparent CleanupChroot failures most recently, during the middle of the run the build page was always displaying the currently-running stage in purple.

Sign in to add a comment