CleanupStage failed "umount" after a fresh reboot.
Issue description

This build:
https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8931960173807897536

ran on a bot that had rebooted after its previous build (see https://crbug.com/895955#c5). However, the cleanup stage was unable to delete the chroot because of "device is busy". That normally means the chroot is still mounted from a previous build.

06:56:00: INFO: Deleting chroot.
umount: /b/swarming/w/ir/cache/cbuild/repository/chroot: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
06:56:07: ERROR: <class 'chromite.lib.cros_build_lib.RunCommandError'>: return code: 1; command: sudo -n 'CROS_CACHEDIR=/b/swarming/w/ir/cache/cbuild/repository/.cache' 'CROS_SUDO_KEEP_ALIVE=unknown' -- umount -d /b/swarming/w/ir/cache/cbuild/repository/chroot
cmd=['sudo', '-n', 'CROS_CACHEDIR=/b/swarming/w/ir/cache/cbuild/repository/.cache', 'CROS_SUDO_KEEP_ALIVE=unknown', '--', 'umount', '-d', '/b/swarming/w/ir/cache/cbuild/repository/chroot']
Traceback (most recent call last):
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/parallel.py", line 441, in _Run
    self._task(*self._task_args, **self._task_kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/build_stages.py", line 97, in _DeleteChroot
    cros_sdk_lib.CleanupChrootMount(chroot, delete=True)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/timeout_util.py", line 191, in TimeoutWrapper
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_sdk_lib.py", line 394, in CleanupChrootMount
    osutils.UmountTree(chroot)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/osutils.py", line 902, in UmountTree
    UmountDir(mount_pt, lazy=False, cleanup=False)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/osutils.py", line 864, in UmountDir
    runcmd(cmd, print_cmd=False)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_build_lib.py", line 322, in SudoRunCommand
    return RunCommand(sudo_cmd, **kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_build_lib.py", line 647, in RunCommand
    raise RunCommandError(msg, cmd_result)
RunCommandError: return code: 1; command: sudo -n 'CROS_CACHEDIR=/b/swarming/w/ir/cache/cbuild/repository/.cache' 'CROS_SUDO_KEEP_ALIVE=unknown' -- umount -d /b/swarming/w/ir/cache/cbuild/repository/chroot
cmd=['sudo', '-n', 'CROS_CACHEDIR=/b/swarming/w/ir/cache/cbuild/repository/.cache', 'CROS_SUDO_KEEP_ALIVE=unknown', '--', 'umount', '-d', '/b/swarming/w/ir/cache/cbuild/repository/chroot']

06:56:07: INFO: Translating result <class 'chromite.lib.cros_build_lib.RunCommandError'>: return code: 1; command: sudo -n 'CROS_CACHEDIR=/b/swarming/w/ir/cache/cbuild/repository/.cache' 'CROS_SUDO_KEEP_ALIVE=unknown' -- umount -d /b/swarming/w/ir/cache/cbuild/repository/chroot
cmd=['sudo', '-n', 'CROS_CACHEDIR=/b/swarming/w/ir/cache/cbuild/repository/.cache', 'CROS_SUDO_KEEP_ALIVE=unknown', '--', 'umount', '-d', '/b/swarming/w/ir/cache/cbuild/repository/chroot']
Traceback (most recent call last):
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/parallel.py", line 441, in _Run
    self._task(*self._task_args, **self._task_kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/build_stages.py", line 97, in _DeleteChroot
    cros_sdk_lib.CleanupChrootMount(chroot, delete=True)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/timeout_util.py", line 191, in TimeoutWrapper
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_sdk_lib.py", line 394, in CleanupChrootMount
    osutils.UmountTree(chroot)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/osutils.py", line 902, in UmountTree
    UmountDir(mount_pt, lazy=False, cleanup=False)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/osutils.py", line 864, in UmountDir
    runcmd(cmd, print_cmd=False)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_build_lib.py", line 322, in SudoRunCommand
    return RunCommand(sudo_cmd, **kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_build_lib.py", line 647, in RunCommand
    raise RunCommandError(msg, cmd_result)
RunCommandError: return code: 1; command: sudo -n 'CROS_CACHEDIR=/b/swarming/w/ir/cache/cbuild/repository/.cache' 'CROS_SUDO_KEEP_ALIVE=unknown' -- umount -d /b/swarming/w/ir/cache/cbuild/repository/chroot
cmd=['sudo', '-n', 'CROS_CACHEDIR=/b/swarming/w/ir/cache/cbuild/repository/.cache', 'CROS_SUDO_KEEP_ALIVE=unknown', '--', 'umount', '-d', '/b/swarming/w/ir/cache/cbuild/repository/chroot'] to fail.
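Since "device is busy" usually points at something still mounted under the chroot, one quick check is to list every mount point at or below the chroot path. The following is only a rough sketch, not what chromite's osutils.UmountTree actually does: the helper names (mounts_under, read_mounts) are made up for illustration, and it assumes a Linux-style mount table such as /proc/mounts.

```python
import os
import re


def _unescape(field):
    # /proc/mounts encodes characters such as spaces as octal escapes
    # (for example \040), so decode them before comparing paths.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def mounts_under(chroot, mounts_text):
    """Return mount points at or below chroot, deepest first.

    Hypothetical helper for illustration; not part of chromite.
    """
    root = os.path.normpath(chroot)
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        mount_point = _unescape(fields[1])
        # Match the chroot itself or anything nested inside it, but not
        # sibling directories that merely share a prefix.
        if mount_point == root or mount_point.startswith(root + os.sep):
            found.append(mount_point)
    # Deepest paths first, so nested mounts come before their parents.
    return sorted(found, key=lambda p: p.count(os.sep), reverse=True)


def read_mounts(path="/proc/mounts"):
    # Assumes a Linux host where /proc/mounts is available.
    with open(path) as f:
        return f.read()
```

On a bot this could be called as mounts_under("/b/swarming/w/ir/cache/cbuild/repository/chroot", read_mounts()). An empty result right after a reboot would suggest the busy device is caused by a process rather than a leftover nested mount.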
Nov 6
This looks like another instance of the same failure: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8930597715232349104
Nov 6
Are we really rebooting between runs?
Nov 6
I believe so. This is the bot from #2: https://chrome-swarming.appspot.com/bot?id=swarm-cros-386&selected=1&sort_stats=total%3Adesc
Dec 6
Two more examples of the same kind of failure:
https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8927879807405103712
https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8927879866860366144

I took a look at both and confirmed that they had just rebooted before the failure. I believe that there is actually something still using the chroot when we attempt to unmount it. I think the next step is to do what we did for the "tar source files modified while creating" bug: when this fails, run lsof on the chroot and see what processes are accessing it.
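A rough sketch of that next step, under some assumptions: the function name umount_with_diagnostics and its return shape are hypothetical (this is not existing chromite code), it calls umount directly rather than through sudo as the real stage does, and it assumes lsof is installed on the bot. The runner is passed in so the logic can be exercised without touching real mounts.

```python
import subprocess


def umount_with_diagnostics(chroot, run=subprocess.run):
    """Try to unmount chroot; if that fails, collect lsof output for it.

    Returns (ok, report). Hypothetical helper, not part of chromite.
    """
    result = run(["umount", "-d", chroot], capture_output=True, text=True)
    if result.returncode == 0:
        return True, ""

    # Given a mount point, lsof lists the open files on that file system.
    # lsof may exit non-zero when it finds nothing, so its exit status is
    # deliberately not checked here.
    lsof = run(["lsof", chroot], capture_output=True, text=True)
    report = "umount failed: %s\nlsof output:\n%s" % (
        result.stderr.strip(),
        lsof.stdout,
    )
    return False, report
```

The report string could then be logged next to the existing RunCommandError, so the build log shows which processes were holding the chroot at the time of the failure.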
Comment 1 by jclinton@google.com, Oct 24
Status: Available (was: Untriaged)