
Issue 774595

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2017
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




shared_host_dir_unittest.testHostDirCreationAndCleanup is flaky

Project Member Reported by pprabhu@chromium.org, Oct 13 2017

Issue description

Because SharedHostDir.cleanup is racy:

        # It's possible that the directory is no longer mounted (e.g. if the
        # system was rebooted), so check before unmounting.
        utils.run('if findmnt "%(path)s" > /dev/null;'
                  '  then sudo umount "%(path)s";'
                  'fi' %
                  {'path': self.path})
        utils.run('sudo rm -r "%s"' % self.path)


We can't be sure that the umount completes before the 'sudo rm -r' runs.
Sometimes the unmount takes a while, leading to unittest (and real) failures:

autotest-0.0.2-r8709: 
autotest-0.0.2-r8709: ======================================================================
autotest-0.0.2-r8709: ERROR: testHostDirCreationAndCleanup (autotest_lib.site_utils.lxc.shared_host_dir_unittest.SharedHostDirTests)
autotest-0.0.2-r8709: Verifies that the host dir is properly created and cleaned up when
autotest-0.0.2-r8709: ----------------------------------------------------------------------
autotest-0.0.2-r8709: Traceback (most recent call last):
autotest-0.0.2-r8709:   File "/build/guado_moblab/tmp/portage/chromeos-base/autotest-0.0.2-r8709/work/autotest-0.0.2/site_utils/lxc/shared_host_dir_unittest.py", line 45, in testHostDirCreationAndCleanup
autotest-0.0.2-r8709:     host_dir.cleanup()
autotest-0.0.2-r8709:   File "/build/guado_moblab/tmp/portage/chromeos-base/autotest-0.0.2-r8709/work/autotest-0.0.2/site_utils/lxc/shared_host_dir.py", line 69, in cleanup
autotest-0.0.2-r8709:     utils.run('sudo rm -r "%s"' % self.path)
autotest-0.0.2-r8709:   File "/build/guado_moblab/tmp/portage/chromeos-base/autotest-0.0.2-r8709/work/autotest-0.0.2/client/common_lib/utils.py", line 738, in run
autotest-0.0.2-r8709:     "Command returned non-zero exit status")
autotest-0.0.2-r8709: CmdError: Command <sudo rm -r "/build/guado_moblab/tmp/portage/chromeos-base/autotest-0.0.2-r8709/temp/tmpM5YRyX/host"> failed, rc=1, Command returned non-zero exit status
autotest-0.0.2-r8709: * Command: 
autotest-0.0.2-r8709:     sudo rm -r "/build/guado_moblab/tmp/portage/chromeos-
autotest-0.0.2-r8709:     base/autotest-0.0.2-r8709/temp/tmpM5YRyX/host"
autotest-0.0.2-r8709: Exit status: 1
autotest-0.0.2-r8709: Duration: 0.0058171749115

See the autotest unittest failure here:
https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/paladin/builds/3959
 

Comment 1 by kenobi@chromium.org, Oct 13 2017

Is utils.run asynchronous?

Comment 2 by kenobi@chromium.org, Oct 13 2017

To answer my own question: no, but it appears umount is.  =/
After running umount, you should run findmnt with retry + timeout to make sure that the unmount is complete before running 'rm -rf'.

You already check for findmnt before running umount. All you need is a function _ensure_unmount_complete that checks the same thing with some retries.
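
For illustration, the suggested check could look roughly like the sketch below. It is not the code that landed: it uses the Python 3 standard library rather than autotest's utils.run, and the names is_mounted and ensure_unmount_complete, the default timeout and interval, and the injectable mounted/sleep parameters are all assumptions made for this example.

```python
import subprocess
import time


def is_mounted(path):
    """Return True if findmnt reports a mount at path (exit status 0)."""
    result = subprocess.run(['findmnt', path],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def ensure_unmount_complete(path, timeout=10.0, interval=0.1,
                            mounted=is_mounted, sleep=time.sleep):
    """Poll until path is no longer mounted; raise if timeout expires.

    mounted and sleep are injectable (hypothetical parameters) so the
    polling logic can be exercised without a real mount.
    """
    deadline = time.monotonic() + timeout
    while mounted(path):
        if time.monotonic() >= deadline:
            raise RuntimeError(
                '%s is still mounted after %.1f seconds' % (path, timeout))
        sleep(interval)
```

cleanup would call ensure_unmount_complete(self.path) between the umount and the 'sudo rm -r', so the removal only starts once findmnt no longer sees the mount.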

Comment 4 by bugdroid1@chromium.org, Oct 15 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/560ec187695ac033a6d532a30eeabec9de807d72

commit 560ec187695ac033a6d532a30eeabec9de807d72
Author: Ben Kwa <kenobi@google.com>
Date: Sat Oct 14 02:43:06 2017

Temporarily disable a flakey test.

BUG= chromium:774595 
TEST=shared_host_dir_unittest.py (should skip all tests)

Change-Id: I69e505db4997b4bc3e9fc77c61e122e3f77ff591
Reviewed-on: https://chromium-review.googlesource.com/719438
Commit-Ready: Ben Kwa <kenobi@chromium.org>
Tested-by: Ben Kwa <kenobi@chromium.org>
Reviewed-by: Dan Shi <dshi@google.com>

[modify] https://crrev.com/560ec187695ac033a6d532a30eeabec9de807d72/site_utils/lxc/shared_host_dir_unittest.py


Comment 5 by bugdroid1@chromium.org, Oct 25 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/39331d8b0ea2aa829165fea3eef9d57136fac28f

commit 39331d8b0ea2aa829165fea3eef9d57136fac28f
Author: Ben Kwa <kenobi@google.com>
Date: Wed Oct 25 00:36:32 2017

Fix flakey shared host dir behaviour.

It appears that the kernel does not always guarantee that mount points can be
removed immediately after unmounting.  Add retry/timeout code to handle cases
where this occurs.

BUG= chromium:774595 
TEST=shared_host_dir_unittest.py -v

Change-Id: Iabaea8cf507684535806c0901dc98699e836a716
Reviewed-on: https://chromium-review.googlesource.com/722113
Commit-Ready: Ben Kwa <kenobi@chromium.org>
Tested-by: Ben Kwa <kenobi@chromium.org>
Reviewed-by: Ben Kwa <kenobi@chromium.org>

[modify] https://crrev.com/39331d8b0ea2aa829165fea3eef9d57136fac28f/site_utils/lxc/shared_host_dir.py
[modify] https://crrev.com/39331d8b0ea2aa829165fea3eef9d57136fac28f/site_utils/lxc/shared_host_dir_unittest.py
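
The commit message above describes retry/timeout handling around removal of the mount point. A minimal sketch of that general pattern follows; it is not the code from the commit. It assumes Python 3 and shutil.rmtree instead of 'sudo rm -r', and the function name remove_with_retry, its defaults, and the injectable remove/sleep parameters are hypothetical.

```python
import shutil
import time


def remove_with_retry(path, timeout=10.0, interval=0.5,
                      remove=shutil.rmtree, sleep=time.sleep):
    """Try to remove path, retrying on OSError until timeout seconds pass.

    A directory that was a mount point moments ago may still fail to be
    removed (for example with EBUSY), so failures are retried instead of
    being treated as fatal right away. The last error is re-raised once
    the deadline has passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            remove(path)
            return
        except OSError:
            if time.monotonic() >= deadline:
                raise
            sleep(interval)
```

Note that this retries on any OSError, including a missing path, so a caller that expects the directory might already be gone should check for that first.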

Comment 6 by kenobi@chromium.org, Oct 25 2017

Status: Fixed (was: Untriaged)
