New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 675667 link

Starred by 1 user

Issue metadata

Status: Archived
Owner:
Last visit > 30 days ago
Closed: Dec 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

site_utils/dut_status_unittest.py is inherently flaky

Project Member Reported by diand...@chromium.org, Dec 19 2016

Issue description

As can be seen in:

  https://uberchromegw.corp.google.com/i/chromeos/builders/stout-paladin/builds/20949/steps/UnitTest/logs/stdio

We sometimes see a failure:

autotest-0.0.2-r7796: Running autotest_lib.site_utils.dut_status_unittest
autotest-0.0.2-r7796: test_all_options (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test that all three options are a fatal error. ... FATAL: Can specify at most two of --since, --until, and --duration
autotest-0.0.2-r7796: ok
autotest-0.0.2-r7796: test_default_time_bounds (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test time bounds when no options are supplied. ... ok
autotest-0.0.2-r7796: test_duration_only (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test time bounds with --duration only. ... ok
autotest-0.0.2-r7796: test_end_and_duration (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test time bounds with --until and --duration. ... ok
autotest-0.0.2-r7796: test_end_only (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test time bounds with --until only. ... ok
autotest-0.0.2-r7796: test_start_and_duration (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test time bounds with --since and --duration. ... ok
autotest-0.0.2-r7796: test_start_and_end (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test time bounds with --since and --until. ... ok
autotest-0.0.2-r7796: test_start_only (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test time bounds with --since only. ... FAIL
autotest-0.0.2-r7796: 
autotest-0.0.2-r7796: ======================================================================
autotest-0.0.2-r7796: FAIL: test_start_only (autotest_lib.site_utils.dut_status_unittest.TimeOptionTests)
autotest-0.0.2-r7796: Test time bounds with --since only.
autotest-0.0.2-r7796: ----------------------------------------------------------------------
autotest-0.0.2-r7796: Traceback (most recent call last):
autotest-0.0.2-r7796:   File "/build/stout/tmp/portage/chromeos-base/autotest-0.0.2-r7796/work/autotest-0.0.2/site_utils/dut_status_unittest.py", line 68, in test_start_only
autotest-0.0.2-r7796:     self._check_default_end(arguments, end_time)
autotest-0.0.2-r7796:   File "/build/stout/tmp/portage/chromeos-base/autotest-0.0.2-r7796/work/autotest-0.0.2/site_utils/dut_status_unittest.py", line 45, in _check_default_end
autotest-0.0.2-r7796:     assert arguments.until <= end_time
autotest-0.0.2-r7796: AssertionError
autotest-0.0.2-r7796: 
autotest-0.0.2-r7796: ----------------------------------------------------------------------
autotest-0.0.2-r7796: Ran 8 tests in 0.013s
autotest-0.0.2-r7796: 
autotest-0.0.2-r7796: FAILED (failures=1)
autotest-0.0.2-r7796: Traceback (most recent call last):
autotest-0.0.2-r7796:   File "./utils/unittest_suite.py", line 287, in <module>
autotest-0.0.2-r7796:     main()
autotest-0.0.2-r7796:   File "./utils/unittest_suite.py", line 274, in main
autotest-0.0.2-r7796:     errors = find_and_run_tests(absolute_start, options)
autotest-0.0.2-r7796:   File "./utils/unittest_suite.py", line 257, in find_and_run_tests
autotest-0.0.2-r7796:     pe.run_until_completion()
autotest-0.0.2-r7796:   File "/build/stout/tmp/portage/chromeos-base/autotest-0.0.2-r7796/work/autotest-0.0.2/utils/parallel.py", line 67, in run_until_completion
autotest-0.0.2-r7796:     self._run(self.ready_to_run.pop())
autotest-0.0.2-r7796:   File "/build/stout/tmp/portage/chromeos-base/autotest-0.0.2-r7796/work/autotest-0.0.2/utils/parallel.py", line 53, in _run
autotest-0.0.2-r7796:     function()
autotest-0.0.2-r7796:   File "./utils/unittest_suite.py", line 247, in <lambda>
autotest-0.0.2-r7796:     run_module = lambda module=module_names: run_test(module, options)
autotest-0.0.2-r7796:   File "./utils/unittest_suite.py", line 152, in run_test
autotest-0.0.2-r7796:     raise TestFailure(msg)
autotest-0.0.2-r7796: __main__.TestFailure: autotest_lib.site_utils.dut_status_unittest had 1 failures and 0 errors.
autotest-0.0.2-r7796: Running autotest_lib.site_utils.lxc_config_unittest

===

Upon inspection this test appears to be inherently flaky.  Specifically it will flake whenever 1 second ticks during the test running.  This is because the test starts out doing:

  end_time = int(time.time())

...and then it later confirms that this result is <= int(time.time())

If the first call to time.time() gives 12345.9999 and the second call 123456.0001 then the test will fail.

===

This can be easily shown by running the test with a patch that looks like this to force the issue:

diff --git a/site_utils/dut_status_unittest.py b/site_utils/dut_status_unittest.py
index 42189c00e194..ecd697617da6 100644
--- a/site_utils/dut_status_unittest.py
+++ b/site_utils/dut_status_unittest.py
@@ -60,7 +60,12 @@ class TimeOptionTests(unittest.TestCase):
         Also tests that --since and -s are equivalent.
 
         """
-        end_time = int(time.time())
+        while True:
+            t = time.time()
+            if (t - int(t)) > .999:
+                break
+
+        end_time = int(t)
         start_time = end_time - 3600
         start_time_string = time_utils.epoch_time_to_date_string(start_time)
         for option in ['--since', '-s']:

===

I ran only this test by hacking:

diff --git a/utils/unittest_suite.py b/utils/unittest_suite.py
index 7cc9d98a7374..f54253ca3ffa 100755
--- a/utils/unittest_suite.py
+++ b/utils/unittest_suite.py
@@ -182,6 +182,8 @@ def scan_for_modules(start, options):
 
         # Look for unittest files.
         for file_name in file_names:
+            if not file_name.endswith('dut_status_unittest.py'):
+                continue
             if (file_name.endswith('_unittest.py') or
                 file_name.endswith('_test.py')):
                 file_path = os.path.join(dir_path, file_name)

...and then in the chroot running:

./utils/unittest_suite.py  --debug

===

I have zero idea what these tests are intended to do, but it is clear that they shouldn't randomly fail for no reason.  Assigning based on the tar baby approach.
 
Status: Assigned (was: Untriaged)
> This can be easily shown by running the test with a
> patch that looks like this to force the issue:

The timing assertion was meant to work no matter how long
was spent between the time that the test called time.time()
and the function under test made the same call.  So, the
flake is a logical bug in the test.

Five'll get you 10 that the proper fix is to change "<=" to
">=" in the assertion.  Except that I'm not in a betting mood,
given that I've already gotten it wrong once...

The following patch reliably demonstrates the bug (and finds all
the test failures):

diff --git a/site_utils/dut_status.py b/site_utils/dut_status.py
index 3c4d71d..d79c72a 100755
--- a/site_utils/dut_status.py
+++ b/site_utils/dut_status.py
@@ -237,6 +237,7 @@ def _validate_time_range(arguments):
                      ArgumentParser.parse_args().
 
     """
+    time.sleep(2.0)
     if (arguments.duration is not None and
             arguments.since is not None and arguments.until is not None):
         print >>sys.stderr, ('FATAL: Can specify at most two of '

Status: Started (was: Assigned)
Project Member

Comment 4 by bugdroid1@chromium.org, Dec 21 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/73fa120db9246f7236ea669dd5c8b0c95c04a8fa

commit 73fa120db9246f7236ea669dd5c8b0c95c04a8fa
Author: Richard Barnette <jrbarnette@chromium.org>
Date: Mon Dec 19 21:55:24 2016

[autotest] Fix dut_status_unittest time comparisons.

The logic for unit testing `dut-status --until` was wrong.  It was
supposed to test the "--until" value was "now" in the sense that
it was in-between two timestamps that bracketed the calculation of
the value.  One of the comparisons was for "<=" instead of ">=",
leading to occasional flake.

This changes to use a mock to force `time.time()` to return a
pre-computed value.

This change also contains other miscellaneous (non-functional)
clean-up of the unit tests.

BUG= chromium:675667 
TEST=run the unit tests using a patch to provoke the bug.

Change-Id: I3b3a6fc10d7715cf83103f67e272ccc3a216efb0
Reviewed-on: https://chromium-review.googlesource.com/422427
Commit-Ready: Richard Barnette <jrbarnette@chromium.org>
Tested-by: Richard Barnette <jrbarnette@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/73fa120db9246f7236ea669dd5c8b0c95c04a8fa/site_utils/dut_status_unittest.py

Status: Fixed (was: Started)

Comment 6 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 7 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 8 by dchan@google.com, May 30 2017

Labels: VerifyIn-60

Comment 9 by dchan@chromium.org, Aug 1 2017

Labels: VerifyIn-61

Comment 10 by dchan@chromium.org, Oct 14 2017

Status: Archived (was: Fixed)

Sign in to add a comment