
Issue 773510

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug




telemetry_perf_unittests failing on chromium.mac/Mac10.11 Tests

Project Member Reported by scheib@chromium.org, Oct 10 2017

Issue description

telemetry_perf_unittests failing on chromium.mac/Mac10.11 Tests

Builders failed on: 
- Mac10.11 Tests: 
  https://build.chromium.org/p/chromium.mac/builders/Mac10.11%20Tests

First failing build:
https://uberchromegw.corp.google.com/i/chromium.mac/builders/Mac10.11%20Tests/builds/19094

Change range:
https://chromium.googlesource.com/chromium/src/+log/ddcdc39dadb3fb9ef16cf0d5254dd9eb9312bdc3%5E..a3f957741654460009f09a80f152ab7713ce1123?pretty=fuller&n=

No change looked related to me.
This one would have looked related, except that it appears to be Fuchsia-specific:
tools/fuchsia/run-swarmed.py
https://chromium.googlesource.com/chromium/src/+/a3f957741654460009f09a80f152ab7713ce1123%5E%21/#F0
"""
fuchsia: Improve swarm runner script

- Use multiprocessing to trigger/collect jobs more quickly
- Handle bot errors separate from run failures
- Improve/tersify output
"""

Failure:
"shard #6 timed out, took too much time to complete"
"""
[1/11] benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.browse:media:imgur failed unexpectedly 209.3432s:
  [  PASSED  ] 0 tests.
  
  Downloading WPR archives. This can take a long time.
  All WPR archives are downloaded, took 7.10377001762 seconds.
  Downloading WPR archives. This can take a long time.
  All WPR archives are downloaded, took 88.0626888275 seconds.
  Chrome build location for mac_x86_64 not found. Browser will be run without Flash.
  Chrome build location for mac_x86_64 not found. Browser will be run without Flash.
  Chrome build location for mac_x86_64 not found. Browser will be run without Flash.
  Downloading WPR archives. This can take a long time.
  All WPR archives are downloaded, took 113.351649046 seconds.
  Try printing formatted exception: None None None
  
  Traceback (most recent call last):
    RunBenchmark at /b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py:334
      None
    Run at /b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py:165
      None
    _UpdateAndCheckArchives at /b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py:388
      None
    DownloadArchivesIfNeeded at /b/s/w/ir/third_party/catapult/telemetry/telemetry/wpr/archive_info.py:107
      download_if_needed(archive_path)
    download_if_needed at /b/s/w/ir/third_party/catapult/telemetry/telemetry/wpr/archive_info.py:86
      cloud_storage.GetIfChanged(path, self._bucket)
    GetIfChanged at /b/s/w/ir/third_party/catapult/common/py_utils/py_utils/cloud_storage.py:423
      None
    __exit__ at /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py:24
      self.gen.next()
    _FileLock at /b/s/w/ir/third_party/catapult/common/py_utils/py_utils/cloud_storage.py:273
      None
    WaitFor at /b/s/w/ir/third_party/catapult/common/py_utils/py_utils/__init__.py:148
      None
  TimeoutException: Timed out while waiting 10s for <lambda>.
  
  Locals:
    GetConditionString       : <function GetConditionString at 0x1081f2de8>
    condition                : <function <lambda> at 0x1081f2e60>
    elapsed_time             : 10.08633804321289
    last_output_elapsed_time : 10.08633804321289
    last_output_time         : 1507667053.694131
    now                      : 1507667063.780469
    poll_interval            : 1.008633804321289
    res                      : False
    start_time               : 1507667053.694131
    timeout                  : 10
  
  Traceback (most recent call last):
    File "/b/s/w/ir/tools/perf/benchmarks/system_health_smoke_test.py", line 124, in RunTest
      msg='Failed: %s' % benchmark_class)
  AssertionError: Failed: <class 'benchmarks.system_health.DesktopMemorySystemHealth'>
"""
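
For readers unfamiliar with this failure shape, here is a minimal sketch of a poll-until-true helper that raises once a deadline passes, assuming the condition is "the lock could be acquired". It is not the py_utils implementation: `wait_for`, `TimeoutException` as defined here, `busy_lock`, and the polling interval are all illustrative, and an in-process `threading.Lock` stands in for the file lock taken in cloud_storage.py.

```python
import threading
import time


class TimeoutException(Exception):
    """Raised when the polled condition never becomes true (illustrative)."""


def wait_for(condition, timeout, poll_interval=0.01):
    """Poll condition() until it is truthy or `timeout` seconds have passed."""
    start_time = time.time()
    while True:
        res = condition()
        if res:
            return res
        elapsed_time = time.time() - start_time
        if elapsed_time >= timeout:
            raise TimeoutException(
                "Timed out while waiting %ss for %s."
                % (timeout, condition.__name__))
        time.sleep(poll_interval)


# Stand-in for a lock that some other worker is still holding.
busy_lock = threading.Lock()
busy_lock.acquire()

try:
    # Non-blocking acquire returns False while the lock is held elsewhere.
    wait_for(lambda: busy_lock.acquire(False), timeout=0.05)
    outcome = "acquired"
except TimeoutException as exc:
    outcome = str(exc)

print(outcome)
```

If the holder never releases the lock within the deadline, every poll returns False and the helper raises, which matches the `res : False` and `timeout : 10` values in the Locals dump above.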
 

Comment 1 by engedy@chromium.org, Oct 13 2017

Cc: primiano@chromium.org
Labels: -Sheriff-Chromium Performance-Sheriff-BotHealth OS-Mac Pri-2 Type-Bug
Owner: dtu@chromium.org
Status: Assigned (was: Untriaged)
This looks quit flaky. The problem went away after 4 builds, but I already found happening at build #19026 (the same error message), so I say the blame range is unclear.

See:
https://chromium-swarm.appspot.com/task?id=391750ccecc9fb10&refresh=10&show_raw=1

Tentatively reassigning to performance (bot) sheriffs, as I don't think it's actionable by off-the-shelf generic sheriffs.

Comment 2 by engedy@chromium.org, Oct 13 2017

Sorry, bunch of typos: quite*, found it*, I'd*

Comment 3 by dtu@chromium.org, Oct 13 2017

Cc: nedngu...@google.com
+nednguyen, SystemHealthBenchmarkSmokeTest does seem quite flaky on this platform. I see a number of failures in the Mac10.11 Tests history.

Sample run:
https://chromium-swarm.appspot.com/task?id=392efb375d562010
Status: WontFix (was: Assigned)
No longer happening

Comment 5 by msw@chromium.org, Oct 23 2017

Status: Assigned (was: WontFix)
This seems to still be happening:
https://uberchromegw.corp.google.com/i/chromium.mac/builders/Mac10.11%20Tests/builds/19506

[3/10] benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.load:news:qq failed unexpectedly 11.2861s:
  Traceback (most recent call last):
    File "/b/s/w/ir/tools/perf/benchmarks/system_health_smoke_test.py", line 110, in RunTest
      story_set = benchmark_class().CreateStorySet(options)
    File "/b/s/w/ir/tools/perf/benchmarks/system_health.py", line 113, in CreateStorySet
      take_memory_measurement=True)
    File "/b/s/w/ir/tools/perf/page_sets/system_health/system_health_stories.py", line 33, in __init__
      self.AddStory(story_class(self, take_memory_measurement))
    File "/b/s/w/ir/tools/perf/page_sets/system_health/accessibility_stories.py", line 25, in __init__
      extra_browser_args=['--force-renderer-accessibility'])
    File "/b/s/w/ir/tools/perf/page_sets/system_health/system_health_story.py", line 71, in __init__
      extra_browser_args=extra_browser_args)
    File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/page/__init__.py", line 47, in __init__
    File "/b/s/w/ir/third_party/catapult/common/py_utils/py_utils/cloud_storage.py", line 413, in GetIfChanged
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
      return self.gen.next()
    File "/b/s/w/ir/third_party/catapult/common/py_utils/py_utils/cloud_storage.py", line 266, in _FileLock
    File "/b/s/w/ir/third_party/catapult/common/py_utils/py_utils/__init__.py", line 148, in WaitFor
  TimeoutException: Timed out while waiting 10s for <lambda>.
Failed to delete /b/s/w/ir (3 files remaining).
  Maybe the test has a subprocess outliving it.
  Sleeping 2 seconds.
Succeeded.
Cc: -nedngu...@google.com perezju@chromium.org
Owner: nedngu...@google.com
The starting error here is:

[3/26] benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.browse:media:pinterest failed unexpectedly 0.0000s:
  Failed to load "benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.browse:media:pinterest" in run_one_test
    loadTestsFromName("benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.browse:media:pinterest") failed: 'module' object has no attribute 'SystemHealthBenchmarkSmokeTest'
    Traceback (most recent call last):
      File "/b/s/w/ir/third_party/catapult/third_party/typ/typ/runner.py", line 853, in _run_one_test
        suite = child.loader.loadTestsFromName(test_name)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/unittest/loader.py", line 100, in loadTestsFromName
        parent, obj = obj, getattr(obj, part)
    AttributeError: 'module' object has no attribute 'SystemHealthBenchmarkSmokeTest'
    
    
    load_via_load_tests("benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.browse:media:pinterest") returned 0 tests

(from https://chromium-swarm.appspot.com/task?id=39613f5598761310&refresh=10&show_raw=1)

I think this has to do with the logic of generating tests in SystemHealthBenchmarkSmokeTest
Cc: dpranke@chromium.org
Dirk:
system_health_smoke_test generates its test cases through https://cs.chromium.org/chromium/src/tools/perf/benchmarks/system_health_smoke_test.py?rcl=1dea5ee048fa7c57e2af0c47ac989ba5b4b25a64&l=163 

What would make typ fail to load "benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.browse:media:pinterest" flakily? Does typ invoke load_tests() once before running all the tests in parallel, or does it invoke this for every test process?

If it is the latter, we can probably fix the flake by just switching telemetry_perf_unittests on chromium.mac/Mac10.11 Tests to use --jobs=1 (related to issue 753495).
Cc: nednguyen@chromium.org phajdan.jr@chromium.org erikc...@chromium.org
Issue 739455 has been merged into this issue.
It does both, first to load the full list of tests, and then second to load each individual test in the appropriate worker subprocess. Using --jobs=1 might help, but I'm not 100% sure (we should still call load_tests() twice).
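
To make the "load_tests() is called more than once" point concrete, here is a small sketch using only the standard `unittest` module. It does not reproduce typ's runner; the module name `fake_smoke_test` and the story names are hypothetical, and the two `loadTestsFromModule` calls merely stand in for the listing pass and the per-worker pass described above.

```python
import types
import unittest


def make_fake_module():
    """Build an in-memory module whose tests exist only via load_tests()."""
    module = types.ModuleType("fake_smoke_test")  # hypothetical name
    module.load_calls = 0

    def load_tests(loader, standard_tests, pattern):
        # unittest's load_tests protocol: runs every time the module is loaded.
        module.load_calls += 1
        suite = unittest.TestSuite()
        for story in ("browse_media_imgur", "load_news_qq"):
            suite.addTest(
                unittest.FunctionTestCase(lambda: None, description=story))
        return suite

    module.load_tests = load_tests
    return module


module = make_fake_module()
loader = unittest.TestLoader()

full_list = loader.loadTestsFromModule(module)    # stands in for the listing pass
worker_copy = loader.loadTestsFromModule(module)  # stands in for a worker pass

# The generated tests are not module attributes, so a name-based lookup of a
# class that was never defined at module level cannot succeed.
try:
    loader.loadTestsFromName("SystemHealthBenchmarkSmokeTest", module)
    lookup_failed = bool(loader.errors)  # Python 3.5+ records the error
except AttributeError:
    lookup_failed = True                 # Python 2.7 raises instead

print(module.load_calls, full_list.countTestCases(), lookup_failed)
```

Because test generation runs again on each load, anything it touches (such as a shared lock or download) is exercised once per pass, which is one way parallel workers could collide.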
Status: WontFix (was: Assigned)
This has not happened in the last 100 builds. I suspect that's because of my switch to "--jobs=1" (issue 753495).
