System_health smoke tests are super flaky on Windows platform
Issue description

Almost all the system health smoke tests are flaky on Windows platforms:
http://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=telemetry_perf_unittests&builder=chromium.win%3AWin%207%20Tests%20x64%20(1)
http://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=telemetry_perf_unittests&builder=chromium.win%3AWin7%20Tests%20(1)

Looking at the log from an example run (https://chromium-swarm.appspot.com/task?id=33cfb54968e92d10&refresh=10&show_raw=1), the flakes seem to be caused by a failure to run "clear_system_cache.exe":

[9/28] benchmarks.system_health_smoke_test.SystemHealthBenchmarkSmokeTest.system_health.memory_desktop.load:news:flipboard failed unexpectedly 34.2270s:
[ RUN ] load:news:flipboard
Traceback (most recent call last):
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\telemetry\telemetry\internal\story_runner.py", line 87, in _RunStoryAndProcessErrorIfNeeded
    state.WillRunStory(story)
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\telemetry\telemetry\page\shared_page_state.py", line 214, in WillRunStory
    self._StartBrowser(page)
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\telemetry\telemetry\page\shared_page_state.py", line 174, in _StartBrowser
    self._browser = self._possible_browser.Create(self._finder_options)
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\telemetry\telemetry\internal\backends\chrome\desktop_browser_finder.py", line 68, in Create
    browser_backend, self._platform_backend, self._credentials_path)
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\telemetry\telemetry\internal\browser\browser.py", line 50, in __init__
    self._browser_backend.browser_directory)
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\telemetry\telemetry\core\platform.py", line 185, in FlushSystemCacheForDirectory
    return self._platform_backend.FlushSystemCacheForDirectory(directory)
  File "e:\b\swarm_slave\w\irtfvnfx\third_party\catapult\telemetry\telemetry\internal\platform\desktop_platform_backend.py", line 24, in FlushSystemCacheForDirectory
    subprocess.check_call([flush_command, '--recurse', directory])
  File "e:\b\depot_tools\python276_bin\lib\subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '[u'e:\\b\\swarm_slave\\w\\irtfvnfx\\third_party\\catapult\\telemetry\\telemetry\\internal\\bin\\win\\AMD64\\clear_system_cache.exe', '--recurse', 'e:\\b\\swarm_slave\\w\\irtfvnfx\\out\\Release']' returned non-zero exit status -2147483645
[ FAILED ] load:news:flipboard (8919 ms)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] load:news:flipboard
1 FAILED TEST

I think the next step to verify this theory is to only run telemetry_perf_unittests with the "--jobs=1" flag on the Windows bots and see if it fixes the flakiness. If it does, we may consider not running Telemetry tests in parallel on Windows.
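A side note on the exit status in the log: negative values like -2147483645 are easier to look up once converted to unsigned 32-bit hex. The helper below is only a sketch; the name to_ntstatus is made up for illustration, and it assumes the value is a signed 32-bit view of a Windows NTSTATUS-style code.

```python
def to_ntstatus(exit_status):
    """Render a signed 32-bit exit status as unsigned hex.

    Hypothetical helper (not part of Telemetry). Assumes the value reported
    by subprocess is a signed view of a 32-bit Windows status code.
    """
    return '0x%08X' % (exit_status & 0xFFFFFFFF)


# The exit status from the CalledProcessError in the log.
print(to_ntstatus(-2147483645))  # prints 0x80000003
```

0x80000003 is the value Microsoft's NTSTATUS listing gives for STATUS_BREAKPOINT. If that reading applies here, clear_system_cache.exe may be hitting a breakpoint or assertion rather than returning an ordinary error code, but the log alone does not confirm that.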
Mar 31 2017
If we are running that command in parallel across stories, it sounds like a bad idea:

CalledProcessError: Command '[u'e:\\b\\swarm_slave\\w\\irtfvnfx\\third_party\\catapult\\telemetry\\telemetry\\internal\\bin\\win\\AMD64\\clear_system_cache.exe', '--recurse', 'e:\\b\\swarm_slave\\w\\irtfvnfx\\out\\Release']' returned non-zero exit status -2147483645
Mar 31 2017
Thanks. It should be easy enough to disable the system cache flush for smoke testing.
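For illustration, one way this could look is a guard around the flush call. This is a sketch only: the function name flush_system_cache and the enabled and run parameters are hypothetical and not Telemetry's actual API. Only the command shape ([flush_command, '--recurse', directory]) is taken from the traceback above.

```python
import subprocess


def flush_system_cache(flush_command, directory, enabled=True,
                       run=subprocess.check_call):
    """Flush the system cache for `directory` unless disabled.

    Hypothetical wrapper: `enabled` stands in for whatever switch the smoke
    tests would use, and `run` is injectable so the sketch can be exercised
    without the real clear_system_cache.exe binary.
    Returns True if the flush command was invoked, False if it was skipped.
    """
    if not enabled:
        return False
    # Same argument list as the failing call in the traceback.
    run([flush_command, '--recurse', directory])
    return True
```

A smoke test would then call it with enabled=False, so parallel stories never invoke the flush binary at all, while regular benchmark runs keep the default.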
Apr 24 2018
What's the status of this bug?
May 3 2018
Closing this as Archived - realistically, this would have gotten more attention since Mar 31 2017 if it were still a big problem.
Comment 1 by etienneb@chromium.org, Mar 30 2017