I first saw this on the win-high-dpi bot on Friday and, while I've only seen it on Windows thus far, I have no reason to believe that it couldn't also occur on other platforms.
Link to logs: https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_10_High-DPI_Perf%2F192%2F%2B%2Frecipes%2Fsteps%2Fbattor.power_cases_on_Intel_GPU_on_Windows_on_Windows-10-10240%2F0%2Fstdout
Relevant callstack:
"""
Traceback (most recent call last):
File "c:\b\s\w\irlcwumq\third_party\catapult\telemetry\telemetry\internal\story_runner.py", line 86, in _RunStoryAndProcessErrorIfNeeded
test.WillRunStory(state.platform)
File "c:\b\s\w\irlcwumq\third_party\catapult\telemetry\telemetry\web_perf\timeline_based_measurement.py", line 285, in WillRunStory
platform.tracing_controller.StartTracing(self._tbm_options.config)
File "c:\b\s\w\irlcwumq\third_party\catapult\telemetry\telemetry\core\tracing_controller.py", line 43, in StartTracing
self._tracing_controller_backend.StartTracing(tracing_config, timeout)
File "c:\b\s\w\irlcwumq\third_party\catapult\telemetry\telemetry\internal\platform\tracing_controller_backend.py", line 88, in StartTracing
if agent.StartAgentTracing(config, timeout):
File "c:\b\s\w\irlcwumq\third_party\catapult\telemetry\telemetry\internal\platform\tracing_agent\battor_tracing_agent.py", line 73, in StartAgentTracing
self._battor.StartTracing()
File "c:\b\s\w\irlcwumq\third_party\catapult\common\battor\battor\battor_wrapper.py", line 215, in StartTracing
self._FlashBattOr()
File "c:\b\s\w\irlcwumq\third_party\catapult\common\battor\battor\battor_wrapper.py", line 145, in _FlashBattOr
'battor_firmware', 'default')
File "c:\b\s\w\irlcwumq\third_party\catapult\dependency_manager\dependency_manager\manager.py", line 93, in FetchPathWithVersion
path = dependency_info.GetRemotePath()
File "c:\b\s\w\irlcwumq\third_party\catapult\dependency_manager\dependency_manager\dependency_info.py", line 84, in GetRemotePath
return self._cloud_storage_info.GetRemotePath()
File "c:\b\s\w\irlcwumq\third_party\catapult\dependency_manager\dependency_manager\cloud_storage_info.py", line 80, in GetRemotePath
self._cs_hash)
File "c:\b\s\w\irlcwumq\third_party\catapult\common\py_utils\py_utils\cloud_storage.py", line 329, in GetIfHashChanged
with _FileLock(download_path):
File "c:\b\depot_tools\python276_bin\lib\contextlib.py", line 17, in __enter__
return self.gen.next()
File "c:\b\s\w\irlcwumq\third_party\catapult\common\py_utils\py_utils\cloud_storage.py", line 246, in _FileLock
PSEUDO_LOCK_ACQUISITION_TIMEOUT)
File "c:\b\s\w\irlcwumq\third_party\catapult\common\py_utils\py_utils\__init__.py", line 132, in WaitFor
(timeout, GetConditionString()))
TimeoutException: Timed out while waiting 10s for py_utils.WaitFor(lambda: not os.path.exists(pseudo_lock_path),
PSEUDO_LOCK_ACQUISITION_TIMEOUT).
INFO:root:Try printing formatted exception: None None None
Exception raised when cleaning story run:
Traceback (most recent call last):
_RunStoryAndProcessErrorIfNeeded at c:\b\s\w\irlcwumq\third_party\catapult\telemetry\telemetry\internal\story_runner.py:113
state.DidRunStory(results)
traced_function at c:\b\s\w\irlcwumq\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py:75
return func(*args, **kwargs)
DidRunStory at c:\b\s\w\irlcwumq\third_party\catapult\telemetry\telemetry\page\shared_page_state.py:155
if self._current_page.credentials and self._did_login_for_current_page:
AttributeError: 'NoneType' object has no attribute 'credentials'
"""
Basically, it looks like the BattOr firmware lock file isn't being released, which causes the benchmark to time out while waiting for it. The timeout is currently 10 seconds (per the exception above), but because the firmware is small and downloads quickly, I don't expect that increasing the timeout would help. My guess is that an old lock file isn't getting cleaned up, which could happen if Telemetry is killed in the middle of a download, for example.
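To illustrate the suspected failure mode, here is a minimal sketch, not the catapult code itself. It assumes the lock is nothing more than a file whose existence blocks other callers, which is what the `not os.path.exists(pseudo_lock_path)` condition in the traceback suggests. The names `wait_for`, `can_acquire`, and `demo_orphaned_lock` are hypothetical, and the timeout is shortened so the demo runs quickly.

```python
import os
import tempfile
import time


def wait_for(condition, timeout, poll_interval=0.05):
    """Poll condition() until it is truthy or `timeout` seconds elapse."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if condition():
            return True
        time.sleep(poll_interval)
    # One last check after the deadline has passed.
    return bool(condition())


def can_acquire(pseudo_lock_path, timeout):
    """Return True if the lock file disappears within `timeout` seconds."""
    return wait_for(lambda: not os.path.exists(pseudo_lock_path), timeout)


def demo_orphaned_lock(timeout=0.2):
    """Show that a leftover lock file blocks acquisition until it is removed."""
    work_dir = tempfile.mkdtemp()
    lock_path = os.path.join(work_dir, 'battor_firmware.pseudo_lock')

    # Simulate a process that was killed mid-download: the lock file was
    # created but the cleanup code never ran.
    open(lock_path, 'w').close()
    blocked = not can_acquire(lock_path, timeout)

    # Once something deletes the orphaned file, acquisition works again.
    os.remove(lock_path)
    freed = can_acquire(lock_path, timeout)

    os.rmdir(work_dir)
    return blocked, freed
```

In this sketch no amount of waiting helps while the orphaned file exists, which is consistent with the guess that raising the timeout would not fix the bot.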
I think the fix is to make sure these lock files get cleaned up between benchmark runs. Ned suggested that, in the swarming world, putting the lock file in /tmp should give us that cleanup. I'm going to verify this with maruel@ before proceeding with the fix.
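As a rough sketch of that idea, assuming the lock can live in a dedicated directory rather than next to the downloaded file: the helpers below are hypothetical (`lock_path_for`, `remove_leftover_locks`, and the `.pseudo_lock` suffix are illustrative names, not existing catapult APIs), and whether swarming actually clears the temp directory between tasks is the part still to be confirmed.

```python
import hashlib
import os
import tempfile

# Hypothetical suffix used to recognize lock files in the lock directory.
PSEUDO_LOCK_SUFFIX = '.pseudo_lock'


def lock_path_for(download_path, lock_dir=None):
    """Map a download path to a lock file inside lock_dir (default: temp dir)."""
    lock_dir = lock_dir or tempfile.gettempdir()
    # Hash the absolute path so distinct downloads get distinct lock names.
    digest = hashlib.sha1(
        os.path.abspath(download_path).encode('utf-8')).hexdigest()
    return os.path.join(lock_dir, digest + PSEUDO_LOCK_SUFFIX)


def remove_leftover_locks(lock_dir=None):
    """Delete every pseudo lock in lock_dir and return how many were removed.

    Intended to run between benchmark runs, when no download is in flight.
    """
    lock_dir = lock_dir or tempfile.gettempdir()
    removed = 0
    for name in os.listdir(lock_dir):
        if not name.endswith(PSEUDO_LOCK_SUFFIX):
            continue
        try:
            os.remove(os.path.join(lock_dir, name))
            removed += 1
        except OSError:
            # Another process may have removed it already; ignore.
            pass
    return removed
```

Calling `remove_leftover_locks()` while a download is actually in progress would delete a live lock, so this only makes sense at a point where nothing else is fetching dependencies.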
Comment 1 by bugdroid1@chromium.org, Jan 26 2017