memory-infra never worked on ref build with win-high-dpi bot
Issue description
We are seeing "regressions" in some test cases, but there are no ref results to compare with. Example link: https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgIDg9qTp6AsM
,
Jun 19 2017
Not sure they have ever been passing: https://build.chromium.org/p/chromium.perf/builders/Win%2010%20High-DPI%20Perf/builds/717/steps/media.tough_video_cases_tbmv2.reference%20on%20Intel%20GPU%20on%20Windows%20on%20Windows-10-10240
Traceback (most recent call last):
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\story_runner.py", line 99, in _RunStoryAndProcessErrorIfNeeded
    state.RunStory(results)
  File "c:\b\s\w\ir\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\page\shared_page_state.py", line 296, in RunStory
    self._current_page.Run(self)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\page\__init__.py", line 112, in Run
    self.RunPageInteractions(action_runner)
  File "c:\b\s\w\ir\tools\perf\page_sets\tough_video_cases.py", line 78, in RunPageInteractions
    self.PlayAction(action_runner)
  File "c:\b\s\w\ir\tools\perf\page_sets\tough_video_cases.py", line 46, in PlayAction
    action_runner.MeasureMemory()
  File "c:\b\s\w\ir\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\actions\action_runner.py", line 160, in MeasureMemory
    raise exceptions.StoryActionError('Unable to obtain memory dump')
StoryActionError: Unable to obtain memory dump
,
Jun 19 2017
https://github.com/catapult-project/catapult/blob/05e3d606dd506f979ba545e7d1459a7cee1afa34/common/py_utils/py_utils/chrome_binaries.json
"win_AMD64": {
  "cloud_storage_hash": "2348f9bcf421fa4739493a12b4c8e3210a528d84",
  "download_path": "bin\\reference_build\\chrome-win64-pgo.zip",
  "path_within_archive": "chrome-win64-pgo\\chrome.exe",
  "version_in_cs": "58.0.3004.0"
},
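For anyone wanting to check which milestone the pinned reference build corresponds to, here is a minimal sketch that reads the entry quoted above. It assumes only the fragment shown in this comment; the outer braces and the helper name `reference_major_version` are added for illustration and are not part of catapult.

```python
import json

# The entry quoted in this comment, wrapped in braces so it parses as JSON.
# The wrapper is an assumption; the real file's surrounding layout is not shown here.
ENTRY_JSON = r'''
{
  "win_AMD64": {
    "cloud_storage_hash": "2348f9bcf421fa4739493a12b4c8e3210a528d84",
    "download_path": "bin\\reference_build\\chrome-win64-pgo.zip",
    "path_within_archive": "chrome-win64-pgo\\chrome.exe",
    "version_in_cs": "58.0.3004.0"
  }
}
'''


def reference_major_version(entry):
    """Return the major (milestone) number from a version_in_cs string."""
    return int(entry["version_in_cs"].split(".")[0])


entry = json.loads(ENTRY_JSON)["win_AMD64"]
print(reference_major_version(entry))
```

For the entry above this yields milestone 58, which is the build the failing reference runs were using.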
,
Jun 19 2017
On the reference build, TBMv2 memory metrics are failing with multiple benchmarks on Win 10 High-DPI. For example:
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_10_High-DPI_Perf%2F717%2F%2B%2Frecipes%2Fsteps%2Fmedia.tough_video_cases_tbmv2.reference_on_Intel_GPU_on_Windows_on_Windows-10-10240%2F0%2Fstdout
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_10_High-DPI_Perf%2F717%2F%2B%2Frecipes%2Fsteps%2Fmemory.desktop.reference_on_Intel_GPU_on_Windows_on_Windows-10-10240%2F0%2Fstdout
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_10_High-DPI_Perf%2F717%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.memory_desktop.reference_on_Intel_GPU_on_Windows_on_Windows-10-10240%2F0%2Fstdout
The issue doesn't appear to be specific to media benchmarks. Have there been any recent changes in memory metrics that make them incompatible with older Chrome builds? Is it time to update the reference build?
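The log links above percent-encode the stream path, which makes it hard to see at a glance which build and step each one points to. A minimal sketch for decoding them with the Python standard library follows. The helper name `logdog_stream` is hypothetical, and splitting the path on `/+/` into a prefix and a stream name is an assumption based on how these particular URLs are laid out.

```python
from urllib.parse import parse_qs, urlsplit


def logdog_stream(url):
    """Return the decoded value of the 's' query parameter of a log URL."""
    query = parse_qs(urlsplit(url).query)
    return query["s"][0]


url = (
    "https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf"
    "%2FWin_10_High-DPI_Perf%2F717%2F%2B%2Frecipes%2Fsteps"
    "%2Fmemory.desktop.reference_on_Intel_GPU_on_Windows_on_Windows-10-10240"
    "%2F0%2Fstdout"
)

stream = logdog_stream(url)
# Assumed layout: <prefix>/+/<stream name>, with the build number last in the prefix.
prefix, _, name = stream.partition("/+/")
build_number = prefix.rsplit("/", 1)[-1]
step = name.split("/")[2]

print(build_number)
print(step)
```

Running it over each link makes it easy to confirm that all of them come from the same build (717) and differ only in the benchmark step.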
,
Jun 19 2017
Hector: can you help triage the memory metrics problem?
,
Jun 20 2017
Seems we have had similar failures on that bot since at least Feb 23rd: https://build.chromium.org/p/chromium.perf/builders/Win%2010%20High-DPI%20Perf/builds/345/steps/system_health.memory_desktop.reference%20on%20Intel%20GPU%20on%20Windows%20on%20Windows-10-10240. Before that, the ref system_health desktop benchmark was disabled (charilea re-enabled it on the 22nd: https://chromium.googlesource.com/chromium/src/+/ce14e3803fdecf1c65a6bdfe37e58299b4779db0).
I've been unable to find any example where we succeeded in getting metrics from the reference build on that bot, and the dashboard seems to agree that this never worked: https://chromeperf.appspot.com/report?sid=af23f03c9983c8bcc84c3025176324cfd6c7ec37787793e6234e2bbfff4890e6 (ref bottom, ToT top, 1 is the expected result), as mentioned above in #2.
Of all the Windows bots where I could load load_news_bbc_ref, only win-high-dpi had this problem: https://chromeperf.appspot.com/report?sid=82e2851d922bf249a2969621faa2093b87fe04e3e8bc254f2d475a906257dd6d&start_rev=414703&end_rev=480703&rev=480703
The error message suggests the problem is actually in Chrome, not in telemetry or traceviewer. Examining two uploaded traces:
Ref: https://hjd.users.x20web.corp.google.com/www/2017-06-20-ref-build-not-working-crbug734627/load_news_bbc_ref.html
ToT: https://hjd.users.x20web.corp.google.com/www/2017-06-20-ref-build-not-working-crbug734627/load_news_bbc_tot.html
The ref build trace is missing any information about the GPU process, which seems bad. Comparing with a ref build trace from a non-memory system health benchmark:
Non-memory ref: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_9-2017-06-20_05-22-43-91896.html
We see that it still has information about the GPU. This suggests memory-infra is doing something pretty bad to the GPU process :( On the plus side, whatever the problem is seems to have been fixed (since ToT works). On the downside, I haven't been able to find a bug that looks like a likely cause.
It seems like we just hit some super edge case bug only on this specific config :( I think we won't be able to fix this without rolling the builds.
,
Jun 20 2017
,
Jun 21 2017
It's a known issue that ref builds do not work on Windows (for the same reason as https://github.com/catapult-project/catapult/issues/2610). That's why the benchmark was disabled on the reference build, and it was only re-enabled once failures on reference builds were treated as non-fatal (orange). The fix is to update the reference build.
,
Oct 3 2017
What's the next step here?
,
Jan 12 2018
,
Apr 23 2018
Is there any update for this bug?
,
Apr 23 2018
,
Apr 25 2018
It looks like the ref build has been updated (https://github.com/catapult-project/catapult/blob/master/common/py_utils/py_utils/chrome_binaries.json#L77) and we've been seeing ref build data since 2017/08. See https://chromeperf.appspot.com/report?sid=a1c5eb216fa723bb2ce7cfe4bd5984e5adc24c9ad87c65ba0ba74341980ae852&start_rev=474842&end_rev=549949
Comment 1 by crouleau@chromium.org
, Jun 19 2017