
Issue 734627

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug




memory-infra never worked on ref build with win-high-dpi bot

Project Member Reported by xhw...@chromium.org, Jun 19 2017

Issue description

We are seeing "regressions" in some test cases, but there are no ref results to compare with.

Example link:

https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgIDg9qTp6AsM
 
Cc: crouleau@chromium.org
Owner: johnchen@chromium.org
Not sure they have ever been passing:

https://build.chromium.org/p/chromium.perf/builders/Win%2010%20High-DPI%20Perf/builds/717/steps/media.tough_video_cases_tbmv2.reference%20on%20Intel%20GPU%20on%20Windows%20on%20Windows-10-10240

Traceback (most recent call last):
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\story_runner.py", line 99, in _RunStoryAndProcessErrorIfNeeded
    state.RunStory(results)
  File "c:\b\s\w\ir\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\page\shared_page_state.py", line 296, in RunStory
    self._current_page.Run(self)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\page\__init__.py", line 112, in Run
    self.RunPageInteractions(action_runner)
  File "c:\b\s\w\ir\tools\perf\page_sets\tough_video_cases.py", line 78, in RunPageInteractions
    self.PlayAction(action_runner)
  File "c:\b\s\w\ir\tools\perf\page_sets\tough_video_cases.py", line 46, in PlayAction
    action_runner.MeasureMemory()
  File "c:\b\s\w\ir\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\actions\action_runner.py", line 160, in MeasureMemory
    raise exceptions.StoryActionError('Unable to obtain memory dump')
StoryActionError: Unable to obtain memory dump
https://github.com/catapult-project/catapult/blob/05e3d606dd506f979ba545e7d1459a7cee1afa34/common/py_utils/py_utils/chrome_binaries.json        

"win_AMD64": {
  "cloud_storage_hash": "2348f9bcf421fa4739493a12b4c8e3210a528d84",
  "download_path": "bin\\reference_build\\chrome-win64-pgo.zip",
  "path_within_archive": "chrome-win64-pgo\\chrome.exe",
  "version_in_cs": "58.0.3004.0"
},
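
As a rough illustration of how the pinned reference build version could be read from an entry like the one quoted above, here is a minimal sketch. Only the "win_AMD64" entry comes from this thread. The wrapping JSON object, the helper names reference_build_version and major_version, and the "mac_x86_64" key are hypothetical and are not part of catapult.

```python
import json

# Only the "win_AMD64" entry is quoted from the thread. Wrapping it in a bare
# top-level object is an assumption made so the sample parses on its own.
SAMPLE = r'''
{
  "win_AMD64": {
    "cloud_storage_hash": "2348f9bcf421fa4739493a12b4c8e3210a528d84",
    "download_path": "bin\\reference_build\\chrome-win64-pgo.zip",
    "path_within_archive": "chrome-win64-pgo\\chrome.exe",
    "version_in_cs": "58.0.3004.0"
  }
}
'''


def reference_build_version(file_info, platform_key):
    """Return the pinned version string for a platform, or None if absent.

    Hypothetical helper for illustration only.
    """
    entry = file_info.get(platform_key)
    if entry is None:
        return None
    return entry.get("version_in_cs")


def major_version(version):
    """Return the major (milestone) number of a dotted version string."""
    return int(version.split(".")[0])


file_info = json.loads(SAMPLE)
version = reference_build_version(file_info, "win_AMD64")
print(version, major_version(version))
```

A lookup like this makes it easy to see which milestone the reference build is pinned to when comparing it against ToT.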
Cc: perezju@chromium.org nedngu...@google.com
Owner: hjd@chromium.org
Hector: can you help triaging the memory metrics problem?

Comment 6 by hjd@chromium.org, Jun 20 2017

Status: Started (was: Available)
It seems we have had similar failures on that bot since at least Feb 23rd: https://build.chromium.org/p/chromium.perf/builders/Win%2010%20High-DPI%20Perf/builds/345/steps/system_health.memory_desktop.reference%20on%20Intel%20GPU%20on%20Windows%20on%20Windows-10-10240
Prior to this, the ref system_health desktop benchmark was disabled
(charilea re-enabled it on the 22nd: https://chromium.googlesource.com/chromium/src/+/ce14e3803fdecf1c65a6bdfe37e58299b4779db0).

I've been unable to find any example where we succeeded in getting metrics from the reference build on that bot, and the dashboard seems to agree that this never worked, as mentioned above in #2: https://chromeperf.appspot.com/report?sid=af23f03c9983c8bcc84c3025176324cfd6c7ec37787793e6234e2bbfff4890e6 (ref bottom, ToT top, 1 is the expected result).

Of all the Windows bots where I could load load_news_bbc_ref, only win-high-dpi had this problem: https://chromeperf.appspot.com/report?sid=82e2851d922bf249a2969621faa2093b87fe04e3e8bc254f2d475a906257dd6d&start_rev=414703&end_rev=480703&rev=480703

The error message suggests the problem is actually in Chrome, not in telemetry or traceviewer.
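
To make that reasoning concrete, here is a minimal sketch of the failure path implied by the traceback. It is not the telemetry source. StoryActionError below is a stand-in for telemetry's exception class, FakeBrowser is hypothetical, and the assumption that MeasureMemory asks the browser for a dump and raises when it gets nothing back is inferred only from the error message.

```python
class StoryActionError(Exception):
    """Stand-in for telemetry's exceptions.StoryActionError."""


class FakeBrowser:
    """Hypothetical browser double: returns a dump id, or None on failure."""

    def __init__(self, dump_id):
        self._dump_id = dump_id

    def DumpMemory(self):
        # Assumption: a browser that cannot produce a memory-infra dump
        # reports that by returning a falsy value.
        return self._dump_id


def measure_memory(browser):
    """Request a memory dump and raise if the browser did not provide one."""
    dump_id = browser.DumpMemory()
    if not dump_id:
        # Same message as in the traceback from the bot.
        raise StoryActionError('Unable to obtain memory dump')
    return dump_id


# A browser that produces a dump succeeds; one that does not raises.
print(measure_memory(FakeBrowser("dump-1")))
try:
    measure_memory(FakeBrowser(None))
except StoryActionError as error:
    print("failed:", error)
```

Under these assumptions the harness only reports that no dump came back, so the cause would have to be on the browser side.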

Examining two uploaded traces:
Ref: https://hjd.users.x20web.corp.google.com/www/2017-06-20-ref-build-not-working-crbug734627/load_news_bbc_ref.html
ToT: https://hjd.users.x20web.corp.google.com/www/2017-06-20-ref-build-not-working-crbug734627/load_news_bbc_tot.html

The ref build trace is missing all information about the GPU process, which seems bad.

Comparing the ref build from a non-memory system health benchmark:
Non-memory ref: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_9-2017-06-20_05-22-43-91896.html

We see that it still has information about the GPU.
This suggests memory-infra is doing something pretty bad to the GPU process :(

On the plus side, whatever the problem is seems to have been fixed (since ToT works).
On the downside, I haven't been able to find a bug that looks like a likely cause.

It seems like we just hit some super edge case bug only on this specific config :(

I think we won't be able to fix this without rolling the builds.


Comment 7 by hjd@chromium.org, Jun 20 2017

Summary: memory-infra never worked on ref build with win-high-dpi bot (was: No ref test for media.tough_video_cases_tbmv2 on win-high-dpi bot)
It's a known issue that ref builds do not work on Windows (for the same reason as https://github.com/catapult-project/catapult/issues/2610).

That's why the benchmark was disabled on reference; and it only got re-enabled when failures on reference builds were treated as non-fatal (orange).

The fix is to update the reference build.
What's the next step here?

Comment 10 by npm@chromium.org, Jan 12 2018

Labels: OS-Windows

Comment 11 by maxlg@chromium.org, Apr 23 2018

Is there any update for this bug?
Cc: -johnchen@chromium.org
