Issue 776096

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: 2017-10-19
OS: Windows
Pri: ----
Type: ----



BattOr attached to build180-b4 needs to be restarted

Project Member Reported by charliea@google.com, Oct 18 2017

Issue description

media.desktop failing on chromium.perf/Win 10 High-DPI Perf

The revision range seems to be 43d5d2c65be39b04649943de8a05e201198da951 (good) to 4efd8b40b79a64112ec87267187e998684e7958a (bad); the benchmark went from passing to 100% failing. I'm going to disable the story and kick off a return code bisect.

Builders failed on: 
- Win 10 High-DPI Perf: 
  https://build.chromium.org/p/chromium.perf/builders/Win%2010%20High-DPI%20Perf



 

Comment 1 by charliea@google.com, Oct 18 2017

Didn't mention: it seems that the failing story is reliably video.html?src=crowd.ogg&type=audio
Cc: sullivan@chromium.org dalecur...@chromium.org
Components: Tests>Telemetry Internals>Media
The problem is with the BattOr:

https://chromium-swarm.appspot.com/task?id=3947927206c4f210&refresh=10&show_raw=1

Traceback (most recent call last):
  RunBenchmark at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\story_runner.py:334
    max_num_values=benchmark.MAX_NUM_VALUES)
  Run at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\story_runner.py:202
    _RunStoryAndProcessErrorIfNeeded(story, results, state, test)
  _RunStoryAndProcessErrorIfNeeded at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\story_runner.py:96
    test.WillRunStory(state.platform)
  WillRunStory at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\web_perf\timeline_based_measurement.py:272
    platform.tracing_controller.StartTracing(self._tbm_options.config)
  StartTracing at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\core\tracing_controller.py:43
    self._tracing_controller_backend.StartTracing(tracing_config, timeout)
  StartTracing at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\platform\tracing_controller_backend.py:91
    started = agent.StartAgentTracing(config, timeout)
  StartAgentTracing at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\platform\tracing_agent\battor_tracing_agent.py:73
    self._battor.StartTracing()
  StartTracing at c:\b\s\w\ir\third_party\catapult\common\battor\battor\battor_wrapper.py:235
    self._SendBattOrCommand(self._START_TRACING_CMD)
  _SendBattOrCommand at c:\b\s\w\ir\third_party\catapult\common\battor\battor\battor_wrapper.py:359
    'Outputted: %s' % (cmd, status))
BattOrError: BattOr did not complete command 'StartTracing' correctly.
Outputted: [1018/023722.617:FATAL:battor_agent_bin.cc(100)] Fatal error when communicating with the BattOr: TOO MANY COMMAND RETRIES

Not sure why disabling our benchmark when the BattOr is broken is considered the right course of action. It would make much more sense to have failures in the BattOr code not fail our whole benchmark (since the other data collected would still be useful). This would be similar to how media.desktop works on devices that don't have BattOrs attached to them.

Comment 4 by charliea@google.com, Oct 18 2017

Summary: media.desktop/video.html?src=crowd.ogg&type=audio failing on chromium.perf/Win 10 High-DPI Perf (was: media.desktop failing on chromium.perf/Win 10 High-DPI Perf)

Comment 5 by charliea@google.com, Oct 18 2017

Components: -Tests>Telemetry -Internals>Media Infra>Labs
NextAction: 2017-10-19
Owner: ----
Summary: BattOr attached to build180-b4 needs to be restarted (was: media.desktop/video.html?src=crowd.ogg&type=audio failing on chromium.perf/Win 10 High-DPI Perf)
Sorry for the churn - I saw other stories failing on media.desktop earlier, then saw this story failing, so assumed that there were other stories in the benchmark that were passing and that this was the only failing story. Disabling this definitely wasn't the correct course of action in this case.

I'm working on making Telemetry work in the way that you described (the entire story doesn't fail if the BattOr fails, just some metrics aren't reported). You're definitely right - that's the way it should work.
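As a rough illustration of that behavior (not the actual Telemetry change), the sketch below starts a list of tracing agents and treats a failure in an agent marked optional as "skip its metrics" rather than "fail the story". Everything here is hypothetical: `FakeAgent`, `start_agents`, and `optional_names` are made-up names, and `BattOrError` is a local stand-in for the exception in the traceback above.

```python
class BattOrError(Exception):
    """Local stand-in for the BattOrError raised in the traceback above."""


class FakeAgent(object):
    """Hypothetical tracing agent used only for this illustration."""

    def __init__(self, name, fail=False):
        self.name = name
        self._fail = fail

    def StartAgentTracing(self):
        if self._fail:
            raise BattOrError('TOO MANY COMMAND RETRIES')
        return True


def start_agents(agents, optional_names):
    """Start every agent, tolerating failures of agents listed as optional.

    Returns a (started, skipped) pair of agent-name lists. A failure in a
    non-optional agent is re-raised, so the story still fails in that case.
    """
    started, skipped = [], []
    for agent in agents:
        try:
            agent.StartAgentTracing()
        except BattOrError:
            if agent.name not in optional_names:
                raise
            # Assumption: metrics from a skipped agent are simply not reported.
            skipped.append(agent.name)
        else:
            started.append(agent.name)
    return started, skipped


agents = [FakeAgent('chrome'), FakeAgent('battor', fail=True)]
started, skipped = start_agents(agents, optional_names={'battor'})
print(started, skipped)
```

With this shape, a broken BattOr would cost only the power metrics for the run, while the story and its other metrics would still be recorded.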

Meanwhile, I'm going to kick this over to Infra labs. 

Infra>Labs, could you please restart the BattOr attached to build180-b4?

Comment 6 by charliea@google.com, Oct 18 2017

Labels: OS-Windows

Comment 7 by jo...@google.com, Oct 18 2017

Owner: jo...@chromium.org
Status: Assigned (was: Available)
Project Member

Comment 8 by 42576172...@developer.gserviceaccount.com, Oct 18 2017


=== BISECT JOB RESULTS ===
NO Test failure found

Bisect Details
  Configuration: winx64_high_dpi_perf_bisect
  Benchmark    : media.desktop
  Metric       : cpu_time_percentage_avg/video.html?src_crowd.ogg_type_audio

Revision             Exit Code      N
chromium@508967      0 +- N/A       2      good
chromium@508991      0 +- N/A       2      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=video.html.src.crowd.ogg.type.audio media.desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8965364701067099232


For feedback, file a bug with component Speed>Bisection

Comment 9 by jo...@google.com, Oct 18 2017

BattOr restarted, and the indicator light looks normal again.
Status: Fixed (was: Assigned)
Thanks John!
The NextAction date has arrived: 2017-10-19
