media.tough_video_cases_tbmv2 failing on 2 builders
Issue description

media.tough_video_cases_tbmv2 failing on 2 builders

Builders failed on:
- Mac Air 10.11 Perf: https://build.chromium.org/p/chromium.perf/builders/Mac%20Air%2010.11%20Perf
- Mac Retina Perf: https://build.chromium.org/p/chromium.perf/builders/Mac%20Retina%20Perf

This is failing fairly consistently. This log is present fairly consistently:

(ERROR) 2017-08-03 00:10:49,030 battor_wrapper._FlashBattOr:175 Git hash returned from BattOr was not as expected: [0803/001049.029353:FATAL:battor_agent_bin.cc(91)] Fatal error when communicating with the BattOr: TOO MANY COMMAND RETRIES
Traceback (most recent call last):
  File "/b/s/w/ir/third_party/catapult/common/battor/battor/battor_wrapper.py", line 162, in _FlashBattOr
    device_git_hash = self.GetFirmwareGitHash()
  File "/b/s/w/ir/third_party/catapult/common/battor/battor/battor_wrapper.py", line 392, in GetFirmwareGitHash
    int(self._git_hash, 16)
ValueError: invalid literal for int() with base 16: '[0803/001049.029353:FATAL:battor_agent_bin.cc(91)] Fatal error when communicating with the BattOr: TOO MANY COMMAND RETRIES'

charliea@, could this be causing the test to fail? Can you take a look? Will disable this test on mac.
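For reference, the ValueError in the traceback above is what Python raises when a non-hexadecimal string is passed to int() with base 16. Here the "hash" was actually the agent's fatal error text. The snippet below is only a minimal sketch of that failure and of one defensive way to check the string first. It is not the real battor_wrapper.py code, and parse_git_hash is a hypothetical helper name made up for illustration.

```python
import re

# Accept only strings made up entirely of hex digits.
_HEX_HASH = re.compile(r'^[0-9a-fA-F]+$')


def parse_git_hash(raw):
    """Hypothetical helper: returns the hash if it is valid hex, else None."""
    candidate = raw.strip()
    if _HEX_HASH.match(candidate):
        return candidate
    return None


# The text that was returned in place of a firmware git hash in the log.
error_text = ('[0803/001049.029353:FATAL:battor_agent_bin.cc(91)] Fatal error '
              'when communicating with the BattOr: TOO MANY COMMAND RETRIES')

try:
    int(error_text, 16)
    raised = False
except ValueError:
    # Same exception type as in the traceback.
    raised = True
```

With a check like this, a caller could report "the BattOr did not return a hash" rather than surfacing a parsing error. Whether that is the right behavior for the real wrapper is a separate question.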
,
Aug 3 2017
Looking at the logs, it looks like the whole suite is failing. That test is just the first test to try to run. We're better off asking +Vince if something is wrong with the BattOr on that device.
,
Aug 3 2017
Ah ok, I thought there was only one story the benchmark ran. My bad. Vince, can someone from labs look at the bot to see if the battor is messed up or something?
,
Aug 4 2017
Sorry for the delay: I'm baffled about why the serial logs aren't getting uploaded here. Adding Infra>Labs: could you please take a look at the BattOr on this machine and let us know what the blinking pattern is? Could you also please unplug and replace the power cable running to it?
,
Aug 4 2017
LED pattern was solid red (no blinks) on build127-b1. Reset the power connections to the battor and it's blinking orange now. Do you still need the power cable replaced? If so, are we talking about the adapter side, or the laptop side? (I don't see any spares in the vicinity, at first glance). Thanks.
,
Aug 4 2017
After looking more closely, it looks like what happened is that:

1) Telemetry tried to flash the BattOr, but was unsuccessful for some reason.

2) While trying to recover from the failed flash, it tried to stop the BattOr shell (https://cs.chromium.org/chromium/src/third_party/catapult/common/battor/battor/battor_wrapper.py?type=cs&q=%22git+hash+returned%22&sq=package:chromium&l=176). This first tries to stop the shell gracefully if it's still running, but falls back to a simple kill if it times out trying to stop the shell gracefully.

3) The shell was still running, so it tried to stop it gracefully. However, there was a race condition between checking if the shell was still running and actually gracefully requesting the shutdown, and in the meanwhile, the shell died. Because we only catch TimeoutExceptions when trying to gracefully kill the shell, the "broken pipe" exception that signaled the failed shutdown propagated outwards, killing Telemetry and doing so without ever uploading the serial logs to cloud storage.

It's hard to say exactly what happened to the BattOr to cause the initial failures without the serial logs, but a clear outcome of this is that we need to catch *all* exceptions during StopShell(), not just TimeoutExceptions.
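The proposed fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual catapult change: FakeShell, ShellTimeoutError, request_exit and stop_shell are hypothetical names standing in for the real shell subprocess and StopShell() logic. The point is that any failure of the graceful path, including a broken pipe from a shell that already died, falls through to the kill path instead of escaping.

```python
import logging


class ShellTimeoutError(Exception):
    """Hypothetical stand-in for the timeout raised by a graceful stop."""


class FakeShell(object):
    """Hypothetical stand-in for the BattOr shell subprocess."""

    def __init__(self, exit_error=None):
        self._exit_error = exit_error
        self.killed = False

    def request_exit(self):
        # Simulates asking the shell to exit. If the shell has already
        # died, the write fails (for example with a broken pipe error).
        if self._exit_error is not None:
            raise self._exit_error

    def kill(self):
        self.killed = True


def stop_shell(shell):
    """Stops the shell gracefully, falling back to kill on ANY failure."""
    try:
        shell.request_exit()
    except Exception:  # Deliberately broader than ShellTimeoutError.
        logging.exception('Graceful shutdown failed; killing the shell.')
        shell.kill()
```

Catching Exception rather than only the timeout is a deliberate trade-off here: the shell is being torn down anyway, so the priority is letting the caller carry on to later cleanup steps such as uploading the serial logs.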
,
Aug 4 2017
Thanks! Sorry, my wording was terrible. I meant "unplug and replug"... not "unplug and replace". What you did was exactly what I was hoping for :-)
,
Aug 4 2017
,
Aug 4 2017
Setting NextAction until Monday, when we can check if this is working yet.
,
Aug 5 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/720c8ddf5c5f442062ec595c02e644e631f6ed33

commit 720c8ddf5c5f442062ec595c02e644e631f6ed33
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Sat Aug 05 06:08:48 2017

Roll src/third_party/catapult/ 0fb50e3f8..33a9271eb (4 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/0fb50e3f84ef..33a9271eb3cf

$ git log 0fb50e3f8..33a9271eb --date=short --no-merges --format='%ad %ae %s'
2017-08-04 achuith Disable testSmokeStartingWebPageReplayGoServer on chromeos.
2017-08-04 nednguyen Add markdown version of run_telemetry_tests documentation
2017-08-04 charliea Catch all BattOr shell graceful shutdown failures
2017-08-04 xunjieli [wpr-go] Update README

Created with:
roll-dep src/third_party/catapult
BUG=750323, 752270

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: Ia77bd543b748caad646d357f7cedc4724604fe6a
Reviewed-on: https://chromium-review.googlesource.com/602745
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#492231}

[modify] https://crrev.com/720c8ddf5c5f442062ec595c02e644e631f6ed33/DEPS
,
Aug 7 2017
The NextAction date has arrived: 2017-08-07
,
Aug 21 2017
Comment 1 by crouleau@chromium.org, Aug 3 2017