
Issue 732532


Issue metadata

Status: Fixed
Owner:
Closed: Jun 2017
Cc:
Components:
EstimatedDays: ----
NextAction: 2017-06-24
OS: Mac
Pri: 1
Type: Bug




battor.steady_state failing on chromium.perf/Mac Retina Perf

Project Member Reported by zh...@chromium.org, Jun 12 2017

Issue description

battor.steady_state failing on chromium.perf/Mac Retina Perf

Builders failed on: 
- Mac Retina Perf: 
  https://build.chromium.org/p/chromium.perf/builders/Mac%20Retina%20Perf


Charlie, there is no owner listed for this benchmark in the OWNERS file, so I will just cc you.
 

Comment 2 by 42576172...@developer.gserviceaccount.com, Jun 12 2017


=== BISECT JOB RESULTS ===
Bisect failed for unknown reasons

Please contact the team (see below) and report the error.


Bisect Details
  Configuration: mac_retina_perf_bisect
  Benchmark    : battor.steady_state
  Metric       : benchmark_duration/benchmark_duration


To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests battor.steady_state

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8976959724680786496

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5305799842201600


| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection.  Thank you!

Comment 4 by 42576172...@developer.gserviceaccount.com, Jun 13 2017


=== BISECT JOB RESULTS ===
NO Test failure found

Bisect Details
  Configuration: mac_retina_perf_bisect
  Benchmark    : battor.steady_state
  Metric       : benchmark_duration/benchmark_duration

Revision             Exit Code      N
chromium@477877      0 +- N/A       5      good
chromium@477931      0 +- N/A       5      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests battor.steady_state

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8976951494630046080

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5305799842201600



Comment 5 by zh...@chromium.org, Jun 13 2017

Cc: simonhatch@chromium.org
Simon, do you have any clue why this bisect does not work?
Cc: dtu@chromium.org
+dtu for step question

I *think* something is causing the test to either take far too long or never finish. The swarming task seems to take over an hour, and the bisect in #c2 also takes over an hour on one of the steps before the bot goes purple. I vaguely recall a 1-hour limit on steps that produce no output. Dave, can you confirm?
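
For readers unfamiliar with a "no output" step limit, here is a minimal Python sketch of the general idea: run a command and kill it if it prints nothing for longer than a set interval. This is not the buildbot or swarming implementation; the function name `run_with_output_timeout` and its parameters are hypothetical and chosen only for illustration.

```python
import subprocess
import sys
import threading
import time


def run_with_output_timeout(cmd, no_output_timeout_s):
    """Run cmd and kill it if it is silent for no_output_timeout_s seconds.

    Hypothetical helper for illustration only.
    Returns (returncode, output_lines, timed_out).
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
    lines = []
    last_output = [time.monotonic()]  # list so the reader thread can update it

    def reader():
        # Every line of output resets the silence timer.
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
            last_output[0] = time.monotonic()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    timed_out = False
    while proc.poll() is None:
        if time.monotonic() - last_output[0] > no_output_timeout_s:
            timed_out = True
            proc.kill()
            break
        time.sleep(0.05)

    proc.wait()
    thread.join(timeout=5)
    return proc.returncode, lines, timed_out


# Example: a child process that stays silent past the limit gets killed.
returncode, lines, timed_out = run_with_output_timeout(
    [sys.executable, "-c", "import time; time.sleep(30)"],
    no_output_timeout_s=1.0)
print(timed_out)
```

With a limit like this in place, a benchmark that hangs on a wedged device would show up as a killed step rather than as a test failure, which may be why the bisect reports no useful result.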
Now that I understand swarming a bit better, it looks like the BattOr on 'build30-b4' is in some sort of wedged state (and has been for a few days since we tried to speed up traces). All BattOr tests are failing on this bot, including the system_health.common_desktop, battor.steady_state, and media.tough_video_cases benchmarks.

Can we have someone from the labs team take a look at the BattOr on build30-b4 and report what its LED state is? We can also bring this back to life by pulling the plug on it and restarting it.
Yes, looking back it looks like this BattOr got wedged on June 7th (see attached screenshot).
Attachment: Screen Shot 2017-06-17 at 9.48.14 AM.png (346 KB)
This appears to be caused by a fault in the SD card that led to the BattOr being wedged. This is a BattOr firmware issue and it is being handled in https://github.com/aschulm/battor/issues/26.

However, for the time being, someone on the labs team should reboot the wedged BattOr on build30-b4.
Cc: benhenry@chromium.org
Added Ben to this bug. Ben, can you ask the Infra folks to power cycle the BattOr on build30-b4 and to pull out and reinsert the SD card? This should fix the issue; if not, we may need to treat this as a failed SD card.
Components: Infra>Labs
Status: Untriaged (was: Available)
Putting it in the queue. Vince - check out comment #10 before finding someone to help.
Cc: aschulman@chromium.org
NextAction: 2017-06-24
Owner: rnep...@chromium.org
Status: Assigned (was: Untriaged)
Aaron recommended updating to BattOr firmware version edd3e4cc7ecdc2fdfdc7d3d739292e6f4533c944, which should fix this. Randy said that he'll go ahead and do this.
The NextAction date has arrived: 2017-06-24
/ping rnephew@
aschulman@, just to confirm: that BattOr firmware improvement fixes the timeout, but we should still expect SD cards to break, correct? What will we see from our end when this happens? Will anything be written to the serial connection?
Cc: nedngu...@google.com
Issue 736120 has been merged into this issue.
Owner: ----
Status: Untriaged (was: Assigned)
Can someone on the labs team please turn off the BattOr, open it up, remove the SD card, put the SD card back in, close it back up, and plug it back in? aschulman@ says this should fix the issue.
Can we wait on that if just updating the firmware will fix it? I am uploading the CL now. It would be nice to use this to confirm the fix.
Unfortunately, this BattOr is wedged, so the firmware cannot be updated. This fix will make it so the BattOr does not wedge itself like this again (even if the SD card fails like it did in this case).

Someone will need to pull the power to unwedge the BattOr. There is a chance this will also reset. As Charlie said, I also recommend reseating the SD card to make sure that was not the source of the problem.

This is the first error of this type that we have seen, so we will have to take these steps and see whether it happens again.
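
The thread does not say how the firmware change avoids the wedge, so the following is only an illustration of one common approach: put a deadline on every wait for a storage device so that a faulty card produces an error instead of a hang. It is a Python sketch, not BattOr firmware code, and `StorageTimeout`, `wait_until_ready`, and `is_ready` are hypothetical names.

```python
import time


class StorageTimeout(Exception):
    """Raised when the (hypothetical) storage device never becomes ready."""


def wait_until_ready(is_ready, timeout_s, poll_interval_s=0.01):
    """Poll is_ready() until it returns True or timeout_s elapses.

    An unbounded loop here would hang forever on a faulty card; the
    deadline turns that hang into an error the caller can report.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return
        if time.monotonic() >= deadline:
            raise StorageTimeout(
                "storage not ready after %.2f seconds" % timeout_s)
        time.sleep(poll_interval_s)


# Example: a device that never reports ready raises instead of hanging.
try:
    wait_until_ready(lambda: False, timeout_s=0.1)
    gave_up = False
except StorageTimeout:
    gave_up = True
print(gave_up)
```

A caller that catches the timeout can report the fault to the host rather than going silent, which is the kind of behavior the question above about the serial connection is asking about.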
Cc: jo...@chromium.org pschmidt@chromium.org
Labels: Type-Bug
Vince is out for the next couple of weeks. Adding some lab folks to see if they can help us pull the BattOr. Since none of you are in MTV, who can help me with this, and/or send me instructions on reflashing the SD card so I can take care of this?
afaik pschmidt has done this before. No flashing needed, you just need to pull the two power cables, unscrew the plastic enclosure, slide out the SD card, slide it back in, and plug the power cables back in.
Project Member

Comment 23 by bugdroid1@chromium.org, Jun 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/dd88b3fc27a12abff12c6c8f7797501ad85c2264

commit dd88b3fc27a12abff12c6c8f7797501ad85c2264
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Mon Jun 26 18:44:18 2017

Roll src/third_party/catapult/ 6af9db22b..3db1a306c (1 commit)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/6af9db22b57c..3db1a306cf17

$ git log 6af9db22b..3db1a306c --date=short --no-merges --format='%ad %ae %s'
2017-06-26 rnephew [BattOr] Update BattOr firmware in cloud storage.

Created with:
  roll-dep src/third_party/catapult
BUG= 732532 


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: Ic961d47aa7104a53572680484426fdbdb9739a44
Reviewed-on: https://chromium-review.googlesource.com/548300
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#482338}
[modify] https://crrev.com/dd88b3fc27a12abff12c6c8f7797501ad85c2264/DEPS

I was ooto yesterday.  If you still want me to do something here please feel free to assign it my way.
Owner: pschm...@google.com
Peter, the BattOr on build30-b4 is wedged. Can you unplug and replug both of the power cables to the BattOr? Let's see if that unwedges it.
Done, the BattOr was power cycled.
Status: Fixed (was: Untriaged)
It's working again now. Thanks Peter! Hopefully the new firmware will prevent this issue from reoccurring.
