Issue 672631

Issue metadata

Status: Fixed
Owner:
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 1
Type: Bug

Blocking:
issue 671768

BattOrs reset when they're sent StartTracing command

Project Member Reported by charliea@chromium.org, Dec 8 2016

Issue description

There's a known problem with the latest version of the BattOr firmware where, after a successful initialization sequence, BattOrs will reset when they're sent the START_TRACING command. When they reset, they emit "0x00" to the serial connection.

We're awaiting a new firmware version that fixes the problem, although the firmware developers are having trouble reproducing it. In the meantime, we should take steps on our end to mitigate it, as it's causing a lot of flakiness on the waterfall.

I'm going to mark this as P1 until a short-term workaround is implemented, at which point I'll downgrade it to P2.

Comment 1 by bugdroid1@chromium.org, Dec 9 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ee8618f9e765917145cc20b136cb6a3ba324db53

commit ee8618f9e765917145cc20b136cb6a3ba324db53
Author: charliea <charliea@chromium.org>
Date: Fri Dec 09 15:43:55 2016

Make the BattOr agent retry start tracing when we fail to read an ack

This should greatly mitigate a current bug in the BattOr firmware where
sending the command to start tracing causes the BattOr to reset. In this
case, we now retry the full initialization sequence. Because the error
we're hitting is a timeout, and we allow 4 seconds for a command to
be executed before timing out, this means that initialization will
retry after 4 seconds and then make 20 attempts to initialize the BattOr
100 milliseconds apart from each other, making the whole sequence take
something like 6 additional seconds. We can retry this full sequence
up to 5 times. When added to the five or so seconds that the BattOr
already takes to initialize, this means that this full sequence might
take up to 11 * 5 = 55 seconds before deciding that the BattOr can't
be initialized. In practice, I suspect that almost all initializations
will succeed on either their first or second attempts.

BUG= 672631 

Review-Url: https://codereview.chromium.org/2563033002
Cr-Commit-Position: refs/heads/master@{#437548}

[modify] https://crrev.com/ee8618f9e765917145cc20b136cb6a3ba324db53/tools/battor_agent/battor_agent.cc
[modify] https://crrev.com/ee8618f9e765917145cc20b136cb6a3ba324db53/tools/battor_agent/battor_agent.h
[modify] https://crrev.com/ee8618f9e765917145cc20b136cb6a3ba324db53/tools/battor_agent/battor_agent_unittest.cc
[modify] https://crrev.com/ee8618f9e765917145cc20b136cb6a3ba324db53/tools/battor_agent/battor_error.cc
[modify] https://crrev.com/ee8618f9e765917145cc20b136cb6a3ba324db53/tools/battor_agent/battor_error.h
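
For readers following the timing argument in the commit message above, here is a minimal sketch of the retry budget and of a retry loop around the full initialization sequence. It is not the code from battor_agent.cc: every name below (kCommandTimeout, StartTracingWithRetry, and so on) is hypothetical, the loop is synchronous for brevity, and the real agent may be structured quite differently. Only the numeric values (4 seconds, 20 attempts, 100 milliseconds, roughly 5 seconds, 5 tries) come from the commit message.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// All names are hypothetical; only the values come from the commit message.
constexpr std::chrono::milliseconds kCommandTimeout{4000};   // wait for the ack
constexpr int kInitAttempts = 20;                            // re-init attempts
constexpr std::chrono::milliseconds kInitAttemptDelay{100};  // gap between them
constexpr std::chrono::milliseconds kBaselineInit{5000};     // "five or so seconds"
constexpr int kMaxStartTracingAttempts = 5;                  // full-sequence tries

// Extra time one failed start-tracing attempt adds: the ack timeout plus the
// spaced-out re-initialization attempts (4 s + 20 * 100 ms = 6 s).
constexpr std::chrono::milliseconds RetryOverhead() {
  return kCommandTimeout + kInitAttempts * kInitAttemptDelay;
}

// Upper bound used in the commit message: (5 s + 6 s) * 5 = 55 s.
constexpr std::chrono::milliseconds WorstCaseTotal() {
  return kMaxStartTracingAttempts * (kBaselineInit + RetryOverhead());
}

static_assert(RetryOverhead() == std::chrono::seconds(6), "4 s + 2 s");
static_assert(WorstCaseTotal() == std::chrono::seconds(55), "11 s * 5");

// Runs the full sequence (initialize, then send start tracing) until the
// start-tracing ack is read or the attempt limit is reached.
// `init` returns true if initialization succeeded; `send_start_tracing`
// returns true if the ack arrived before the timeout.
// Returns the 1-based attempt that succeeded, or 0 if every attempt failed.
int StartTracingWithRetry(const std::function<bool()>& init,
                          const std::function<bool()>& send_start_tracing) {
  for (int attempt = 1; attempt <= kMaxStartTracingAttempts; ++attempt) {
    if (init() && send_start_tracing()) {
      return attempt;
    }
    // A missing ack is treated as "the device probably reset", so the next
    // iteration starts again from initialization instead of resending only
    // the start-tracing command.
  }
  return 0;
}
```

The point of restarting from initialization rather than resending the command alone is that, per the issue description, a BattOr that has reset needs to be initialized again before it can trace.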

Comment 2 by bugdroid1@chromium.org, Dec 10 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9f0a98a3c001f7602338311a4765eaac0bed4376

commit 9f0a98a3c001f7602338311a4765eaac0bed4376
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Sat Dec 10 04:02:47 2016

Roll src/third_party/catapult/ 707aaac64..19565fdb1 (2 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/707aaac64b3c..19565fdb148a

$ git log 707aaac64..19565fdb1 --date=short --no-merges --format='%ad %ae %s'
2016-12-09 nednguyen Revert of re-enabled netlog viewer dev server tests (patchset #2 id:20001 of https://codereview.chromium.org/2553213002/ )
2016-12-09 charliea Roll battor_agent_bin to include new init retry logic

BUG= 672631 

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2567543003
Cr-Commit-Position: refs/heads/master@{#437746}

[modify] https://crrev.com/9f0a98a3c001f7602338311a4765eaac0bed4376/DEPS

Issue 650426 has been merged into this issue.
Yes, those appear to be related to this.

Comment 6 by eyaich@chromium.org, Dec 20 2016

Blocking: 671768
Status: Fixed (was: Assigned)
Marking this as fixed now that we have the init retry workaround logic in place.