Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocked on:
issue 819882




coral-release fails HWTests - DUTs look really unstable

Project Member Reported by norvez@chromium.org, Feb 23

Issue description

https://luci-milo.appspot.com/buildbot/chromeos/coral-release/

The failures are not always the same (sometimes provisioning, sometimes instability in the tests), but they are so frequent that this is probably a symptom of a generic instability.

Various failures to provision (device not coming back up from reboot, not rebooting, ...):
https://luci-milo.appspot.com/buildbot/chromeos/coral-release/779

Some provisioning errors, mixed with very flaky tests:
https://luci-milo.appspot.com/buildbot/chromeos/coral-release/778

Assigning to vineeths@ who's agreed to help find an owner
 
Cc: nsanders@chromium.org
Owner: shapiroc@chromium.org
Assigning to Charles.

Looks like Coral has been red for some time now, possibly because Coral needs many devices to pass for a green, as opposed to other boards, which require only one.

For example, Reks looks much better:

https://luci-milo.appspot.com/buildbot/chromeos/reks-release/#
Owner: akes...@chromium.org
I've never found anything that wasn't infra related.

Since coral runs 13 models, its probability of getting hit by transient failures goes up by 13x.

coral seems to fail provisioning at a pretty high rate (6-7%). Compounded with the fact that we run so many models, this means that a lot of runs are going to be knocked out by provisioning failures alone. I'll do some more digging here.
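To put a rough number on the compounding: a minimal sketch, assuming each of the 13 models is provisioned once per run, that failures are independent, and that a single failed provision is enough to knock out the run. None of these assumptions is confirmed by the builder logs, and the function name is made up for illustration.

```python
def run_failure_probability(per_model_rate: float, models: int) -> float:
    """Chance that at least one of `models` independent provisions fails.

    Assumption: failures are independent and one failure fails the run.
    """
    return 1.0 - (1.0 - per_model_rate) ** models


# 6-7% is the provisioning failure rate quoted above; 13 is the model count.
for rate in (0.06, 0.07):
    p = run_failure_probability(rate, 13)
    print(f"per-model rate {rate:.0%}: run knocked out with probability {p:.0%}")
```

Under those assumptions this comes out to roughly 55% at a 6% per-model rate and 61% at 7%, so more than half of the runs would go red from provisioning alone. Correlated failures (for example, a lab-wide outage) would change the picture.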
Not yet, I've been sidetracked by last-minute modemfwd changes. I'll do more investigation tomorrow.
Blockedon: 819882
Components: Infra>Client>ChromeOS>Test
Cc: vsu...@chromium.org pmanavalan@chromium.org kkan...@chromium.org rohi...@chromium.org akes...@chromium.org dhadd...@chromium.org
Issue 826903 has been merged into this issue.
Coral is still failing pretty widely.

Is there a way forward that isn't blocked on the logging feature request at Issue 819882?
Cc: vbendeb@chromium.org mruthven@chromium.org
Are there any logs of specific machines failing? I see mention of Cr50 failure on the other bug.

A Cr50 update will be rejected by rollback protection if it is asked to downrev (for example, if you test a build from master and then try to test an M65 build). I'd expect this to fail gracefully, but maybe there's a reporting bug there. Coral is one of the first devices to see regular Cr50 updates, so it's plausible that this could be a factor.
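If per-DUT build history is available, one way to check this theory is to look for a downrev request in the sequence of builds a device was asked to run. A minimal sketch follows; the function name, the history format, and the version tuples are all hypothetical, and it assumes rollback protection simply rejects any version lower than the highest one already installed.

```python
from typing import List, Tuple

Version = Tuple[int, ...]


def find_downrev_requests(history: List[Tuple[str, Version]]) -> List[str]:
    """Return labels of builds that request a version below the highest seen so far.

    Assumption: a rejected downrev leaves the installed version unchanged.
    """
    rejected = []
    highest: Version = ()  # the empty tuple sorts below any real version
    for label, version in history:
        if version < highest:
            rejected.append(label)
        else:
            highest = version
    return rejected


# Made-up version numbers: a master build, then an M65 build, then master again.
history = [("master", (0, 3, 4)), ("M65", (0, 3, 2)), ("master", (0, 3, 4))]
print(find_downrev_requests(history))
```

Any label this returns marks a provision where a Cr50 downrev would have been attempted, which could then be matched against the failing runs.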
