
Issue 848545

Starred by 2 users

Issue metadata

Status: Assigned
OS: Chrome
Pri: 2
Type: Bug


Participants' hotlists:
SIE-infra-request



autotest: hardware validation test on DUT failure

Project Member Reported by nsanders@chromium.org, Jun 1 2018

Issue description

DUT hardware sometimes fails, and each failure currently causes tree flakes until someone diagnoses it manually.

We should have a hardware diagnostic that runs before or after repair to check for common problems, including battery failure, power failure, port failure, storage failure, memory failure, missing components, thermal failure, etc.
 
Components: Infra>Client>ChromeOS>Test
Cc: matth...@chromium.org
Owner: nsanders@chromium.org
> [ ... ] a hardware diagnostic that runs before or after repair [ ... ]

Doing the work _during_ the repair task might be a better choice, as a
way to simplify the scheduler.

However, doing the work only during repair might not be the most
effective choice.  Running repair doesn't by itself mean that the
hardware is suspect.  Also, there's no guarantee that devices with hardware
problems will predictably wind up in repair tasks; they may simply fail
certain tests while reliably passing all verification checks.

An alternative proposal for how to schedule the diagnostics is here:
    https://docs.google.com/document/d/1zIvIqwRbRtF2HP2a9pPti6dXMeq5ejHviFxsMsp_SWw/edit#heading=h.bfnmwg8natdi

The basic idea there is that every DUT should run the hardware
diagnostic tests, and should re-run them whenever test results seem
too stale.
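
As a rough illustration of the "too stale" rule, here is a minimal sketch in Python. The names `needs_diagnostics` and `MAX_DIAGNOSTIC_AGE`, and the seven-day threshold, are assumptions made up for this example; neither this issue nor the linked doc is being quoted here, and the sketch does not use any real autotest API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold: the issue does not say how old counts as "too stale".
MAX_DIAGNOSTIC_AGE = timedelta(days=7)


def needs_diagnostics(
    last_run: Optional[datetime],
    now: datetime,
    max_age: timedelta = MAX_DIAGNOSTIC_AGE,
) -> bool:
    """Return True if a DUT should (re-)run its hardware diagnostics.

    A DUT with no recorded run always needs diagnostics; otherwise it
    needs them once the last recorded run is older than max_age.
    """
    if last_run is None:
        return True
    return now - last_run > max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(needs_diagnostics(None, now))                      # never run
    print(needs_diagnostics(now - timedelta(days=1), now))   # recent run
    print(needs_diagnostics(now - timedelta(days=30), now))  # old run
```

A check like this could be evaluated by whatever schedules DUT work, independent of whether the DUT ever enters a repair task.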
More broadly, we should probably split this work into two loosely coupled pieces:
 1) Write (and maintain) diagnostics for Chrome hardware.
 2) Set up a system that ensures that we run the diagnostics, and that
    failures are reported and acted on.
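
One way to keep the two pieces loosely coupled is sketched below in Python. Everything here is hypothetical: `DiagnosticResult`, the `diagnostic` decorator, `run_all`, and the placeholder battery check are invented for illustration and do not correspond to existing autotest code. The checks (piece 1) only register themselves and return a result; the runner (piece 2) knows nothing about individual checks beyond that interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DiagnosticResult:
    name: str
    passed: bool
    detail: str = ""


# Piece 1: the diagnostics themselves, registered by name.
DIAGNOSTICS: Dict[str, Callable[[], DiagnosticResult]] = {}


def diagnostic(name: str):
    """Register a zero-argument check function under the given name."""
    def register(func: Callable[[], DiagnosticResult]):
        DIAGNOSTICS[name] = func
        return func
    return register


@diagnostic("battery")
def check_battery() -> DiagnosticResult:
    # Placeholder only: a real check would probe the DUT's hardware.
    return DiagnosticResult("battery", True, "placeholder: no real probe")


# Piece 2: run every registered check and collect what needs attention.
def run_all(
    checks: Dict[str, Callable[[], DiagnosticResult]],
) -> List[DiagnosticResult]:
    results = []
    for name, check in checks.items():
        try:
            results.append(check())
        except Exception as exc:
            # A crashing check is recorded as a failure, not silently dropped.
            results.append(DiagnosticResult(name, False, f"check raised: {exc!r}"))
    return results


def failures(results: List[DiagnosticResult]) -> List[DiagnosticResult]:
    """Return only the results that should be reported and acted on."""
    return [r for r in results if not r.passed]
```

With this split, a deputy could also call `run_all(DIAGNOSTICS)` by hand on a suspect DUT, before any scheduling system exists.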

Status: Assigned (was: Untriaged)
The first step here is to build out such diagnostics, if they don't exist already.  This would be helpful even if the deputy needed to run them manually to check if a DUT is problematic.

nsanders, are you the right person to move this forward?
Yes, I think so.
Cc: englab-sys-cros@google.com
