
Issue 839430

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Feature




Produce and track metrics relevant to lab SLO

Reported by jrbarnette@chromium.org, May 3 2018

Issue description

In order to create a meaningful SLO for the devices in the CrOS
test lab, we need to track some specific metrics relevant to the
managed inventory:
  * Total availability - The number of DUTs in each of the following
    states: working, broken, idle.
  * Working spares - For a given model, the number of working DUTs
    minus the total size of all critical pools for the model.
  * Spares buffer - For a given model, the size of the 'suites' pool.
  * Time spent broken - The time (as a distribution) that all "broken"
    DUTs have spent in that state.
  * Time spent idle - The time (as a distribution) that all "idle" DUTs
    have spent in that state.
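
The definitions above could be sketched roughly as follows. This is only an illustration, not the inventory script's actual code: the `Dut` record, its field names, and the `critical_pools` argument are hypothetical stand-ins for whatever the real database and API provide.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Dut:
    """Hypothetical inventory record; field names are assumptions."""
    hostname: str
    model: str
    pool: str
    state: str             # 'working', 'broken', or 'idle'
    state_since: datetime  # when the DUT entered its current state


def total_availability(duts):
    """Number of DUTs in each of the three states."""
    counts = Counter(d.state for d in duts)
    return {s: counts.get(s, 0) for s in ('working', 'broken', 'idle')}


def working_spares(duts, model, critical_pools):
    """Working DUTs of `model` minus the total size of its critical pools.

    `critical_pools` is a caller-supplied set of pool names (assumption).
    """
    of_model = [d for d in duts if d.model == model]
    working = sum(1 for d in of_model if d.state == 'working')
    critical_size = sum(1 for d in of_model if d.pool in critical_pools)
    return working - critical_size


def spares_buffer(duts, model):
    """Size of the 'suites' pool for `model`."""
    return sum(1 for d in duts if d.model == model and d.pool == 'suites')


def time_in_state(duts, state, now):
    """Sorted durations that DUTs currently in `state` have spent there.

    The sorted list is the raw sample for a distribution; bucketing is
    left to whatever metrics backend is used.
    """
    return sorted(now - d.state_since for d in duts if d.state == state)
```

A negative result from `working_spares` would mean the model does not have enough working DUTs to fill its critical pools.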

Currently, the regular inventory e-mail generates the first three metrics
twice a day, as part of the "model inventory" runs.  The "time spent broken"
and "time spent idle" metrics can be easily calculated.

Ideally, there might be a "time spent working" distribution as well.
However, within the current database and API, calculating "time spent
working" is hard, and there's no identified requirement for the information.

 
Labels: -Type-Bug Type-Feature

Comment 2 by nxia@chromium.org, May 3 2018

Components: -Infra>Client>ChromeOS Infra>Client>ChromeOS>Test
Owner: jrbarnette@chromium.org
Status: Assigned (was: Untriaged)
Richard, can you sketch out the overall design?  What metrics, what alerts, what dashboards, etc.
