Produce and track metrics relevant to lab SLO
Reported by jrbarnette@chromium.org, May 3 2018
Issue description
In order to create a meaningful SLO for the devices in the CrOS
test lab, we need to track some specific metrics relevant to the
managed inventory:
* Total availability - The number of DUTs in each of the following
states: working, broken, idle.
* Working spares - For a given model, the number of working DUTs
minus the total size of all critical pools for the model.
* Spares buffer - For a given model, the size of the 'suites' pool.
* Time spent broken - The time (as a distribution) that all "broken"
DUTs have spent in that state.
* Time spent idle - The time (as a distribution) that all "idle" DUTs
have spent in that state.
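
As a rough illustration of the three count-based metrics above, here is a
minimal sketch. The `Dut` record, its field names, and the pool names passed
as `critical_pools` are assumptions made for the example, not the lab's
actual schema or API. It also assumes that "working DUTs" for a model are
counted across all of that model's pools.

```python
from collections import Counter
from dataclasses import dataclass

STATES = ("working", "broken", "idle")
SPARES_POOL = "suites"  # pool named in the issue description


@dataclass(frozen=True)
class Dut:
    """Hypothetical inventory record; field names are illustrative only."""
    hostname: str
    model: str
    pool: str
    state: str  # one of STATES


def total_availability(duts):
    """Count DUTs in each of the working/broken/idle states."""
    counts = Counter(d.state for d in duts)
    return {state: counts.get(state, 0) for state in STATES}


def working_spares(duts, model, critical_pools):
    """Working DUTs of a model minus the total size of its critical pools."""
    of_model = [d for d in duts if d.model == model]
    working = sum(1 for d in of_model if d.state == "working")
    critical_size = sum(1 for d in of_model if d.pool in critical_pools)
    return working - critical_size


def spares_buffer(duts, model):
    """Size of the 'suites' pool for a model, regardless of DUT state."""
    return sum(1 for d in duts if d.model == model and d.pool == SPARES_POOL)


# Sample data, invented for the example.
duts = [
    Dut("h1", "model-a", "pool-x", "working"),
    Dut("h2", "model-a", "pool-y", "broken"),
    Dut("h3", "model-a", "suites", "working"),
    Dut("h4", "model-a", "suites", "idle"),
    Dut("h5", "model-b", "suites", "working"),
]
critical = {"pool-x", "pool-y"}  # assumed critical pool names

availability = total_availability(duts)
spares = working_spares(duts, "model-a", critical)
buffer_size = spares_buffer(duts, "model-a")
```

A negative `working_spares` value would mean the model does not have enough
working DUTs to fill its critical pools.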
Currently, the regular inventory e-mail generates the first three metrics
twice a day, as part of the "model inventory" runs. The "time spent broken"
and "time spent idle" metrics can be easily calculated.
Ideally, there might be a "time spent working" distribution as well.
However, within the current database and API, calculating "time spent
working" is hard, and there's no identified requirement for the information.
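
For the "time spent broken" and "time spent idle" distributions, one possible
shape of the calculation is sketched below. It assumes that, for each DUT
currently in the state of interest, the time it entered that state is already
known; how that timestamp is obtained from the database is not covered here.
The function names and the choice of nearest-rank percentiles are
illustrative assumptions.

```python
import math
from datetime import datetime, timedelta


def hours_in_state(entered_at, now):
    """Return sorted hours each DUT has spent in its current state.

    entered_at maps hostname -> datetime the DUT entered the state
    (an assumed input, not an existing API).
    """
    return sorted(
        (now - since).total_seconds() / 3600.0 for since in entered_at.values()
    )


def percentile(sorted_values, pct):
    """Nearest-rank percentile of a sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(durations):
    """Reduce a sorted list of durations to a few distribution points."""
    if not durations:
        return {"count": 0}
    return {
        "count": len(durations),
        "p50": percentile(durations, 50),
        "p90": percentile(durations, 90),
        "max": durations[-1],
    }


# Sample data, invented for the example.
now = datetime(2018, 5, 3, 12, 0)
broken_since = {
    "h1": now - timedelta(hours=2),
    "h2": now - timedelta(hours=10),
    "h3": now - timedelta(hours=50),
    "h4": now - timedelta(hours=4),
}
summary = summarize(hours_in_state(broken_since, now))
```

The same functions apply to idle DUTs by passing the times at which each DUT
became idle.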
May 8 2018
Richard, can you sketch out the overall design? What metrics, what alerts, what dashboards, etc.
Comment 1 by jrbarnette@chromium.org, May 3 2018