Issue 873206

Issue metadata

Status: Available
Owner: ----
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Add a dashboard showing test and suite abort rates

Reported by jrbarnette@chromium.org, Aug 10

Issue description

Recently, there was a test outage caused by a failing database slave
(see bug 870253 and go/requiem/doc/postmortem103237).  The principal
symptom of the outage was high rates of test and/or suite aborts.

One problem identified in the postmortem was the lack of visible
symptoms indicating the outage on our dashboards.  Bug 841573
requests an alert when suite abort failures exceed a threshold.  However,
we have no dashboard showing abort rates.  There should be such a
dashboard.  We should consider graphs for any/all of the following
metrics:
  * Suite aborts per minute
  * Suite abort percentage
  * Test aborts per minute
  * Test abort percentage

For each of the metrics, it would be helpful to filter based on
various criteria, such as:
  * For the lab as a whole
  * For selected shards
  * For selected boards
  * For selected models
  * For selected drones
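
To make the proposed metrics concrete, here is a rough sketch of how
aborts per minute and abort percentage could be computed for one time
window, with an optional filter for shard, board, model, or drone.
Everything in it is an assumption made for illustration: the JobResult
record, its field names, the "ABORT" status string, and the
abort_metrics helper are hypothetical and do not describe the lab's
actual schema or metrics pipeline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class JobResult:
    """Hypothetical record for one finished suite or test job."""
    kind: str    # "suite" or "test"
    status: str  # assumed values such as "PASS", "FAIL", "ABORT"
    shard: str
    board: str
    model: str
    drone: str


def abort_metrics(
    results: Iterable[JobResult],
    kind: str,
    window_minutes: float,
    keep: Optional[Callable[[JobResult], bool]] = None,
) -> Dict[str, float]:
    """Compute abort rate and abort percentage for one time window.

    `results` is assumed to already be limited to jobs that finished
    inside the window.  `keep` is an optional filter; None means the
    lab as a whole.
    """
    selected = [
        r for r in results
        if r.kind == kind and (keep is None or keep(r))
    ]
    total = len(selected)
    aborts = sum(1 for r in selected if r.status == "ABORT")
    return {
        "aborts_per_minute": aborts / window_minutes,
        # Report 0% rather than dividing by zero when nothing ran.
        "abort_percentage": 100.0 * aborts / total if total else 0.0,
    }
```

With this shape, the lab-wide view is abort_metrics(results, "test", 10),
and a per-board view passes something like
keep=lambda r: r.board in selected_boards.  Reporting both the rate and
the percentage matters because a quiet lab can show a low abort rate
while most of its jobs are aborting.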
