
Issue 847899


Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




ImportantSlaveNotRunning alert misfired

Reported by jrbarnette@chromium.org, May 30 2018

Issue description

Overnight, the ImportantSlaveNotRunning alert went off.  The alert
happened because the jadeite-release builder (a canary slave) had
missed two scheduled runs since the regular 11:00 AM run on 5/29.

The alert was correct that the builder hadn't run, but the message
wasn't relevant: the slave had been turned off in goldeneye and wasn't
meant to run. We should figure out how to keep this alert from firing
when a slave has been deliberately turned off.

Here's the relevant text of the most recent message:

Alert Details
------------------
Description:
An important slave has not run in 1 day.

name: ImportantSlaveNotRunning
current value: 0
threshold: Lt(1) for 3h
alert fields: {, metric:slave_config=jadeite-release,metric:master_config=master-release}

sent at: 2018-05-30 05:49:28
active since: 2018-05-30 02:48:58 (3 hours 0 mins)
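
One possible shape for a fix, as a rough sketch only: gate the alert on whether the builder is enabled before applying the "Lt(1) for 3h" threshold. The names below (`Builder`, `enabled`, `below_threshold_for`, `should_alert`) are hypothetical and do not come from the real alerting or goldeneye code. The sketch also assumes "Lt(1) for 3h" means every sample in the trailing three hours is below 1.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple

Sample = Tuple[datetime, int]  # (timestamp, run count reported at that time)


@dataclass
class Builder:
    name: str
    enabled: bool  # hypothetical flag, assumed to mirror the goldeneye on/off setting


def below_threshold_for(samples: List[Sample], now: datetime,
                        threshold: int = 1,
                        window: timedelta = timedelta(hours=3)) -> bool:
    """Return True if every sample in the trailing window is below the threshold.

    This is one assumed reading of "Lt(1) for 3h".
    """
    recent = [value for ts, value in samples if now - window <= ts <= now]
    # With no samples in the window there is nothing to judge, so do not fire.
    return bool(recent) and all(value < threshold for value in recent)


def should_alert(builder: Builder, samples: List[Sample], now: datetime) -> bool:
    """Fire only for builders that are actually expected to run."""
    if not builder.enabled:
        return False  # deliberately turned off: a missed run is expected
    return below_threshold_for(samples, now)
```

With the values from the message above (current value 0, held for three hours), this would fire for an enabled jadeite-release and stay quiet for a disabled one.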

 
Cc: cra...@chromium.org

Comment 2 by cra...@chromium.org, May 30 2018

Sounds to me like it isn't really an "important" slave.  What defines "important"?

Status: Available (was: Untriaged)
