
Issue 774588

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




SoM not generating CQ alerts

Project Member Reported by dgarr...@chromium.org, Oct 13 2017

Issue description

SoM has stopped generating alerts for CQ failures.
 

Comment 1 by nxia@chromium.org, Oct 13 2017

It looks like som_alerts_dispatcher doesn't look into master failures if the master has slaves.

https://cs.corp.google.com/chromeos_public/chromite/scripts/som_alerts_dispatcher.py?q=som_alerts&dr&l=522

Comment 2 by nxia@chromium.org, Oct 13 2017

I looked into master-paladin build 1926371, which had no local failures in master-paladin itself. So I think the dispatcher can alert on failures from the master build but ignore ImportantBuilderFailedException.

mysql> select * from failureView where build_id=1926371 \G;
*************************** 1. row ***************************
                 id: 2186607
     build_stage_id: 58496281
   outer_failure_id: NULL
     exception_type: ImportantBuilderFailedException



Comment 3 by nxia@chromium.org, Oct 13 2017

I thought the master only recorded ImportantBuilderFailedException when its important slave builders failed, but it turns out the master also records ImportantBuilderFailedException when it fails itself.

So I think I can fix the completion stage first: record a special exception like ImportantSlaveBuilderFailedException when the master's slaves fail. SoM can then ignore a master build if ImportantSlaveBuilderFailedException is its only failure.

Comment 4 by nxia@chromium.org, Oct 13 2017

Talked to davidriley@ offline.

I had concerns that if the master had exceptions before the completion stage, the exceptions might be converted into ImportantBuilderFailedException. But after I checked the code, there's no need to worry about this.

So we're good with the statement "if we get ImportantBuilderFailedException
it's fine as long as we get another exception"
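
A minimal sketch of that rule, for illustration only. It assumes each failure is a plain dict with an exception_type field, mirroring the failureView column above. The function name, the record shape, and the SomeLocalStageFailure type are hypothetical and are not taken from som_alerts_dispatcher.py.

```python
# Exception types that only say "something else failed" and carry no
# alert-worthy detail of their own.
IGNORED_EXCEPTION_TYPES = frozenset(['ImportantBuilderFailedException'])


def alertable_failures(failures):
    """Return the failures of a master build that should raise an alert.

    `failures` is a list of dicts with an 'exception_type' key
    (hypothetical shape, loosely modelled on failureView rows).
    """
    return [f for f in failures
            if f['exception_type'] not in IGNORED_EXCEPTION_TYPES]


# A master whose only recorded failure is the wrapper exception.
only_wrapper = [
    {'id': 2186607, 'exception_type': 'ImportantBuilderFailedException'},
]

# A master that also failed locally; ids and the second type are made up.
with_local_failure = [
    {'id': 1, 'exception_type': 'ImportantBuilderFailedException'},
    {'id': 2, 'exception_type': 'SomeLocalStageFailure'},
]

print(alertable_failures(only_wrapper))
print(alertable_failures(with_local_failure))
```

With this filter, a master that only records ImportantBuilderFailedException produces no alert of its own, while any other exception on the master still does.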

Comment 5 by bugdroid1@chromium.org, Oct 16 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/8bdea257f1cdb66eb407a39abd1c694e140e90eb

commit 8bdea257f1cdb66eb407a39abd1c694e140e90eb
Author: David Riley <davidriley@chromium.org>
Date: Mon Oct 16 21:14:01 2017

som_alerts_dispatcher: Generate alerts for master failures.

Master build failures used to not generate alerts if there were any
slaves.  Now generate alerts but ignore failures that are of type
ImportantBuilderFailedException.

BUG= chromium:774588 
TEST=/som_alerts_dispatcher --som_host sheriff-o-matic-staging.appspot.com --som_tree chromeos ~cros/.cache/cidb_creds 1926371,1000
TEST=/som_alerts_dispatcher --som_host sheriff-o-matic-staging.appspot.com --som_tree chromeos ~cros/.cache/cidb_creds 1940777,1000

Change-Id: I35fe5f8626c0fdbd5aca4f758f0f2e724d7453a3
Reviewed-on: https://chromium-review.googlesource.com/719425
Commit-Ready: David Riley <davidriley@chromium.org>
Tested-by: David Riley <davidriley@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>

[modify] https://crrev.com/8bdea257f1cdb66eb407a39abd1c694e140e90eb/scripts/som_alerts_dispatcher.py

Status: Fixed (was: Untriaged)
