Issue 702206

Starred by 2 users

Issue metadata

Status: Archived
Owner: ----
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Alerts disappeared on sheriff-o-matic for chromium.perf

Project Member Reported by sullivan@chromium.org, Mar 16 2017

Issue description

https://sheriff-o-matic.appspot.com/chromium.perf
https://sheriff-o-matic.appspot.com/chromium.perf?useMilo=true

Unfortunately there are a lot of failures on chromium.perf right now, so the empty alert list almost certainly points to a problem in sheriff-o-matic rather than a healthy tree.
 
Status: Available (was: Untriaged)
checking it out now
And they're back, at least for the moment?
You can see there was a large spike in the number of alerts around 5:30am MTV time, then it disappeared around 7:20 (perf is the blue line):
https://viceroy.corp.google.com/chrome_infra/Services/alerts_dispatcher

And now it's coming back. I don't see anything in the logs so far that jumps out as an explanation though.
Looks like it failed to get a perf build extract from CBE, based on output from a run during that 0-alert period: https://luci-logdog.appspot.com/v/?s=infra-internal%2Fbb%2Finternal.infra.cron%2Falerts-dispatcher%2F44281%2F%2B%2Frecipes%2Fsteps%2Frun_alerts_dispatcher%2F0%2Fstdout

[E2017-03-16T07:21:50.674058-07:00 32595 0 client.go:370] Error (500) fetching https://chrome-build-extract.appspot.com/get_master/chromium.perf: error fetching https://chrome-build-extract.appspot.com/get_master/chromium.perf, max retries exceeded
[E2017-03-16T07:21:50.674155-07:00 32595 0 dispatcher.go:147] Error reading build extract from chromium.perf : error fetching https://chrome-build-extract.appspot.com/get_master/chromium.perf, max retries exceeded
[I2017-03-16T07:21:50.674191-07:00 32595 0 dispatcher.go:192] Build Extracts read: 0
[E2017-03-16T07:21:50.674220-07:00 32595 0 dispatcher.go:259] Posting alerts to https://sheriff-o-matic.appspot.com/api/v1/alerts/chromium.perf

One issue: if alerts-dispatcher (a-d) can't get a build extract, it shouldn't post an empty alert list. It should just log an error and continue.
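
The suggestion above (log and continue instead of posting empty alerts) could look roughly like the following. This is only a sketch under stated assumptions, not the actual alerts-dispatcher code: `BuildExtract`, `fetchFunc`, `postFunc`, `errMaxRetries`, and `dispatch` are all hypothetical names invented for illustration.

```go
package main

import (
	"errors"
	"log"
)

// BuildExtract is a hypothetical stand-in for the per-master data
// that the dispatcher reads before computing alerts.
type BuildExtract struct {
	Master   string
	Failures []string
}

// errMaxRetries is a hypothetical sentinel mirroring the
// "max retries exceeded" message in the log excerpt.
var errMaxRetries = errors.New("max retries exceeded")

// fetchFunc and postFunc are hypothetical hooks for the fetch and
// post steps, so the control flow can be shown without real HTTP calls.
type fetchFunc func(master string) (*BuildExtract, error)
type postFunc func(master string, alerts []string) error

// dispatch posts alerts only for masters whose build extract was read
// successfully. A failed fetch is logged and skipped, so the alerts
// already stored for that master are not overwritten with an empty list.
// It returns the number of masters that were posted.
func dispatch(masters []string, fetch fetchFunc, post postFunc) int {
	posted := 0
	for _, m := range masters {
		be, err := fetch(m)
		if err != nil {
			log.Printf("Error reading build extract from %s: %v; not posting alerts", m, err)
			continue
		}
		if err := post(m, be.Failures); err != nil {
			log.Printf("Error posting alerts for %s: %v", m, err)
			continue
		}
		posted++
	}
	return posted
}
```

With this shape, a 500 from the build extract endpoint for chromium.perf would leave the previously posted alerts in place until the next successful run, at the cost of showing stale alerts in the meantime.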

Another troubling thing is that ?useMilo=true didn't work either. So CBE itself probably wasn't the problem, since it now just proxies Milo.


This is a known problem. The solution is to use Milo in the App Engine cron job.

I've been running a cron job manually on my workstation as a workaround. I'll start that running now.
Labels: Milestone-Perf
Cc: dtu@chromium.org
Issue 687708 has been merged into this issue.
Status: Archived (was: Available)