New issue
Advanced search Search tips

Issue 874196 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Aug 17
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

Milo panics

Project Member Reported by athilenius@chromium.org, Aug 14

Issue description

I have a NodeJS AppEngine application that (as a temporary solution until ChromeOS moves fully to swarming and we can use BuildBucket V2) polls data from Buildbucket backwards in time and for each build queries Milo BuildInfo. Occasionally, when starting the app, I get 500s back from Milo which correspond to the given log entries:
https://pantheon.corp.google.com/logs/viewer?project=luci-milo&minLogLevel=0&expandAll=false&timestamp=2018-08-14T19:23:12.983000000Z&customFacets=&limitCustomFacetWidth=true&interval=PT1H&resource=gae_app%2Fmodule_id%2Fdefault&scrollTimestamp=2018-08-14T19:22:46.002846000Z&filters=path:prpc&filters=status:500&filters=useragent:the-em-drive&logName=projects%2Fluci-milo%2Flogs%2Fappengine.googleapis.com%252Frequest_log&dateRangeStart=2018-08-14T18:23:12.983Z&dateRangeEnd=2018-08-14T19:23:12.983Z

It seems (to a layman) like it might be a race condition with Buildbucket because it only happens when I first start the app, and only sometimes. That means it's trying to fetch BuildInfo for a very recently created build in Buildbucket. If I can successfully load the first (few?) builds, then it does not appear to error ever, even after thousands of builds.
 
Cc: hinoka@chromium.org
I take back my last, these panics seem to happen at other times as well. Apparently I just had a lucky run at one point. I am seeing a LOT of 500s come back in the last few runs:

Milo error in buildbucketID: 8938177490901835840 with status code: 500
Milo error in buildbucketID: 8938177490901835840 with status code: 500
Milo error in buildbucketID: 8938177489872362176 with status code: 500
Milo error in buildbucketID: 8938177489872362176 with status code: 500
Milo error in buildbucketID: 8938178052623895472 with status code: 500
Milo error in buildbucketID: 8938177782485887184 with status code: 500
Milo error in buildbucketID: 8938178052623895472 with status code: 500
Milo error in buildbucketID: 8938177782485887184 with status code: 500
Loaded chunk. Builds: 100. Stages: 2274.
Loaded chunk. Builds: 200. Stages: 4955.
Milo error in buildbucketID: 8938183888018869568 with status code: 500
Milo error in buildbucketID: 8938183888018869568 with status code: 500
All pending builds up to date
Labels: -Pri-2 Pri-1
Look like the error here isn't being checked? I'm not a Go expert though :p
https://chromium.googlesource.com/infra/luci/luci-go/+/master/milo/buildsource/swarming/buildinfo.go#89
Sorry for so many posts, it looks like it's specific builds causing this. A simple repro case is buildbucket ID 8938095596697800400: https://luci-milo.appspot.com/rpcexplorer/services/milo.BuildInfo/Get?request={%20%20%20%20%22buildbucket%22:%20{%20%20%20%20%20%20%20%20%22id%22:%20%228938095596697800400%22%20%20%20%20}}
Project Member

Comment 6 by bugdroid1@chromium.org, Aug 15

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/luci-go.git/+/2c62790ac1d94297cf4fc61127ade194c0965b7f

commit 2c62790ac1d94297cf4fc61127ade194c0965b7f
Author: Nodir Turakulov <nodir@google.com>
Date: Wed Aug 15 21:50:03 2018

[milo] Check error returned by ReadAnnotations

GetBuildInfo in swarming/buildinfo.go didn't check error returned by
ReadAnnotations

Bug:  874196 
Change-Id: I248c25451c5b5b442055c5541a03cbd3946d626c
Reviewed-on: https://chromium-review.googlesource.com/1176383
Reviewed-by: Vadim Shtayura <vadimsh@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/2c62790ac1d94297cf4fc61127ade194c0965b7f/milo/buildsource/swarming/buildinfo.go

it does not panic anymore, but now it returns 404...
Status: Fixed (was: Untriaged)
ah, i wasn't signed in. Works now
and thanks for finding the bug!
Woo hoo! Thank you Nodir, much appreciated!
Owner: no...@chromium.org
Status: Started (was: Available)

Sign in to add a comment