Issue 848947

Issue metadata

Status: WontFix
Owner: ----
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



LUCI Scheduler double-scheduling a build?

Project Member Reported by dgarr...@chromium.org, Jun 1 2018

Issue description

This scheduler:

https://luci-scheduler.appspot.com/jobs/chromeos/amd64-generic-tot-chromium-pfq-informational

Is currently running two active builds:

Build A:

https://luci-scheduler.appspot.com/jobs/chromeos/amd64-generic-tot-chromium-pfq-informational/9110377558185414384

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8944947495923138096

Build B:

https://luci-scheduler.appspot.com/jobs/chromeos/amd64-generic-tot-chromium-pfq-informational/9110312412330160128

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8944882350077720256



When I check the buildbucket API, both buildbucket ids report a status of STARTED. The older build appears to be hung, and I expect it to be killed when the 24-hour buildbucket timeout is hit.

The LUCI scheduler configuration was updated in the middle of this, possibly including the configuration for this job.
 
This is due to a brief Buildbucket outage that happened ~3:00-3:10 (https://outalator.corp.google.com/#Outage:1050627).

During the outage, all requests to Buildbucket returned 404. Scheduler interpreted this as "no such build" and carried on to start a new build.
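
The failure mode described above can be reproduced in a toy model. This is only a sketch: `FakeBuildbucket`, `FakeScheduler`, `poll`, `trigger`, and `simulate_outage` are hypothetical names made up for illustration, not the real Buildbucket API or LUCI Scheduler code. It assumes the scheduler tracks a single active build per job and drops it when a status lookup returns 404.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeBuildbucket:
    """Hypothetical stand-in for Buildbucket; not the real API."""
    builds: dict = field(default_factory=dict)
    outage: bool = False
    next_id: int = 1

    def get_status(self, build_id):
        # During the simulated outage every lookup returns 404,
        # even for builds that exist and are still running.
        if self.outage or build_id not in self.builds:
            return 404, None
        return 200, self.builds[build_id]

    def start_build(self):
        build_id = self.next_id
        self.next_id += 1
        self.builds[build_id] = "STARTED"
        return build_id


@dataclass
class FakeScheduler:
    """Hypothetical job that wants at most one active build."""
    bb: FakeBuildbucket
    active_build: Optional[int] = None

    def poll(self):
        if self.active_build is None:
            return
        code, status = self.bb.get_status(self.active_build)
        if code == 404 or status != "STARTED":
            # A 404 is read as "no such build", so the build is forgotten.
            self.active_build = None

    def trigger(self):
        # Start a new build only if the scheduler believes none is active.
        if self.active_build is None:
            self.active_build = self.bb.start_build()


def simulate_outage():
    bb = FakeBuildbucket()
    sched = FakeScheduler(bb)
    sched.trigger()      # build 1 starts
    bb.outage = True
    sched.poll()         # 404 during the outage: build 1 is forgotten
    bb.outage = False
    sched.trigger()      # build 2 starts while build 1 is still STARTED
    return [b for b, s in bb.builds.items() if s == "STARTED"]


if __name__ == "__main__":
    print(simulate_outage())
```

In this model `simulate_outage()` ends with both builds in status STARTED, which matches the two active builds reported in the issue description.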
Status: WontFix (was: Untriaged)
From my PoV, this is WontFix in scheduler. Retrying 404s isn't really good.
Are the builds in question really still running, or are they lying to the UI?

If they are going to stick around in status STARTED forever, that would be bad. If they will be cleaned up after 24 hours, no big deal.
PS: I agree about the scheduler priority.