Most canaries are twice-purple with no indication of manual intervention
Issue description
Subject says it all. For instance: https://uberchromegw.corp.google.com/i/chromeos/builders/samus-release/builds/3960
Some were purple twice in a row, others only once. There was no obvious announcement of build stoppages. We would like to know what happened so we can rule out a malfunction.
,
Jan 24 2017
Heartbeat failed with error "" (reason "BUILD_IS_COMPLETED").
This is a BuildBucket error: https://cs.chromium.org/chromium/infra/appengine/cr-buildbucket/api.py?q=BUILD_IS_COMPLETED&sq=package:chromium&dr=C&l=32
Any thoughts on what could have done this, nodir@?
,
Jan 24 2017
This buildbot build is associated with buildbucket build 8989593815995856800, which was cancelled at 2017-01-24 14:55:04 UTC by 446450136466-mko2u1g65l7iqsos5c09tni364ejqg75@developer.gserviceaccount.com, the same service account that scheduled the build.
This is the chromeos account: https://chrome-internal.googlesource.com/infra/puppet/+/master/puppetm/etc/puppet/hieradata/credentials/default.eyaml#429
I assume this is a chromeos service that manages builds?
,
Jan 24 2017
canary master realized that these got cancelled, but the logs do not indicate that the master cancelled them:
06:54:00: INFO: 2:56:49.771849 until timeout...
06:55:16: INFO: Running cidb query on pid 29837, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x4c3ca10; Select object>
06:55:21: INFO: cidb query succeeded after 1 retries
06:55:21: INFO: Not retriable build veyron_speedy-release started already.
..
06:55:21: INFO: Build config veyron_speedy-release completed with status "CANCELED".
,
Jan 24 2017
afaict, we currently do not cancel any release builds under any circumstances.
,
Jan 24 2017
I talked to Nodir: given #3, BUILD_IS_COMPLETED is returned if the build has been cancelled. Don't master builders cancel all previous builds before scheduling new ones? If so, I would bet that someone manually or automatically started a new master builder, which cancelled all current builds and, consequently, made them halt with that error. Does that seem plausible?
,
Jan 24 2017
+nxia: The in-house expert on when we murder builds (or give them the resurrection potion).
,
Jan 24 2017
R57 was recently cut and now runs on the chromeos_release waterfall. It also runs a master-release build, but on a different waterfall and branch. In its cleanup stage, that master-release build also tries to cancel the slave release builds on the chromeos waterfall. There is a fix CL at https://chromium-review.googlesource.com/#/c/432004/. A follow-up CL will add branch and waterfall tags to the builds, so that the cleanup stage only searches for builds with the right tags.
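A minimal sketch of the tag-filtered cleanup idea described above, for illustration only. The `Build` record, the tag keys `waterfall` and `branch`, the branch value `release-R57`, and both function names are hypothetical. None of this is the chromite or BuildBucket API, and the actual CLs may do this differently.

```python
from dataclasses import dataclass, field


@dataclass
class Build:
    """Hypothetical stand-in for a scheduled build record."""
    build_id: int
    config: str
    tags: dict = field(default_factory=dict)
    status: str = "STARTED"


def builds_to_cancel(builds, master_tags):
    """Return running builds whose tags match every tag of the master."""
    return [
        b for b in builds
        if b.status == "STARTED"
        and all(b.tags.get(key) == value for key, value in master_tags.items())
    ]


def cleanup(builds, master_tags):
    """Cancel only the matching builds and return the ids that were cancelled."""
    cancelled = []
    for b in builds_to_cancel(builds, master_tags):
        b.status = "CANCELED"
        cancelled.append(b.build_id)
    return cancelled
```

With one samus-release build tagged for the chromeos waterfall and another tagged for chromeos_release, a cleanup run that passes the chromeos_release tags would cancel only the second build. Without the tag filter, both would match on config name alone, which is the behaviour reported in this bug.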
,
Jan 26 2017
I didn't see these particular aborts last night. Let's call it fixed.
Comment 1 by semenzato@chromium.org, Jan 24 2017