buildbucket build 48hr timeout should result in understandable buildbot error
Issue description

When a buildbucket build hits its 48 hour timeout, a few things go wrong:

1) The heartbeat the master sends to buildbucket gets a "BUILD_IS_COMPLETED" response, rather than something like "BUILD_TIMED_OUT". Buildbucket should return a more expressive/correct error code.
2) The master immediately closes the build without informing the builder of this fact, resulting in races when the master then tries to write additional logs to the closed build. The master should more gracefully handle closing down a timed-out build.

See issue 791031 for the inspiration behind this one.
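To make point 2 concrete, here is a minimal sketch of a close-down order that avoids the race: record the reason and notify the builder first, and only then close the build. This is not buildbot's actual code. `HeartbeatResult`, `Build`, and `handle_heartbeat` are hypothetical names, and `BUILD_TIMED_OUT` is the error code proposed above, not one that buildbucket is known to return.

```python
import enum
from dataclasses import dataclass, field


class HeartbeatResult(enum.Enum):
    OK = "OK"
    BUILD_IS_COMPLETED = "BUILD_IS_COMPLETED"
    BUILD_TIMED_OUT = "BUILD_TIMED_OUT"  # hypothetical code proposed in this issue


@dataclass
class Build:
    """Hypothetical stand-in for the master's view of a running build."""
    build_id: int
    closed: bool = False
    builder_notified: bool = False
    logs: list = field(default_factory=list)

    def write_log(self, line):
        # Writing after close is the race described in point 2.
        if self.closed:
            raise RuntimeError("write to closed build %d" % self.build_id)
        self.logs.append(line)


def handle_heartbeat(build, result):
    """Shuts a build down in an order that never writes to it after close."""
    if result is HeartbeatResult.OK:
        return
    if result is HeartbeatResult.BUILD_TIMED_OUT:
        reason = "build timed out"
    else:
        reason = "build already completed"
    build.write_log("stopping: " + reason)  # 1. log while the build is still open
    build.builder_notified = True           # 2. tell the builder to stop
    build.closed = True                     # 3. close last
```

With a distinct timeout code, the master can also put the real reason in the build log instead of a generic "completed" message.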
Dec 1 2017
The following revision refers to this bug:
https://chromium.googlesource.com/infra/infra/+/ff9c111b393c314e889eb728c60b900d6663484c

commit ff9c111b393c314e889eb728c60b900d6663484c
Author: Nodir Turakulov <nodir@google.com>
Date: Fri Dec 01 22:10:22 2017

[buildbucket] rename reset_expired_builds

reset_expired_builds was originally named so because it reset expired builds only, but now it also marks 48h old builds as timed out. Rename it.

Bug: 791124
Change-Id: I0797a5ccac7b0890337a588faeb583cb432ad55c
Reviewed-on: https://chromium-review.googlesource.com/804235
Reviewed-by: Aaron Gable <agable@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/ff9c111b393c314e889eb728c60b900d6663484c/appengine/cr-buildbucket/cron.yaml
[modify] https://crrev.com/ff9c111b393c314e889eb728c60b900d6663484c/appengine/cr-buildbucket/service.py
[modify] https://crrev.com/ff9c111b393c314e889eb728c60b900d6663484c/appengine/cr-buildbucket/handlers.py
[modify] https://crrev.com/ff9c111b393c314e889eb728c60b900d6663484c/appengine/cr-buildbucket/test/service_test.py
[modify] https://crrev.com/ff9c111b393c314e889eb728c60b900d6663484c/appengine/cr-buildbucket/test/handlers_test.py
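For readers unfamiliar with the cron job, the two duties the commit message describes (resetting expired builds and marking 48h old builds as timed out) might look roughly like the sketch below. It is not the code from service.py. The `Build` fields, the status strings, the function name `check_expired_builds`, and the reading of "expired" as an expired lease are all assumptions made for illustration.

```python
import datetime
from dataclasses import dataclass
from typing import Optional

BUILD_TIMEOUT = datetime.timedelta(hours=48)  # the 48h limit from the issue


@dataclass
class Build:
    """Hypothetical build record; field names are assumptions."""
    create_time: datetime.datetime
    status: str = "STARTED"  # assumed values: SCHEDULED, STARTED, COMPLETED
    lease_expiration: Optional[datetime.datetime] = None
    result_details: Optional[str] = None


def check_expired_builds(builds, now):
    """Times out builds older than 48h and resets builds with expired leases."""
    for b in builds:
        if b.status == "COMPLETED":
            continue
        if now - b.create_time >= BUILD_TIMEOUT:
            # Too old: complete the build and record why.
            b.status = "COMPLETED"
            b.result_details = "TIMEOUT"
            b.lease_expiration = None
        elif b.lease_expiration is not None and b.lease_expiration <= now:
            # Lease expired: make the build available again.
            b.status = "SCHEDULED"
            b.lease_expiration = None
```

Checking the 48h limit before the lease means a build that is both too old and has an expired lease is timed out rather than rescheduled.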
Dec 1 2017
The following revision refers to this bug:
https://chromium.googlesource.com/infra/infra/+/305b22285cb669323b84d184f8ac14a33b2c5d14

commit 305b22285cb669323b84d184f8ac14a33b2c5d14
Author: Nodir Turakulov <nodir@google.com>
Date: Fri Dec 01 22:36:32 2017

[buildbucket] improve error message for expired builds

When a cron job marks a build as timed out, buildbot's heartbeat fails and the build is forcefully interrupted, but it is unclear why. Improve the returned error message.

Bug: 791124
Change-Id: I30fad717afa1b8fd8f700a4f78c7e5794c2330c9
Reviewed-on: https://chromium-review.googlesource.com/803743
Reviewed-by: Aaron Gable <agable@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/305b22285cb669323b84d184f8ac14a33b2c5d14/appengine/cr-buildbucket/errors.py
[modify] https://crrev.com/305b22285cb669323b84d184f8ac14a33b2c5d14/appengine/cr-buildbucket/proto/project_config_pb2.py
[modify] https://crrev.com/305b22285cb669323b84d184f8ac14a33b2c5d14/appengine/cr-buildbucket/proto/service_config_pb2.py
[modify] https://crrev.com/305b22285cb669323b84d184f8ac14a33b2c5d14/appengine/cr-buildbucket/service.py
[modify] https://crrev.com/305b22285cb669323b84d184f8ac14a33b2c5d14/appengine/cr-buildbucket/test/errors_test.py
[modify] https://crrev.com/305b22285cb669323b84d184f8ac14a33b2c5d14/appengine/cr-buildbucket/test/service_test.py
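As an illustration of the kind of change this commit describes, a heartbeat check could attach the timeout reason to the error it raises. This is only a sketch and not the contents of errors.py or service.py. `BuildIsCompletedError`, `validate_heartbeat`, and the message wording are hypothetical.

```python
class BuildIsCompletedError(Exception):
    """Hypothetical error for a heartbeat sent to a completed build."""


def validate_heartbeat(build_id, status, timed_out):
    """Raises with an explanatory message if the build can no longer be updated.

    `status` and `timed_out` are assumed fields of the stored build.
    """
    if status != "COMPLETED":
        return
    msg = "build %d is completed and cannot be changed" % build_id
    if timed_out:
        # State the cause so the buildbot log explains the interruption.
        msg += "; it was marked as timed out because it did not complete within 48 hours"
    raise BuildIsCompletedError(msg)
```

A caller on the buildbot side can then surface `str(error)` in the build log, so the forced interruption is no longer unexplained.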
Dec 2 2017
These are two separate bugs. (2) seems to be a buildbot bug. The CLs above only fix the buildbucket bugs.
Dec 7 2017
Comment 1 by no...@chromium.org, Dec 1 2017
Status: Started (was: Untriaged)