
Issue 738179


Issue metadata

Status: Fixed
Closed: Aug 2017
Pri: 3
Type: Bug




Pre-CQ flake: The InitSDK stage failed: (15, 'Received signal 15; shutting down')

Project Member Reported by ayatane@chromium.org, Jun 29 2017

Issue description

I keep getting hit by this pre-cq flake that unmarks all my CLs.

This is a tracking bug to show that it happens often enough to be worth looking at.

https://chromium-review.googlesource.com/c/549087/
https://chromium-review.googlesource.com/c/549088/
 
Owner: dgarr...@chromium.org
Status: Assigned (was: Untriaged)
The timing of the failure closely matches the timing of the pre-cq launcher restart mentioned in bug 737695.

dgarrett@ - if that's a plausible theory, I think you should close this with WontFix. Otherwise, bounce it back.

Cc: nxia@chromium.org
Wasn't there a waterfall restart? That kills all in-progress builds, and the CLs get blamed.
Owner: nxia@chromium.org
Looking a little more closely, I'm confused.

Those two CLs show the build failures BEFORE any comment showing they were picked up by the PreCQ.

And... the first CL was marked PreCQ ready when the builds started, but the second one wasn't.

So... why was the second CL included in the PreCQ run?

Further, the second CL was rebased very shortly before the PreCQ builder started. Ningning added logic to kill PreCQ builds if the CLs are rebased; maybe the builds took a while to start and that logic was triggered.

But the timeline seems a little off.

Comment 6 by nxia@chromium.org, Jul 6 2017

Status: Started (was: Assigned)
The reason is that the pre-cqs for the old patch-set were cancelled when a new patch-set was uploaded, but the failure messages of the aborted pre-cqs weren't handled properly by the validation_pool.
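
The sequence described in this comment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not chromite's actual code: `PreCQBuild`, `find_stale_builds`, the change names "A" and "B", and the patch numbers are all hypothetical. It shows how a build launched against an older patch-set is picked out for cancellation once a newer patch-set is uploaded. If that cancellation is later handled like an ordinary build failure, the CL under test ends up blamed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PreCQBuild:
    """Hypothetical record of one running pre-cq build."""
    build_id: int
    change: str        # placeholder change name, not a real Gerrit number
    patch_number: int  # patch-set the build was launched against


def find_stale_builds(builds: List[PreCQBuild],
                      latest_patch: Dict[str, int]) -> List[PreCQBuild]:
    """Return builds whose patch-set is older than the latest upload."""
    stale = []
    for build in builds:
        # Unknown changes are treated as up to date.
        latest = latest_patch.get(build.change, build.patch_number)
        if build.patch_number < latest:
            stale.append(build)
    return stale


# Illustrative data only: change "B" was re-uploaded after its build started.
running = [
    PreCQBuild(build_id=101, change="A", patch_number=1),
    PreCQBuild(build_id=102, change="B", patch_number=1),
]
latest = {"A": 1, "B": 2}
stale_builds = find_stale_builds(running, latest)
```

Here only build 102 is stale, so it is the one that would be cancelled.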
Thanks.  Somehow my workflow is really great at surfacing weird bugs.

Comment 8 by bugdroid1@chromium.org, Jul 14 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/29b0f229d4c0615eff960af0821010048bfed2fc

commit 29b0f229d4c0615eff960af0821010048bfed2fc
Author: Ningning Xia <nxia@chromium.org>
Date: Fri Jul 14 02:46:01 2017

Record trybot_cancelled action with build_id of pre-cq build.

Pre-cqs with stale patch_number may get cancelled by pre-cq-launcher.
Instead of recording the trybot_cancelled action with the build_id
of pre-cq-launcher, use the build_id of the cancelled pre-cq. A
follow-up CL is to change the pre-cq failure triaging logic: when a
pre-cq reaches the CompletionStage and tries to triage the failures,
if it finds a trybot_cancelled action associated with its build_id, it
will consider the failure an infra_only failure and will not blame
the CLs being tested.

BUG=chromium:738179
TEST=unit_tests

Change-Id: I8920687d5c6033dd386d62da3f78124e9971cd29
Reviewed-on: https://chromium-review.googlesource.com/562655
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/29b0f229d4c0615eff960af0821010048bfed2fc/cbuildbot/stages/sync_stages_unittest.py
[modify] https://crrev.com/29b0f229d4c0615eff960af0821010048bfed2fc/cbuildbot/stages/sync_stages.py
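
The idea in this commit message, recording the cancellation under the cancelled build's own build_id, might look like the sketch below. Only the action name `trybot_cancelled` comes from the commit; `CLAction`, `record_cancellation`, `was_cancelled`, and the sample ids are hypothetical names used for illustration, and the real change in sync_stages.py may be structured quite differently.

```python
from dataclasses import dataclass
from typing import List

TRYBOT_CANCELLED = "trybot_cancelled"  # action name taken from the commit title


@dataclass(frozen=True)
class CLAction:
    """Hypothetical row in a CL action log."""
    build_id: int
    change: str
    patch_number: int
    action: str


def record_cancellation(log: List[CLAction], cancelled_build_id: int,
                        change: str, patch_number: int) -> None:
    """Log the cancellation under the cancelled build's own id.

    Keying on the launcher's build_id instead would leave the cancelled
    build unable to find the action when it triages its own failure.
    """
    log.append(
        CLAction(cancelled_build_id, change, patch_number, TRYBOT_CANCELLED))


def was_cancelled(log: List[CLAction], build_id: int) -> bool:
    """True if a trybot_cancelled action is associated with build_id."""
    return any(a.build_id == build_id and a.action == TRYBOT_CANCELLED
               for a in log)


# Illustrative ids: 7 stands in for the launcher, 102 for the cancelled build.
action_log: List[CLAction] = []
record_cancellation(action_log, cancelled_build_id=102,
                    change="B", patch_number=1)
```

With this keying, a lookup by the cancelled build's id (102) finds the action, while a lookup by the launcher's id (7) does not.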


Comment 9 by bugdroid1@chromium.org, Jul 14 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/782342ed074772d7d06d02249a667ce14f1db62b

commit 782342ed074772d7d06d02249a667ce14f1db62b
Author: Ningning Xia <nxia@chromium.org>
Date: Fri Jul 14 02:46:01 2017

Do not add CLs to suspect candidate if the Pre-CQ was cancelled.

Pre-CQ can be cancelled because its patch number is stale, in which case
the Pre-CQ shouldn't blame the CLs it's testing.

BUG=chromium:738179
TEST=unit_tests

Change-Id: I80fb189155521addf238c7a96a770ae79620b1b9
Reviewed-on: https://chromium-review.googlesource.com/563488
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>

[modify] https://crrev.com/782342ed074772d7d06d02249a667ce14f1db62b/cbuildbot/validation_pool.py
[modify] https://crrev.com/782342ed074772d7d06d02249a667ce14f1db62b/lib/clactions_unittest.py
[modify] https://crrev.com/782342ed074772d7d06d02249a667ce14f1db62b/cbuildbot/validation_pool_unittest.py
[modify] https://crrev.com/782342ed074772d7d06d02249a667ce14f1db62b/lib/clactions.py
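
The triage rule this commit describes can be sketched as below. This is an assumption-laden illustration rather than the code in validation_pool.py: `find_suspects` and the `(build_id, action)` pair format are hypothetical, and only the `trybot_cancelled` action name is taken from the thread.

```python
from typing import Iterable, Set, Tuple

TRYBOT_CANCELLED = "trybot_cancelled"  # action name used in the thread


def find_suspects(build_id: int,
                  tested_changes: Iterable[str],
                  actions: Iterable[Tuple[int, str]]) -> Set[str]:
    """Return the changes to blame for a failed pre-cq build.

    `actions` is a hypothetical sequence of (build_id, action) pairs.
    A build that was cancelled blames none of the changes it tested.
    """
    cancelled = any(bid == build_id and action == TRYBOT_CANCELLED
                    for bid, action in actions)
    if cancelled:
        return set()
    return set(tested_changes)


# Illustrative call: build 102 was cancelled, so nothing is blamed.
suspects = find_suspects(102, ["B"], [(102, TRYBOT_CANCELLED)])
```

A build with no matching cancellation action still returns every change it tested, so genuine failures are blamed as before.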

Comment 10 by nxia@chromium.org, Aug 18 2017

Status: Fixed (was: Started)
