
Issue 593089


Issue metadata

Status: Verified
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocking:
issue 591628




Identify PFQ failures due to master timeouts

Project Member Reported by steve...@chromium.org, Mar 8 2016

Issue description

If a PFQ run is delayed or takes an unusually long time, it may reach a hard deadline imposed by the PFQ master.

The only symptom of this is the message "ERROR: Timeout occurred- waited X seconds, failing" in whichever stage(s) are running at the time.

We should identify the actual cause of failure and make sure that is what gets reported to the PFQ master as the cause.

 
Thanks, beat me to it!

Note: that master run had several such failures because it appears to have been started while the previous run was in progress, so several builders had to complete the previous run before starting the new run (I think).

We have seen this symptom in the past, but not recently. I will keep an eye out for better examples.

Owner: akes...@chromium.org
Status: Started (was: Assigned)
Project Member

Comment 4 by bugdroid1@chromium.org, Mar 9 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/446f07f5f4affde7a683127f7c6bef2b9208e697

commit 446f07f5f4affde7a683127f7c6bef2b9208e697
Author: Aviv Keshet <akeshet@chromium.org>
Date: Tue Mar 08 19:32:31 2016

timeout_util: log when timeouts were due to master deadline

BUG= chromium:593089 
TEST=unit tests

Change-Id: I753ea798fec1e5d1a60044f10834e26123a54b61
Reviewed-on: https://chromium-review.googlesource.com/331680
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Steven Bennetts <stevenjb@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/446f07f5f4affde7a683127f7c6bef2b9208e697/scripts/cbuildbot.py
[modify] https://crrev.com/446f07f5f4affde7a683127f7c6bef2b9208e697/lib/timeout_util.py
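
For illustration, the idea named in the commit title (recording why a timeout fired, so a master deadline can be told apart from an ordinary stage timeout) might be sketched as follows. This is not the code from the commit: StageTimeoutError, format_timeout_message, wait_until, and the "Timeout reason:" suffix are all hypothetical names chosen for this example, and polling is used only to keep the sketch portable.

```python
import time


class StageTimeoutError(Exception):
    """Raised when a wait exceeds its allowed time (hypothetical name)."""


def format_timeout_message(waited_seconds, reason=None):
    """Build the timeout text, appending the reason when one is known.

    The "Timeout reason:" suffix is an assumption made for this sketch.
    """
    message = "Timeout occurred- waited %d seconds, failing." % waited_seconds
    if reason:
        message += " Timeout reason: %s" % reason
    return message


def wait_until(condition, timeout_seconds, reason=None, poll_seconds=0.01):
    """Poll condition() until it returns true or timeout_seconds elapse.

    On expiry, raise StageTimeoutError carrying the caller-supplied reason,
    so the failure report can say which deadline was hit.
    """
    start = time.monotonic()
    while True:
        if condition():
            return
        waited = time.monotonic() - start
        if waited >= timeout_seconds:
            raise StageTimeoutError(format_timeout_message(waited, reason))
        time.sleep(poll_seconds)


if __name__ == "__main__":
    try:
        # A condition that never becomes true, so the deadline always fires.
        wait_until(lambda: False, 0.05, reason="master deadline reached")
    except StageTimeoutError as error:
        print(error)
```

With a reason attached at the point where the deadline is set, whatever reports the stage failure can pass the cause along instead of a bare timeout message.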

Can we consider this fixed after #4, or does further work have to be done?
Feel free to resolve this as fixed if it is currently reasonably addressed. If there is more work that we should do, add a comment here to that effect. If there is more work that we -could- do but that is lower priority, we should file that separately and resolve this.

Labels: Build-PFQ-Failures
Status: Fixed (was: Started)

Comment 9 by dchan@google.com, Jan 21 2017

Labels: VerifyIn-57

Comment 10 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 11 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 12 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61
Status: Verified (was: Fixed)
Closing. Please reopen it if it's not fixed. Thanks!
