Issue 673906

Issue metadata

Status: Archived
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



signer pre-cq timing out

Project Member Reported by vapier@chromium.org, Dec 13 2016

Issue description

example CL:
https://chrome-internal-review.googlesource.com/310964

that says everything timed out within 30min.

but looking at the pre-cq, it seems to be spinning up & finishing:
https://luci-milo.appspot.com/buildbot/chromiumos.tryserver/pre_cq/10623

but maybe it's too slow ?
 
Cc: akes...@chromium.org
Owner: nxia@chromium.org
The error is that the PreCQ builder didn't start at all within 30 minutes. That should have nothing to do with how long it takes the build to finish.

Possibilities:
* The PreCQ builders were overloaded, so the build was waiting in the queue for more than 30 minutes.
* The buildbucket code has some form of bug that prevented it from being notified that the build had started.
* We are still using CIDB for that 30 minute timer, and it was broken for some reason.
I strongly suspect this wasn't specific to the CL in question being a signer CL, but I'm unaware of anyone else seeing these problems.
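
The difference between "the builder never reported a start" and "the build took too long" can be sketched as below. This is a minimal illustration, not chromite's actual code: `LAUNCH_TIMEOUT`, `ClStatus`, and `check_launch_timeout` are hypothetical names, and the only detail taken from this thread is the 30 minute window for a builder to report that it started.

```python
import datetime
import enum

# Assumption: the 30 minute window mentioned in this thread; the name is hypothetical.
LAUNCH_TIMEOUT = datetime.timedelta(minutes=30)


class ClStatus(enum.Enum):
    LAUNCHED = "launched"    # the launcher requested a build
    PICKED_UP = "picked_up"  # a builder reported that it started
    TIMED_OUT = "timed_out"  # no start was reported in time


def check_launch_timeout(status, launched_at, now):
    """Return the status a launcher would record at time `now`."""
    if status is ClStatus.LAUNCHED and now - launched_at > LAUNCH_TIMEOUT:
        return ClStatus.TIMED_OUT
    return status


launched = datetime.datetime(2016, 12, 13, 12, 0)
later = launched + datetime.timedelta(minutes=45)

# A builder that ran but never reported its start still times out.
print(check_launch_timeout(ClStatus.LAUNCHED, launched, later))
# A builder that reported its start is unaffected, however long it runs.
print(check_launch_timeout(ClStatus.PICKED_UP, launched, later))
```

Under this model, raising the window only helps when builds are slow to start; it does nothing for a builder that never reports its start at all.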

Comment 3 by vapier@chromium.org, Dec 14 2016

semi-related, but whenever there is a timeout that didn't really timeout, is there a way we could still post the log details ?  those bots ran and posted their logs somewhere right ?

should we increase the 30min timeout to like 60min ?  timeouts should be pretty unusual right ?

Comment 4 by nxia@chromium.org, Dec 14 2016

The Buildbucket logic doesn't change how CLs are kicked out; that part is specific to the pre-cq. This builder isn't running as a pre-cq build, which would run PreCQSync and mark the CL as picked_up in CIDB.

Comment 5 by nxia@chromium.org, Dec 14 2016

Owner: dgarr...@chromium.org
Looks like the sync type is defined here:

https://chromium-review.googlesource.com/#/c/399583/
Oh.... I didn't realize that's how it worked. How hard would it be for the PreCQ Launcher to instead update CIDB via the buildbucket status?

However, it's a bug in my builder. I can fix that.
Status: Started (was: Unconfirmed)

Comment 8 by bugdroid1@chromium.org, Dec 21 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/2ee84022facf264e379cda29643eb322aa42f584

commit 2ee84022facf264e379cda29643eb322aa42f584
Author: Don Garrett <dgarrett@google.com>
Date: Fri Dec 16 02:09:20 2016

SignerTestsBuilder: Use PreCQSyncStage.

Using the ManifestVersionedSyncStage breaks PreCQ builders, since they
don't correctly report to the PreCQ Launcher that they have started.

BUG= chromium:673906 
TEST=Unittests
     PreCQ run with test signer CL.

Change-Id: Ibc36cc9c5e04ea6ad9d5079c22288b69ee76f6b2
Reviewed-on: https://chromium-review.googlesource.com/421106
Commit-Ready: Mike Frysinger <vapier@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/2ee84022facf264e379cda29643eb322aa42f584/cbuildbot/builders/test_builders.py

Comment 9 by vapier@chromium.org, Dec 22 2016

does a waterfall need to be kicked or something ?  still failing :(

https://chrome-internal-review.googlesource.com/304864
If the understanding of the failure was correct, that should have just worked.

Sadly, I had no way to verify the fix until after it landed.
Cc: nxia@chromium.org

Comment 12 by nxia@chromium.org, Jan 10 2017

The signer-pre-cq was triggered but failed. 

mysql> select * from clActionTable where change_number=304864 and change_source='internal';

This gives us the buildbucket_ids: 8992689057763377696 and 8992596005818156880.


https://apis-explorer.appspot.com/apis-explorer/?base=https://cr-buildbucket.appspot.com/_ah/api#p/buildbucket/v1/buildbucket.get

We can find builds given the buildbucket_ids in Buildbucket.Get

https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromiumos.tryserver%2Fpre_cq%2F11874%2F%2B%2F%2A%2A%2Fstdout&s=chromeos%2Fbb%2Fchromiumos.tryserver%2Fpre_cq%2F11874%2F%2B%2F%2A%2A%2Fstderr

22:50:11: INFO: Waiting for ts_mon flushing process to finish...
cbuildbot: Unhandled exception:
Traceback (most recent call last):
  File "/b/build/slave/pre_cq/build/chromite/bin/cbuildbot", line 168, in <module>
    DoMain()
  File "/b/build/slave/pre_cq/build/chromite/bin/cbuildbot", line 164, in DoMain
    commandline.ScriptWrapperMain(FindTarget)
  File "/b/build/slave/pre_cq/build/chromite/lib/commandline.py", line 834, in ScriptWrapperMain
    ret = target(argv[1:])
  File "/b/build/slave/pre_cq/build/chromite/scripts/cbuildbot.py", line 1310, in main
    logging.info('One stage exited early: %s', ex)
  File "/b/build/slave/pre_cq/build/chromite/scripts/cbuildbot.py", line 1307, in main
    _RunBuildStagesWrapper(options, site_config, build_config)
  File "/b/build/slave/pre_cq/build/chromite/scripts/cbuildbot.py", line 248, in _RunBuildStagesWrapper
    if not builder.Run():
  File "/b/build/slave/pre_cq/build/chromite/cbuildbot/builders/generic_builders.py", line 297, in Run
    sync_instance = self.GetSyncInstance()
  File "/b/build/slave/pre_cq/build/chromite/cbuildbot/builders/test_builders.py", line 80, in GetSyncInstance
    return self._GetStageInstance(sync_stages.PreCQSyncStage)
  File "/b/build/slave/pre_cq/build/chromite/cbuildbot/builders/generic_builders.py", line 74, in _GetStageInstance
    return stage(builder_run, *args, **kwargs)
TypeError: __init__() takes exactly 3 arguments (2 given)
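
The final TypeError above is what Python 2 reports when a constructor receives fewer positional arguments than it requires (Python 3 words the same error differently). A minimal sketch of that failure mode follows; `FakeSyncStage` and `get_stage_instance` are hypothetical stand-ins rather than the real chromite classes, and the name of the missing argument is an assumption.

```python
class FakeSyncStage(object):
    """Hypothetical stage whose constructor needs one extra argument."""

    def __init__(self, builder_run, patches):
        self.builder_run = builder_run
        self.patches = patches


def get_stage_instance(stage, builder_run, *args, **kwargs):
    """Forward any extra arguments to the stage constructor."""
    return stage(builder_run, *args, **kwargs)


try:
    # Only builder_run is supplied, so the required second argument is missing.
    get_stage_instance(FakeSyncStage, "run")
    failed = False
except TypeError as e:
    failed = True
    print("TypeError:", e)

# Supplying the missing argument lets construction succeed.
stage = get_stage_instance(FakeSyncStage, "run", ["patch-1"])
print(stage.patches)
```

Counting `self`, the constructor takes three arguments and the failing call supplies two, which matches the counts in the traceback.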
Owner: nxia@chromium.org

Comment 14 by nxia@chromium.org, Jan 17 2017

Owner: dgarr...@chromium.org

Comment 15 by bugdroid1@chromium.org, Jan 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/663402205c51fa7f7adf1a6263aa90f758301bbf

commit 663402205c51fa7f7adf1a6263aa90f758301bbf
Author: Don Garrett <dgarrett@google.com>
Date: Thu Jan 26 23:25:44 2017

SignerTestsBuilder: Pass in patch_pool patches.

The PreCQSyncStage needs to have a list of patches explicitly passed
to it. We may still need a completion stage as well, not certain.

BUG= chromium:673906 
TEST=bin/cbuildbot -g *321186 --nobootstrap --noreexec \
                   --buildbot --debug --buildroot <dir> signer-pre-cq

Change-Id: I9f14595d4b5126550e34162581e359d9f325282f
Reviewed-on: https://chromium-review.googlesource.com/433843
Reviewed-by: Mike Frysinger <vapier@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/663402205c51fa7f7adf1a6263aa90f758301bbf/cbuildbot/builders/test_builders.py

Comment 16 by bugdroid1@chromium.org, Feb 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/efe6b77d931266c9c621fe7a4efbe9f92425f3c2

commit efe6b77d931266c9c621fe7a4efbe9f92425f3c2
Author: Don Garrett <dgarrett@google.com>
Date: Fri Feb 10 04:48:47 2017

SignerTestBuilder: Add PreCQCompletionStage.

Start calling the PreCQCompletion stage to report our build results.

BUG= chromium:673906 
TEST=run_tests

Change-Id: Idefd2569f8fd527f42bf8aca4cee7f6e99c2e752
Reviewed-on: https://chromium-review.googlesource.com/438764
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/efe6b77d931266c9c621fe7a4efbe9f92425f3c2/cbuildbot/builders/test_builders.py
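
Taken together, the three commits suggest a builder that reports its start, runs its tests, and then reports the outcome. The sketch below illustrates that ordering only. `FakeLauncherDb` and `run_pre_cq_build` are hypothetical, the "passed" and "failed" labels are assumptions, and only "picked_up" comes from this thread.

```python
class FakeLauncherDb(object):
    """Hypothetical stand-in for the status store the launcher reads."""

    def __init__(self):
        self.actions = []

    def record(self, action):
        self.actions.append(action)


def run_pre_cq_build(db, run_tests):
    """Report the start, run the tests, then always report the outcome."""
    db.record("picked_up")  # sync step: tell the launcher the build started
    passed = False
    try:
        passed = bool(run_tests())
    finally:
        # completion step: report a result even if the tests raised
        db.record("passed" if passed else "failed")
    return passed


db = FakeLauncherDb()
run_pre_cq_build(db, lambda: True)
print(db.actions)  # ['picked_up', 'passed']
```

Recording the outcome in a `finally` block is a design choice of this sketch, so that a crash in the tests is still reported as a failure rather than leaving the launcher waiting.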

Status: Fixed (was: Started)
works now.  thanks Don!
Now that we have a working template, I hope to add more specialized PreCQ builders. 

Comment 19 by bugdroid1@chromium.org, Feb 11 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/0ed79dd70a23f2b191c31c094dc13f96bf2bd032

commit 0ed79dd70a23f2b191c31c094dc13f96bf2bd032
Author: Mike Frysinger <vapier@chromium.org>
Date: Sat Feb 11 08:35:22 2017

cbuildbot: run network tests in signer precq

Since the network based tests in the signer are not onerous, run them
as part of the pre-cq.  Since they'll be run in parallel with the rest
of the big tests, it shouldn't slow things down either.

BUG= chromium:673906 
TEST=ran the signer precq config against this

Change-Id: I6b6fbebaca8cc7845e03f0325ea86f2e6519480a
Reviewed-on: https://chromium-review.googlesource.com/439924
Commit-Ready: Mike Frysinger <vapier@chromium.org>
Tested-by: Mike Frysinger <vapier@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/0ed79dd70a23f2b191c31c094dc13f96bf2bd032/cbuildbot/stages/test_stages.py
[modify] https://crrev.com/0ed79dd70a23f2b191c31c094dc13f96bf2bd032/cbuildbot/commands.py
[modify] https://crrev.com/0ed79dd70a23f2b191c31c094dc13f96bf2bd032/cbuildbot/builders/test_builders.py

Comment 20 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 21 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61

Comment 23 by dchan@chromium.org, Oct 14 2017

Status: Archived (was: Fixed)
