Starred by 6 users
Status: Started
Pri: 1
Type: Bug



swarming.py hangs with no output until killed by builder.
Project Member Reported by x...@chromium.org, Dec 8
It started showing up this afternoon (12/08).
See an example build:
https://uberchromegw.corp.google.com/i/chromeos.chrome/builders/tricky-tot-chrome-pfq-informational/builds/7230

Error message:
14:46:31: INFO: 
14:46:31: ERROR: BaseException in _RunParallelStages <class 'chromite.lib.parallel.ProcessSilentTimeout'>: No output from <_BackgroundTask(_BackgroundTask-7:7:2, started)> for 8640 seconds
Traceback (most recent call last):
  File "/b/c/cbuild/repository/chromite/cbuildbot/builders/generic_builders.py", line 120, in _RunParallelStages
    parallel.RunParallelSteps(steps)
  File "/b/c/cbuild/repository/chromite/lib/parallel.py", line 679, in RunParallelSteps
    return [queue.get_nowait() for queue in queues]
  File "/b/c/cbuild/repository/chromite/lib/parallel.py", line 676, in RunParallelSteps
    pass
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/b/c/cbuild/repository/chromite/lib/parallel.py", line 562, in ParallelTasks
    raise BackgroundFailure(exc_infos=errors)
BackgroundFailure: <class 'chromite.lib.parallel.ProcessSilentTimeout'>: No output from <_BackgroundTask(_BackgroundTask-7:7:2, started)> for 8640 seconds

 
Seems the error still shows up on the latest lumpy-chrome-pfq: https://luci-milo.appspot.com/buildbot/chromeos/lumpy-chrome-pfq/11085
Cc: deanliao@chromium.org dtor@chromium.org ecgh@chromium.org
Owner: dgarr...@chromium.org
Adding this week's folks. dgarrett: is this an infra issue? I'm not sure how we even know this is a "provision error"; the output is pretty opaque to me.
Cc: davidri...@chromium.org
Summary: swarming.py hangs with no output until killed by builder. (was: Hwtest provision error on several chrome PFQs and informational PFQs)
Oh... no, that's not a provision error.

There is a long-standing problem with swarming.py incorrectly appearing to hang.


Owner: xixuan@chromium.org
I thought xixuan@ had fixed this previously.
Cc: nxia@chromium.org
Is this the same as issue 772985, then?
Status: Started
Re #6, yes. I will prepare a fix later today.
Issue 794125 has been merged into this issue.
Cc: dgarr...@chromium.org
Issue 794859 has been merged into this issue.
More examples of this. It seems that we are hitting this a LOT now, and it's blocking a variety of things.
Project Member Comment 11 by bugdroid1@chromium.org, Dec 15
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/cd0856953252117831f4c4441093924d390ae14d

commit cd0856953252117831f4c4441093924d390ae14d
Author: Xixuan Wu <xixuan@chromium.org>
Date: Fri Dec 15 02:11:53 2017

cbuildbot: retry swarming commands if it's timed out.

BUG=chromium:793499
TEST=Run tryjob:
https://luci-milo.appspot.com/buildbot/chromiumos.tryserver/paladin/4811#

Change-Id: I70861c0b7f22312295ea3a71cd8b55b332168519
Reviewed-on: https://chromium-review.googlesource.com/823617
Commit-Ready: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/cd0856953252117831f4c4441093924d390ae14d/cbuildbot/swarming_lib.py
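
For readers unfamiliar with the approach named in the commit subject, here is a minimal sketch of retrying a command when it times out. It is not the code from swarming_lib.py, which this thread does not show. The helper name `run_with_retry`, its parameters, and the use of Python 3's `subprocess.run` are all assumptions made for illustration.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, timeout_sec, max_attempts=3, sleep_sec=0.0):
    """Run cmd, retrying only when an attempt exceeds timeout_sec.

    Hypothetical helper for illustration; not the chromite implementation.
    Returns the CompletedProcess of the first attempt that finishes in time.
    Re-raises subprocess.TimeoutExpired if every attempt times out.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child process and raises
            # TimeoutExpired once timeout_sec has elapsed.
            return subprocess.run(
                cmd, timeout=timeout_sec, check=True,
                capture_output=True, text=True)
        except subprocess.TimeoutExpired as e:
            last_error = e
            print('attempt %d/%d timed out after %s seconds'
                  % (attempt, max_attempts, timeout_sec), file=sys.stderr)
            if attempt < max_attempts:
                time.sleep(sleep_sec)
    raise last_error
```

A non-zero exit status still raises `subprocess.CalledProcessError` immediately, because only timeouts are retried in this sketch. Choosing a per-attempt timeout well below the builder's silent-timeout limit (8640 seconds in the log above) would let a hung command be retried before the whole stage is killed.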

Hopefully, this is now fixed?
I hope so... I will wait a few days to see whether any more reports come in.