Issue 671377

Starred by 2 users

Issue metadata

Status: Duplicate
Merged: issue 665492
Owner:
Closed: Dec 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug

Blocking:
issue 596622




Swarming job on Android completed successfully but collection of results timed out

Project Member Reported by kbr@chromium.org, Dec 5 2016

Issue description

In this tryjob:
https://build.chromium.org/p/tryserver.chromium.android/builders/android_optional_gpu_tests_rel/builds/1243

which ran this Swarming job on the Nexus 5X pool:
https://chromium-swarm.appspot.com/task?id=32e963e03b9b7510&refresh=10&show_raw=1

the job completed successfully, but per the logs from the recipe, the collection step timed out.

There are a couple of instances of this on https://build.chromium.org/p/tryserver.chromium.android/builders/android_optional_gpu_tests_rel?numbuilds=200 and these are essentially the last source of flakiness seen today, per Issue 596622.

Are there enough logs for someone to see what happened to the collect step?


----------

python -u /b/c/b/android/src/tools/swarming_client/swarming.py collect --swarming https://chromium-swarm.appspot.com --decorate --print-status-updates --json /tmp/tmp5HILAV.json --task-output-dir /tmp/tmpGdGNrW
in dir /b/c/b/android:
@@@STEP_LINK@stdout-->stdio@https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.android%2Fandroid_optional_gpu_tests_rel%2F1243%2F%2B%2Frecipes%2Fsteps%2Fwebgl_conformance_tests__with_patch__on_Android%2F0%2Fstdout@@@
 allow_subannotations: False
 base_name: webgl_conformance_tests (with patch) on Android
 cmd: ['python', '-u', '/b/c/b/android/src/tools/swarming_client/swarming.py', 'collect', '--swarming', 'https://chromium-swarm.appspot.com', '--decorate', '--print-status-updates', '--json', '/tmp/tmp5HILAV.json', '--task-output-dir', '/tmp/tmpGdGNrW']
 cwd: /b/c/b/android
 env: {'GOMA_SERVICE_ACCOUNT_JSON_FILE': '/creds/service_accounts/service-account-goma-client.json', 'PATH': '/b/c/b/android/src/third_party/android_tools/sdk/platform-tools:/b/c/b/android/src/build/android:%(PATH)s'}
 infra_step: False
 name: webgl_conformance_tests (with patch) on Android
 nest_level: 0
 ok_ret: frozenset([0])
 step_test_data: <lambda>(...)
 trigger_specs: []
full environment:
 AWS_CREDENTIAL_FILE: /b/build/site_config/.boto
 BOTO_CONFIG: /b/build/site_config/.boto
 BUILDBOT_BLAMELIST: [u'geofflang@chromium.org']
 BUILDBOT_BRANCH: 
 BUILDBOT_BUILDBOTURL: https://build.chromium.org/p/tryserver.chromium.android/
 BUILDBOT_BUILDERNAME: android_optional_gpu_tests_rel
 BUILDBOT_BUILDNUMBER: 1243
 BUILDBOT_CLOBBER: 
 BUILDBOT_GOT_REVISION: None
 BUILDBOT_MASTERNAME: tryserver.chromium.android
 BUILDBOT_REVISION: 
 BUILDBOT_SCHEDULER: None
 BUILDBOT_SLAVENAME: slave1000-c4
 CHROME_HEADLESS: 1
 DISPLAY: :0.0
 GIT_USER_AGENT: linux2 git/2.11.0 slave1000-c4.c.chromecompute.google.com.internal
 GOMA_SERVICE_ACCOUNT_JSON_FILE: /creds/service_accounts/service-account-goma-client.json
 HOME: /home/chrome-bot
 LANG: en_US.UTF-8
 LOGDOG_STREAM_PREFIX: bb/tryserver.chromium.android/android_optional_gpu_tests_rel/1243
 LOGDOG_STREAM_PROJECT: chromium
 LOGDOG_STREAM_SERVER_PATH: unix:/b/build/rr/tmpurkoVW/butler.sock
 PAGER: cat
 PATH: /b/c/b/android/src/third_party/android_tools/sdk/platform-tools:/b/c/b/android/src/build/android:/home/chrome-bot/slavebin:/b/depot_tools:/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
 PWD: /b/build/slave/android/build
 PYTHONPATH: /b/rr/tmpeA5m8p/rw/checkout/scripts:/b/rr/tmpeA5m8p/rw/checkout/site_config:/b/rr/tmpeA5m8p/rw/checkout/third_party:/b/rr/tmpeA5m8p/rw/checkout/third_party/buildbot_8_4p1:/b/rr/tmpeA5m8p/rw/checkout/third_party/buildbot_slave_8_4:/b/rr/tmpeA5m8p/rw/checkout/third_party/coverage-3.7.1:/b/rr/tmpeA5m8p/rw/checkout/third_party/decorator_3_3_1:/b/rr/tmpeA5m8p/rw/checkout/third_party/google_api_python_client:/b/rr/tmpeA5m8p/rw/checkout/third_party/httplib2/python2:/b/rr/tmpeA5m8p/rw/checkout/third_party/infra_libs:/b/rr/tmpeA5m8p/rw/checkout/third_party/jinja2:/b/rr/tmpeA5m8p/rw/checkout/third_party/markupsafe:/b/rr/tmpeA5m8p/rw/checkout/third_party/mock-1.0.1:/b/rr/tmpeA5m8p/rw/checkout/third_party/oauth2client:/b/rr/tmpeA5m8p/rw/checkout/third_party/pyasn1:/b/rr/tmpeA5m8p/rw/checkout/third_party/pyasn1-modules:/b/rr/tmpeA5m8p/rw/checkout/third_party/python-rsa:/b/rr/tmpeA5m8p/rw/checkout/third_party/requests_2_10_0:/b/rr/tmpeA5m8p/rw/checkout/third_party/setuptools-0.6c11:/b/rr/tmpeA5m8p/rw/checkout/third_party/sqlalchemy_0_7_1:/b/rr/tmpeA5m8p/rw/checkout/third_party/sqlalchemy_migrate_0_7_1:/b/rr/tmpeA5m8p/rw/checkout/third_party/tempita_0_5:/b/rr/tmpeA5m8p/rw/checkout/third_party/twisted_10_2:/b/rr/tmpeA5m8p/rw/checkout/third_party/uritemplate:/b/rr/tmpeA5m8p/rw/checkout/third_party/site-packages:/b/rr/tmpeA5m8p/rw/checkout/scripts/slave/recipe_modules/test_results/resources:/b/rr/tmpeA5m8p/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/third_party:/b/rr/tmpeA5m8p/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/third_party/requests:/b/rr/tmpeA5m8p/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/third_party/six:/b/rr/tmpeA5m8p/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/third_party/client-py:/b/rr/tmpeA5m8p/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/third_party/mock-1.0.1:/b/rr/tmpeA5m8p/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine
/third_party/astunparse:/b/rr/tmpeA5m8p/rw/checkout/scripts/slave/.recipe_deps/recipe_engine:/b/build/site_config:/b/build/scripts:/b/build/scripts/release:/b/build/third_party:/b/build/third_party/requests_2_10_0:/b/build_internal/site_config:/b/build_internal/symsrc:/b/build/slave:/b/build/third_party/buildbot_slave_8_4:/b/build/third_party/twisted_10_2:/b/build/slave/android/build:/usr/lib/python2.7:/usr/lib/python2.7/plat-x86_64-linux-gnu:/usr/lib/python2.7/lib-tk:/usr/lib/python2.7/lib-old:/usr/lib/python2.7/lib-dynload
 PYTHONUNBUFFERED: 1
 TESTING_SLAVENAME: slave1000-c4
 USER: chrome-bot
 USERNAME: chrome-bot

Waiting for results from the following shards: 0

command timed out: 6900 seconds elapsed, attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=6900.007838
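
For readers unfamiliar with the last four log lines: they are what an overall wall-clock limit on a child process looks like from the outside. The sketch below is not buildbot's implementation; `run_with_deadline` is a hypothetical helper written only to illustrate the pattern of waiting up to a deadline and then sending signal 9.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout_s):
    """Run cmd; if it is still running after timeout_s seconds, SIGKILL it.

    Hypothetical helper for illustration, not taken from buildbot or the recipes.
    Returns the child's return code (a negative signal number on POSIX if it
    was killed by a signal).
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()   # sends SIGKILL (signal 9) on POSIX
        proc.wait()   # reap the child so returncode is populated
        return proc.returncode


if __name__ == "__main__":
    # A child that would sleep far longer than the deadline allows.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_deadline(slow, timeout_s=0.5))
```

The child gets no chance to clean up or report partial results when killed this way, which is consistent with the collect step producing nothing after the "Waiting for results" line.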

 

Comment 2 by kbr@chromium.org, Dec 5 2016

Blockedon: 670866
Cc: ynovikov@chromium.org
Components: Infra>Platform>Swarming
M-A indicated on Issue 670866 that there might be a race condition in the Swarming server. Could this be another symptom of the same issue?

No, compile alone took 1h17m. The collect step waited for 26m, but the task took 34m. So the global timeout is the problem here; Swarming behaved correctly.
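
The timings quoted above can be checked against the 6900-second limit from the log. This is only a back-of-the-envelope sketch: it assumes the 6900 seconds covers the whole build, and that the collect wait and the task's 34 minutes started at roughly the same moment. The variable names are made up for this example.

```python
# Durations quoted in this thread, in minutes.
GLOBAL_TIMEOUT_MIN = 6900 / 60  # "command timed out: 6900 seconds elapsed"
COMPILE_MIN = 77                # "compile alone took 1h17m"
COLLECT_WAIT_MIN = 26           # how long collect waited before being killed
TASK_RUNTIME_MIN = 34           # how long the Swarming task took

# Time left for every other step in the build (derived, not read from the logs).
other_steps_min = GLOBAL_TIMEOUT_MIN - COMPILE_MIN - COLLECT_WAIT_MIN

# Extra time collect would have needed to see the task finish (an assumption:
# the wait and the task are taken to have started together).
shortfall_min = TASK_RUNTIME_MIN - COLLECT_WAIT_MIN

print(GLOBAL_TIMEOUT_MIN)  # 115.0
print(other_steps_min)     # 12.0
print(shortfall_min)       # 8
```

Under those assumptions the build was short by roughly 8 minutes, so the limit being hit is the build-level one rather than anything on the Swarming side.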

Comment 4 by kbr@chromium.org, Dec 5 2016

Blockedon: -670866
Owner: kbr@chromium.org
Status: Started (was: Untriaged)
Ohhhhhh. Thanks, I see. I'll increase that timeout.

I'm taking care of that in issue 665492.

Comment 6 by kbr@chromium.org, Dec 5 2016

Can I take that? I'm preparing a CL now.

That bot is in the same pool as linux_android_rel_ng and android_n5x_swarming_rel (and only those bots). Is there a reason for that?

If so, we should make sure we have capacity for longer-running builds from this bot; if not, we should move it into ccompute_optional_bots.

Comment 8 by kbr@chromium.org, Dec 5 2016

Mergedinto: 665492
Status: Duplicate (was: Started)
We mimicked the configuration of android_n5x_swarming_rel when setting up this tryserver. It runs very few jobs so I don't think it will be hogging capacity. Let me know if it seems to be a problem.

Going to duplicate this into Yuly's Issue 665492.

Ah. If that's the case, it's probably not a problem in practice, but it should still probably be using ccompute_optional_bots rather than the CQ pool.

Comment 10 by kbr@chromium.org, Dec 5 2016

I'd like ccompute_optional_bots to be expanded out if we're going to put android_optional_gpu_tests_rel on it. It is used by the ANGLE team and enough Chromium developers that we don't want jobs waiting for a builder to pick them up (and right now there are only 4 machines in the ccompute_optional_bots pool).

That seems reasonable enough.
