
Issue 761299

Starred by 2 users

Issue metadata

Status: Duplicate
Owner:
Closed: Sep 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression




Uncaught Exception on Android tester bots due to swarming errors

Project Member Reported by perezju@chromium.org, Sep 1 2017

Issue description

Uncaught Exception failing on 6 builders

Builders failed on: 
- KitKat Tablet Tester: 
  https://luci-milo.appspot.com/buildbot/chromium.android/KitKat%20Tablet%20Tester
- Lollipop Phone Tester: 
  https://luci-milo.appspot.com/buildbot/chromium.android/Lollipop%20Phone%20Tester
- Lollipop Tablet Tester: 
  https://luci-milo.appspot.com/buildbot/chromium.android/Lollipop%20Tablet%20Tester
- Marshmallow 64 bit Tester: 
  https://luci-milo.appspot.com/buildbot/chromium.android/Marshmallow%2064%20bit%20Tester
- Marshmallow Tablet Tester: 
  https://luci-milo.appspot.com/buildbot/chromium.android/Marshmallow%20Tablet%20Tester

- Android Tests (dbg)
  https://luci-milo.appspot.com/buildbot/chromium.linux/Android%20Tests%20%28dbg%29


The symptom is that some shards report they "had an internal swarming failure"; see e.g. "content_browsertests" on:
https://luci-milo.appspot.com/buildbot/chromium.android/Lollipop%20Phone%20Tester/14742

Although the swarming task itself looks fine, e.g.:
https://chromium-swarm.appspot.com/task?id=3854a0e82898a510&refresh=10&show_raw=1

Output ends with:
C  638.048s Main  ********************************************************************************
C  638.048s Main  Summary
C  638.048s Main  ********************************************************************************
C  638.049s Main  [==========] 320 tests ran.
C  638.049s Main  [  PASSED  ] 320 tests.
C  638.049s Main  ********************************************************************************

The traceback on the Uncaught Exception step reads:

Traceback (most recent call last):
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/run.py", line 328, in _new_run
    recipe_result = recipe_script.run(api, properties)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/loader.py", line 98, in run
    self.run_steps, properties, self.PROPERTIES, api=api)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/loader.py", line 627, in invoke_with_properties
    **additional_args)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/loader.py", line 588, in _invoke_with_properties
    return callable_obj(*props, **additional_args)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipes/chromium.py", line 51, in RunSteps
    api.chromium_tests.main_waterfall_steps()
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/recipe_api.py", line 745, in _inner
    return func(*a, **kw)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/chromium_tests/api.py", line 880, in main_waterfall_steps
    test_runner()
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/chromium_tests/api.py", line 316, in test_runner
    t.post_run(self._api_for_tests, suffix)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/chromium_tests/steps.py", line 1853, in post_run
    return self._test.post_run(api, suffix)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/chromium_tests/steps.py", line 1380, in post_run
    super(SwarmingGTestTest, self).post_run(api, suffix)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/chromium_tests/steps.py", line 1258, in post_run
    api.swarming.collect_task(self._tasks[suffix])
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/recipe_api.py", line 745, in _inner
    return func(*a, **kw)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/swarming/api.py", line 708, in collect_task
    return task.collect_step(task, **kwargs)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/swarming/api.py", line 507, in <lambda>
    self._gtest_collect_step(test_launcher_summary_output, *args, **kw))
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/recipe_api.py", line 745, in _inner
    return func(*a, **kw)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/swarming/api.py", line 911, in _gtest_collect_step
    step_result.presentation)
  File "/b/rr/tmp6_t016/rw/checkout/scripts/slave/recipe_modules/swarming/api.py", line 745, in _display_pending
    if not shard.get('started_ts'):
AttributeError: 'NoneType' object has no attribute 'get'

+martiniss in case you have seen something similar in the past
 
Labels: Pri-1 Type-Bug-Regression
Labels: -Restrict-View-Google
-RVG not sure why it was tagged like that by SoM.
Owner: perezju@chromium.org
Status: Assigned (was: Available)
I can see that the bug got introduced by a recipe side change:

https://chromium-review.googlesource.com/c/chromium/tools/build/+/642491/7/scripts/slave/recipe_modules/swarming/api.py#745

At line 745, `_display_pending` is missing the check that was there before to verify that `shard` actually exists.

I'll try to whip up a quick fix.
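
For illustration only, here is a minimal standalone sketch of the kind of guard that is missing. It is not the code from the CL: the function name `pending_shards` and the shape of the `summary` dict are assumptions, and treating a missing shard as pending (rather than skipping it) is a guess.

```python
def pending_shards(summary):
    """Return indices of shards that are missing or have not started.

    `summary` is a hypothetical dict with a 'shards' list whose entries
    are either dicts or None; this is not the real recipe data model.
    """
    pending = []
    for index, shard in enumerate(summary.get('shards') or []):
        # An entry can be None when no result came back for that shard.
        # Check for that before calling .get(), which is what raised the
        # AttributeError in the traceback above.
        if shard is None or not shard.get('started_ts'):
            pending.append(index)
    return pending


# One started shard, one null shard, one shard without 'started_ts'.
example = {'shards': [{'started_ts': '2017-09-01T12:00:00'}, None, {}]}
print(pending_shards(example))
```

With the guard in place a null shard is reported alongside the unstarted ones instead of crashing the collect step.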
Cc: jbudorick@chromium.org
Fix is out for review:
https://chromium-review.googlesource.com/c/chromium/tools/build/+/645983

but I don't have review powers on this repo to TBR :(
Cc: iannucci@chromium.org

Comment 6 by oprypin@webrtc.org, Sep 1 2017

WebRTC's Linux try bots are affected
https://build.chromium.org/p/tryserver.webrtc/builders/
Project Member

Comment 7 by bugdroid1@chromium.org, Sep 1 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/9fa031a6b9c4adc86f632ae5af668a5509bcae7d

commit 9fa031a6b9c4adc86f632ae5af668a5509bcae7d
Author: Juan A. Navarro Perez <perezju@chromium.org>
Date: Fri Sep 01 13:22:22 2017

Fix bug in swarming/api.py

A previous CL [1] missed a test to check whether the `shard` object
actually exists.

[1]: https://chromium-review.googlesource.com/c/chromium/tools/build/+/642491/7/scripts/slave/recipe_modules/swarming/api.py#745

Bug:761299
Change-Id: Idfdc499a5cb75750a52c4daf8476add0a440d1dd
Reviewed-on: https://chromium-review.googlesource.com/645983
Commit-Queue: Juan Antonio Navarro Pérez <perezju@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>

[modify] https://crrev.com/9fa031a6b9c4adc86f632ae5af668a5509bcae7d/scripts/slave/recipe_modules/swarming/examples/full.expected/isolated_script_with_null_shard.json
[modify] https://crrev.com/9fa031a6b9c4adc86f632ae5af668a5509bcae7d/scripts/slave/recipe_modules/swarming/api.py

Labels: -Pri-1 Pri-2
Lowering Pri as fix landed. Will keep an eye on these ...
Mergedinto: 761414
Status: Duplicate (was: Assigned)
Recipe fix landed, but the recipe error was only a symptom. Duping into the cause.
