
Issue 748145 link

Starred by 10 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug

Blocked on:
issue 739899
issue 761077




Android test flakes seem to have increased since switching to single-device swarming

Project Member Reported by bpastene@chromium.org, Jul 24 2017

Issue description

https://build.chromium.org/p/chromium.linux/builders/Android%20Tests has seen better/greener days.

Sometimes it's device issues:
https://chromium-swarm.appspot.com/task?id=378dd67c66a20510

Sometimes it's filesystem errors: (bug 739899)
https://chromium-swarm.appspot.com/task?id=378c83cc19db0b10

Sometimes it's a missing shard:
https://chromium-swarm.appspot.com/task?id=377ec8dc1ce96810

I'm thinking these have always been issues, but the ability to fall back on additional devices has masked the failures. Now that these tests only have a single device, the failures are becoming more obvious.

I'll start digging.
 
 Issue 747748  has been merged into this issue.
Project Member

Comment 2 by bugdroid1@chromium.org, Jul 24 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/b524f20e908b6a822b7f5082337272ac2de23578

commit b524f20e908b6a822b7f5082337272ac2de23578
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Mon Jul 24 19:32:10 2017

Issue 748670 has been merged into this issue.

Comment 4 by boliu@chromium.org, Jul 26 2017

This seems to have gotten worse in the last day: 4 straight failures on linux_android_rel_ng here: https://chromium-review.googlesource.com/c/583852

testSnapScroll_condensedLayout timed out 3 out of 4 times. And I see the same test time out on other CLs as well, e.g.: https://build.chromium.org/p/tryserver.chromium.android/builders/linux_android_rel_ng/builds/348283 So I'm guessing it's not caused by my CL.

This was fine yesterday Pacific time, so did something change since then? Is there any point in disabling that test, or is this some other general problem?
Project Member

Comment 5 by bugdroid1@chromium.org, Jul 26 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/e82d9cace47b9bfc9a08408cbdd3667e9d0f5954

commit e82d9cace47b9bfc9a08408cbdd3667e9d0f5954
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Wed Jul 26 23:30:39 2017

Nothing really changed in the past 24 hours. We've got a change coming down the pipe (https://chromium-review.googlesource.com/c/585866/) that should help avoid the failures you saw on that CL. If it doesn't improve the situation, I'll start reverting things back to the state they were in a couple of weeks ago.

Though, I can't speak to testSnapScroll_condensedLayout. The type of flake in question here doesn't seem to favor any particular test.
This seems to be really bad now, to the point where it's producing far more noise than signal:

https://build.chromium.org/p/chromium.linux/builders/Android%20Tests%20%28dbg%29?numbuilds=50

Comment 8 by awdf@chromium.org, Jul 27 2017

Cc: awdf@chromium.org
Yikes; that dbg bot does look pretty bad. Though, a good chunk of the failures are the missing tests issue that should be fixed once https://chromium-review.googlesource.com/c/585866/ lands.
Project Member

Comment 10 by bugdroid1@chromium.org, Jul 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/76c3c6cc2b52ca5f9909e30fc7127a9891d9585e

commit 76c3c6cc2b52ca5f9909e30fc7127a9891d9585e
Author: John Budorick <jbudorick@chromium.org>
Date: Fri Jul 28 00:26:30 2017

[android] Stop losing tests in single-device shards.

We're currently losing some tests in single-device shards when a
test times out. run_tests_on_device for that shard fails, no other
shard is available to run the remaining tests within that try, and
the tests without @RetryOnFailure don't get rerun in the subsequent
tries.

This CL attempts to address this issue by:
 1. Only killing the shard on receiving a DeviceUnreachableError or
    an exception not emitted by devil. (The former indicates that
    we probably won't be able to use the device at all in the run,
    while the latter indicates an unexpected condition that should
    likely result in termination.) This should allow the single
    device to continue running tests within a try even if a test
    times out.
 2. Always retrying tests that didn't run.

No-Tree-Checks: true
Bug: 748145
Change-Id: Id120877e78b36443ed4cf8b979f17d52ea7d59ab
Reviewed-on: https://chromium-review.googlesource.com/585866
Commit-Queue: Benjamin Pastene <bpastene@chromium.org>
Reviewed-by: Benjamin Pastene <bpastene@chromium.org>
Cr-Commit-Position: refs/heads/master@{#490159}
[modify] https://crrev.com/76c3c6cc2b52ca5f9909e30fc7127a9891d9585e/build/android/PRESUBMIT.py
[add] https://crrev.com/76c3c6cc2b52ca5f9909e30fc7127a9891d9585e/build/android/pylib/base/mock_environment.py
[add] https://crrev.com/76c3c6cc2b52ca5f9909e30fc7127a9891d9585e/build/android/pylib/base/mock_test_instance.py
[modify] https://crrev.com/76c3c6cc2b52ca5f9909e30fc7127a9891d9585e/build/android/pylib/local/device/local_device_instrumentation_test_run.py
[add] https://crrev.com/76c3c6cc2b52ca5f9909e30fc7127a9891d9585e/build/android/pylib/local/device/local_device_instrumentation_test_run_test.py
[modify] https://crrev.com/76c3c6cc2b52ca5f9909e30fc7127a9891d9585e/build/android/pylib/local/device/local_device_test_run.py
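
The policy described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the CL: the exception classes are hypothetical stand-ins for devil's error types, `run_shard` and `run_with_retries` are made-up names, and the @RetryOnFailure handling for failed tests is left out so that only the "always retry tests that didn't run" part is shown.

```python
class DevilError(Exception):
    """Hypothetical stand-in for the base class of errors emitted by devil."""


class DeviceUnreachableError(DevilError):
    """Stand-in: the device can no longer be reached."""


class CommandTimeoutError(DevilError):
    """Stand-in: a single command (for example one test) timed out."""


def run_shard(tests, run_one):
    """Run tests on a single device; return (results, tests_not_run)."""
    results = {}
    for index, test in enumerate(tests):
        try:
            results[test] = 'PASS' if run_one(test) else 'FAIL'
        except DeviceUnreachableError:
            # The device is probably unusable for the rest of the run:
            # stop the shard and report everything from here on as not run.
            return results, list(tests[index:])
        except DevilError:
            # Other devil errors (such as a timeout) fail only this test;
            # the single device keeps running the remaining tests.
            results[test] = 'FAIL'
        # Any exception not emitted by devil propagates and ends the run.
    return results, []


def run_with_retries(tests, run_one, max_tries=3):
    """Retry tests that did not run, up to max_tries tries in total."""
    final = {}
    pending = list(tests)
    for _ in range(max_tries):
        if not pending:
            break
        results, not_run = run_shard(pending, run_one)
        final.update(results)
        pending = not_run  # tests that did not run are always retried
    for test in pending:
        final[test] = 'NOTRUN'
    return final
```

With this shape, a timeout in one test no longer causes the tests queued behind it to be dropped, which is the "missing tests" symptom discussed earlier in this thread.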

Project Member

Comment 11 by bugdroid1@chromium.org, Jul 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/cc1ba99f6a33d33bafc0e69ddb16d428af000635

commit cc1ba99f6a33d33bafc0e69ddb16d428af000635
Author: John Budorick <jbudorick@chromium.org>
Date: Fri Jul 28 03:43:29 2017

[android] Use crash_handler for all test setup steps.

Bug: 748145
Change-Id: I448afd63bf1e158a14e8e15b370b89b3310d48e0
Reviewed-on: https://chromium-review.googlesource.com/590574
Commit-Queue: John Budorick <jbudorick@chromium.org>
Reviewed-by: Benjamin Pastene <bpastene@chromium.org>
Cr-Commit-Position: refs/heads/master@{#490247}
[modify] https://crrev.com/cc1ba99f6a33d33bafc0e69ddb16d428af000635/build/android/pylib/local/device/local_device_gtest_run.py
[modify] https://crrev.com/cc1ba99f6a33d33bafc0e69ddb16d428af000635/build/android/pylib/local/device/local_device_instrumentation_test_run.py

Cc: dpranke@chromium.org
 Issue 749333  has been merged into this issue.
Project Member

Comment 13 by bugdroid1@chromium.org, Jul 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4275e8e54a6e01c21f5b1fc911246702a9d86b45

commit 4275e8e54a6e01c21f5b1fc911246702a9d86b45
Author: Brian Sheedy <bsheedy@chromium.org>
Date: Fri Jul 28 17:51:26 2017

Revert "[android] Use crash_handler for all test setup steps."

This reverts commit cc1ba99f6a33d33bafc0e69ddb16d428af000635.

Reason for revert: Looks to be causing failures on the Pixel swarming devices https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.fyi%2FAndroid_VR_Tests%2F10155%2F%2B%2Frecipes%2Fsteps%2Fchrome_public_test_vr_apk-marlin-ddview-nougat_on_Android%2F0%2Fstdout

Original change's description:
> [android] Use crash_handler for all test setup steps.
> 
> Bug: 748145
> Change-Id: I448afd63bf1e158a14e8e15b370b89b3310d48e0
> Reviewed-on: https://chromium-review.googlesource.com/590574
> Commit-Queue: John Budorick <jbudorick@chromium.org>
> Reviewed-by: Benjamin Pastene <bpastene@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#490247}

TBR=mikecase@chromium.org,bpastene@chromium.org,jbudorick@chromium.org

Change-Id: I45e5bbef5cf398e2c2f30f5e57661961048155af
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: 748145
Reviewed-on: https://chromium-review.googlesource.com/592008
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#490455}
[modify] https://crrev.com/4275e8e54a6e01c21f5b1fc911246702a9d86b45/build/android/pylib/local/device/local_device_gtest_run.py
[modify] https://crrev.com/4275e8e54a6e01c21f5b1fc911246702a9d86b45/build/android/pylib/local/device/local_device_instrumentation_test_run.py

Project Member

Comment 14 by bugdroid1@chromium.org, Jul 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f87dd0a14beb2491c38afbdcf34dc64c3457bcbb

commit f87dd0a14beb2491c38afbdcf34dc64c3457bcbb
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Sat Jul 29 02:12:37 2017

Raise timeout of content_unittests on android.

Times out occasionally:
https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=user&c=bot&c=duration&et=1501263180000&f=buildername%3AAndroid%20Tests&f=name%3Acontent_unittests&l=50&q=dur&s=created_ts%3Adesc&st=1501176780000

Bug: 748145
Change-Id: Iec48cd07b7824bf3f99806eeb025294f3da3c392
Reviewed-on: https://chromium-review.googlesource.com/591633
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: Benjamin Pastene <bpastene@chromium.org>
Cr-Commit-Position: refs/heads/master@{#490582}
[modify] https://crrev.com/f87dd0a14beb2491c38afbdcf34dc64c3457bcbb/testing/buildbot/chromium.linux.json

Comment 15 by kbr@chromium.org, Jul 29 2017

Cc: kbr@chromium.org
linux_android_rel_ng has become very unreliable. On this patch set:

https://chromium-review.googlesource.com/c/592713/2/

linux_android_rel_ng failed three times in a row:

https://luci-milo.appspot.com/buildbot/tryserver.chromium.android/linux_android_rel_ng/351042
https://luci-milo.appspot.com/buildbot/tryserver.chromium.android/linux_android_rel_ng/351069
https://luci-milo.appspot.com/buildbot/tryserver.chromium.android/linux_android_rel_ng/351084

In the first run:

https://chromium-swarm.appspot.com/task?id=37a4fa23c016cf10&refresh=10&show_raw=1

all devices (i.e., the single device) were blacklisted due to errors.

In the second run:

https://chromium-swarm.appspot.com/task?id=37a52abe699d9c10&refresh=10&show_raw=1

the isolate failed to download:

...
  File "/b/swarming/swarming_bot.1.zip/client/isolateserver.py", line 1237, in write
    size = file_write(path, content)
  File "/b/swarming/swarming_bot.1.zip/client/isolateserver.py", line 135, in file_write
    for d in content_generator:
  File "/b/swarming/swarming_bot.1.zip/client/isolateserver.py", line 870, in run
    self._inspect_chunk(stored, is_last=True)
  File "/b/swarming/swarming_bot.1.zip/client/isolateserver.py", line 884, in _inspect_chunk
    self.expected_size, self.current_size))
IOError: Incorrect file size: expected 811514680, got 277983733
Incorrect file size: expected 811514680, got 277983733

and in the third run:

https://chromium-swarm.appspot.com/task?id=37a5585d10621310&refresh=10&show_raw=1

it looks like it went into an infinite loop while setting up the device.

Thanks for the report, Ken. Things have gotten a bit better after the band-aids above, and the main waterfall bots have gotten a bit greener, but there's clearly still work to do here.

The "Incorrect file size" errors are quite prominent on both Android CQ bots at the moment, and I don't believe they're related to the single-device setup. Given that these errors only show up on golo bots, not on any of our b-lab bots, I believe they're related to the golo-to-googlestorage network issues we've been seeing lately (bug 748261 and b/64220188). Though disruptive, that issue is out of scope here and is being worked on elsewhere.

The first error is a known issue and is being worked on in bug 739899, and the third error is due to a faulty bot that's failing a good number of its tests: https://chromium-swarm.appspot.com/bot?id=build292-m1--device7
I'll reflash or pull that bot out of the pool.
Project Member

Comment 17 by bugdroid1@chromium.org, Aug 1 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/74e536005469d44c79e9de94beb09c039fbb826b

commit 74e536005469d44c79e9de94beb09c039fbb826b
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Aug 01 21:16:14 2017

android: Rename device data dir before removing it in gtests.

Might help with device flakes like:
https://chromium-swarm.appspot.com/task?id=37320a4199090410

Bug: 739899, 748145
Change-Id: I953d60aac73194243b5ae4fd9d875c7833977a1d
Reviewed-on: https://chromium-review.googlesource.com/574872
Commit-Queue: Benjamin Pastene <bpastene@chromium.org>
Reviewed-by: Michael Case <mikecase@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#491113}
[modify] https://crrev.com/74e536005469d44c79e9de94beb09c039fbb826b/build/android/pylib/local/device/local_device_gtest_run.py
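
The commit above does not spell out why renaming first should help. One plausible reading is that the rename frees the original path immediately, so a slow or partially failed delete cannot leave stale files where the next test expects a clean directory. A minimal host-side sketch of that pattern follows, assuming a local filesystem rather than commands run on the device over adb; `rename_then_remove` is a hypothetical name, not a function from the CL.

```python
import os
import shutil
import uuid


def rename_then_remove(path):
    """Move `path` aside under a unique name, then delete the renamed copy.

    Returns the temporary name that was used, or None if `path` did not exist.
    """
    if not os.path.exists(path):
        return None
    doomed = '%s.deleted-%s' % (path.rstrip(os.sep), uuid.uuid4().hex)
    # After the rename the original path no longer exists, whatever
    # happens to the delete that follows.
    os.rename(path, doomed)
    # Best effort: a failed delete leaves only the renamed directory behind.
    shutil.rmtree(doomed, ignore_errors=True)
    return doomed
```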

Issue 752989 has been merged into this issue.
Project Member

Comment 19 by bugdroid1@chromium.org, Aug 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c1cd4bbaea7fa1e511820eb3a0a2435dbf8b4e77

commit c1cd4bbaea7fa1e511820eb3a0a2435dbf8b4e77
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Mon Aug 07 23:26:19 2017

Raise timeout of chrome_sync_shell and content_shell test apks.

TBR=jbudorick@chromium.org

Bug: 748145
Change-Id: Ie0b982143fdd05114b977a2135eacb2f3e739d26
Reviewed-on: https://chromium-review.googlesource.com/604327
Reviewed-by: Benjamin Pastene <bpastene@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: Benjamin Pastene <bpastene@chromium.org>
Cr-Commit-Position: refs/heads/master@{#492441}
[modify] https://crrev.com/c1cd4bbaea7fa1e511820eb3a0a2435dbf8b4e77/testing/buildbot/chromium.linux.json

Cc: bpastene@chromium.org jbudorick@chromium.org
 Issue 753046  has been merged into this issue.
Project Member

Comment 21 by bugdroid1@chromium.org, Aug 17 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/d62f13e07dadd6ad32dc4269ceea1535666f2046

commit d62f13e07dadd6ad32dc4269ceea1535666f2046
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Thu Aug 17 22:14:16 2017

Project Member

Comment 22 by bugdroid1@chromium.org, Aug 17 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4e4ef896247fc391dc426c67a41d6e66e97894de

commit 4e4ef896247fc391dc426c67a41d6e66e97894de
Author: John Budorick <jbudorick@chromium.org>
Date: Thu Aug 17 22:18:59 2017

[android] Use crash_handler for all test setup steps. (RELAND)

Reland of https://chromium-review.googlesource.com/c/590574/

Bug: 748145
Change-Id: I16113bf9824bd9d7d99ca27b6da66604595f12de
Reviewed-on: https://chromium-review.googlesource.com/591992
Commit-Queue: John Budorick <jbudorick@chromium.org>
Reviewed-by: Benjamin Pastene <bpastene@chromium.org>
Cr-Commit-Position: refs/heads/master@{#495348}
[modify] https://crrev.com/4e4ef896247fc391dc426c67a41d6e66e97894de/build/android/pylib/local/device/local_device_gtest_run.py
[modify] https://crrev.com/4e4ef896247fc391dc426c67a41d6e66e97894de/build/android/pylib/local/device/local_device_instrumentation_test_run.py

Project Member

Comment 23 by bugdroid1@chromium.org, Aug 18 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/f9054afb5a29996169b723f6d73a4e94d243cb91

commit f9054afb5a29996169b723f6d73a4e94d243cb91
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Fri Aug 18 00:42:34 2017

Comment 24 by awdf@chromium.org, Aug 18 2017

Cc: -awdf@chromium.org

Comment 25 by kbr@chromium.org, Aug 18 2017

Is the failure mode in:
https://luci-milo.appspot.com/buildbot/tryserver.chromium.android/android_n5x_swarming_rel/245808

where content_browsertests timed out apparently because the device was offline, being worked on?

Project Member

Comment 26 by bugdroid1@chromium.org, Aug 18 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/fec2f4f407f6bc3145db2bcaca4ea694ce6a414e

commit fec2f4f407f6bc3145db2bcaca4ea694ce6a414e
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Fri Aug 18 23:12:19 2017

Re #25: That specific failure isn't being worked on here, since that's one of the few remaining Android tests that I haven't switched over to single-device tasks. And it's not entirely clear whether it is due to device failure, since the log gets cut off. (Though what logs there are do indicate device faults.) Looking through the server logs, it looks like the network flaked and much of the data was dropped when updating the test. Not sure what went wrong there, but it's out of scope here.
Project Member

Comment 28 by bugdroid1@chromium.org, Aug 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2a7c210ac386463e01bed216d3fbb2bc4f531c04

commit 2a7c210ac386463e01bed216d3fbb2bc4f531c04
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Sat Aug 19 06:26:54 2017

Roll src/third_party/catapult/ d2ffc23f1..344aee7e1 (4 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/d2ffc23f1e05..344aee7e135a

$ git log d2ffc23f1..344aee7e1 --date=short --no-merges --format='%ad %ae %s'
2017-08-18 benjhayden Merge iteration_helpers into utils.html.
2017-08-18 benjhayden Remove unused tr.b.addSingletonGetter.
2017-08-18 benjhayden Improve error message in tr.b.Unit.fromJSON
2017-08-18 bpastene devil: Raise DeviceUnreachableError on cmd output "waiting for device"

Created with:
  roll-dep src/third_party/catapult
BUG=748145


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: Ie9132fb59cb5f69367d153bb27be8c7388bf104c
Reviewed-on: https://chromium-review.googlesource.com/622414
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#495805}
[modify] https://crrev.com/2a7c210ac386463e01bed216d3fbb2bc4f531c04/DEPS

Project Member

Comment 29 by bugdroid1@chromium.org, Aug 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/568b1c4cbe1bf9e7f9c7e366ee88c5e6bbbad8d2

commit 568b1c4cbe1bf9e7f9c7e366ee88c5e6bbbad8d2
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Wed Aug 30 18:49:37 2017

android: Run GetTests for instrumentation tests inside a crash handler.

Methinks it woulda helped with
https://chromium-swarm.appspot.com/task?id=3848331f8600a910

R=jbudorick@chromium.org

Bug: 748145
Change-Id: Id5d77a1e27d59c4398e30a686cdb2af583dec093
Reviewed-on: https://chromium-review.googlesource.com/642493
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: Benjamin Pastene <bpastene@chromium.org>
Cr-Commit-Position: refs/heads/master@{#498551}
[modify] https://crrev.com/568b1c4cbe1bf9e7f9c7e366ee88c5e6bbbad8d2/build/android/pylib/local/device/local_device_instrumentation_test_run.py

Project Member

Comment 30 by bugdroid1@chromium.org, Aug 31 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/luci/python-adb/+/9717f0aebef872abb6758d54e58be46274032fc0

commit 9717f0aebef872abb6758d54e58be46274032fc0
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Thu Aug 31 00:48:14 2017

python-adb: Fallback to /proc/sysrq-trgger when rebooting fails.

Bug: 748145
Change-Id: Ic07decade82f7b5350729e69169c9ee2c42b8a57
Reviewed-on: https://chromium-review.googlesource.com/644088
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>

[modify] https://crrev.com/9717f0aebef872abb6758d54e58be46274032fc0/adb/contrib/high.py
[modify] https://crrev.com/9717f0aebef872abb6758d54e58be46274032fc0/adb/contrib/adb_commands_safe.py
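
For readers unfamiliar with the fallback named in the commit title above: on Linux, writing `b` to /proc/sysrq-trigger asks the kernel to reboot immediately, without syncing or unmounting filesystems. A rough sketch of the fallback order is below. It assumes a hypothetical `shell(cmd)` callable that runs a command on the device and returns its exit code; it does not use python-adb's actual API, and `reboot_with_fallback` is a made-up name.

```python
def reboot_with_fallback(shell):
    """Try a normal reboot first, then fall back to the SysRq trigger.

    `shell` is a hypothetical callable: shell(cmd) -> exit code.
    Returns 'reboot', 'sysrq', or None if both attempts failed.
    """
    if shell('reboot') == 0:
        return 'reboot'
    # 'b' requests an immediate reboot from the kernel's SysRq handler.
    # On a real device the shell may never report an exit code here,
    # because the device goes down before the command returns.
    if shell('echo b > /proc/sysrq-trigger') == 0:
        return 'sysrq'
    return None
```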

Blockedon: 761077
Project Member

Comment 32 by bugdroid1@chromium.org, Aug 31 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/bd372583ba86027d36c9ae8f9da94c8979c586f7

commit bd372583ba86027d36c9ae8f9da94c8979c586f7
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Thu Aug 31 19:55:27 2017

#33/34: I don't think that's related to this but will look into it. My suspicion is that there's an issue w/ the recipe, not w/ the single device.
yeah, this definitely isn't related to single-device:

Traceback (most recent call last):
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/swarming.py", line 637, in yield_results
    timeout=STATUS_UPDATE_INTERVAL)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/utils/threading_utils.py", line 791, in wrapped
    self.send_result(task(*args, **kwargs))
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/swarming.py", line 621, in <lambda>
    task_fn = lambda *args: (shard_index, retrieve_results(*args))
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/swarming.py", line 513, in retrieve_results
    result = net.url_read_json(result_url, retry_50x=False)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/utils/net.py", line 177, in url_read_json
    return service.json_request(urlpath, **kwargs)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/utils/net.py", line 545, in json_request
    urlpath, content_type=content_type, data=data, stream=False, **kwargs)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/utils/net.py", line 464, in request
    response = self.engine.perform_request(request)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/utils/net.py", line 759, in perform_request
    allow_redirects=request.follow_redirects)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/adapters.py", line 376, in send
    timeout=timeout
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
    body=body, headers=headers)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/packages/urllib3/connectionpool.py", line 345, in _make_request
    self._validate_conn(conn)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/packages/urllib3/connectionpool.py", line 784, in _validate_conn
    conn.connect()
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/packages/urllib3/connection.py", line 259, in connect
    cert = self.sock.getpeercert()
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/packages/urllib3/contrib/pyopenssl.py", line 251, in getpeercert
    for value in get_subj_alt_name(x509)
  File "/b/c/b/android_n5x_swarming_rel/src/tools/swarming_client/third_party/requests/packages/urllib3/contrib/pyopenssl.py", line 144, in get_subj_alt_name
    asn1Spec=general_names)
  File "/b/build/third_party/pyasn1/pyasn1/codec/ber/decoder.py", line 757, in __call__
    stGetValueDecoder, self, substrateFun
  File "/b/build/third_party/pyasn1/pyasn1/codec/ber/decoder.py", line 356, in valueDecoder
    component, head = decodeFun(head, asn1Spec)
  File "/b/build/third_party/pyasn1/pyasn1/codec/ber/decoder.py", line 757, in __call__
    stGetValueDecoder, self, substrateFun
  File "/b/build/third_party/pyasn1/pyasn1/codec/ber/decoder.py", line 417, in valueDecoder
    r.setComponentByType(effectiveTagSet, component, 0, asn1Spec is None)
  File "/b/build/third_party/pyasn1/pyasn1/type/univ.py", line 874, in setComponentByType
    idx = self._componentType.getPositionByType(tagSet)
  File "/b/build/third_party/pyasn1/pyasn1/type/namedtype.py", line 68, in getPositionByType
    raise error.PyAsn1Error('Type %s not found' % (tagSet,))
PyAsn1Error: Type TagSet(Tag(tagClass=128, tagFormat=0, tagId=2)) not found

(The recipe is hitting an issue because of that.)
Yeah, that's bug 761414 (fixed)

Comment 38 Deleted

I don't know if it is the same issue still, but linux_android_rel_ng has caused multiple CQ failures for unrelated changes over the past few weeks or more. Looking at the history it seems to fail at least 10% of its runs, and it seems unlikely that there are anywhere close to that many actual failures.


Re #39: It might be related. There are still some small issues I'm working through. Though 10% seems pretty high. More than likely that's test flake, not infra flake. In that case it would be prudent to enable @RetryOnFailure for any test that fails, if it's not enabled already.

And there's recently been a good chunk of failures due to a change of mine: https://chromium-review.googlesource.com/c/catapult/+/754236. The revert is in the CQ.
Issue 782624 has been merged into this issue.
Components: Infra>Client>Chrome
Moving Infra>Client>Android -> Infra>Client>Chrome+OS=Android
Components: -Infra>Client>Android
Project Member

Comment 44 by bugdroid1@chromium.org, Dec 4 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/c89c13c04cc91dd154ba1d59f71425e8ef41b6df

commit c89c13c04cc91dd154ba1d59f71425e8ef41b6df
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Mon Dec 04 18:25:31 2017

Project Member

Comment 45 by bugdroid1@chromium.org, Dec 13 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/d509fad0504eebe6eee586bb226808a8e0ee3a92

commit d509fad0504eebe6eee586bb226808a8e0ee3a92
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Wed Dec 13 20:12:32 2017

Comment 46 by kbr@chromium.org, Dec 21 2017

Status: Started (was: Assigned)
Situation where a bot had no devices attached by the time the test started to run:
https://ci.chromium.org/buildbot/tryserver.chromium.android/linux_android_rel_ng/456935

https://chromium-swarm.appspot.com/task?id=3a8e97f8d6200310&refresh=10&show_raw=1

Traceback (most recent call last):
...
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 2692, in HealthyDevices
    return _get_devices()
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 2679, in _get_devices
    raise device_errors.NoDevicesError()
NoDevicesError: No devices attached.


I think the shard should be auto-retried in this situation, assuming it can be detected reliably.

Let me take the liberty of marking this Started.
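
The auto-retry suggested above might look something like the following. This is only a sketch under stated assumptions: `NoDevicesError` here is a local stand-in for devil's `device_errors.NoDevicesError`, `run_shard_with_retry` is a hypothetical wrapper, and in practice the retry would probably need to be scheduled on a different bot, which this sketch does not model.

```python
class NoDevicesError(Exception):
    """Local stand-in for devil's device_errors.NoDevicesError."""


def run_shard_with_retry(run_shard, max_attempts=2):
    """Re-run a shard when it fails because no device was attached.

    `run_shard` is a hypothetical zero-argument callable that runs the
    shard and returns its result. Other exceptions are not retried.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_shard()
        except NoDevicesError:
            if attempt == max_attempts:
                # Out of attempts: surface the infra failure to the caller.
                raise
```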
