
Issue 844230

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug

Blocked on:
issue 879378
issue 845147
issue 869557

Blocking:
issue 918676




adb connection flaky during GPU tests on Android FYI Release (NVIDIA Shield TV)

Project Member Reported by jdarpinian@chromium.org, May 17 2018

Issue description

Here are two builds that failed with the same stack:
https://chromium-swarm.appspot.com/task?id=3d88f9248162c510&refresh=10&show_raw=1
https://chromium-swarm.appspot.com/task?id=3d74d62d10c6e110&refresh=10&show_raw=1

INFO:root:Try printing formatted exception: <type 'exceptions.AttributeError'> 'NoneType' object has no attribute 'platform' <traceback object at 0x7f5484c0a0e0>

Traceback (most recent call last):
  <module> at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/testing/run_browser_tests.py:359
    ret_code = RunTests(sys.argv[1:])
  RunTests at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/testing/run_browser_tests.py:328
    ret, _, _ = runner.run()
  run at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/runner.py:179
    ret, full_results = self._run_tests(result_set, test_set)
  _run_tests at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/runner.py:465
    self._run_one_set(self.stats, result_set, test_set)
  _run_one_set at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/runner.py:510
    test_set.isolated_tests, 1)
  _run_list at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/runner.py:536
    _setup_process, _teardown_process)
  make_pool at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/pool.py:28
    return _AsyncPool(host, jobs, callback, context, pre_fn, post_fn)
  __init__ at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/pool.py:188
    self.context_after_pre = pre_fn(self.host, 1, self.context)
  _setup_process at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/runner.py:806
    child.context_after_setup = child.setup_fn(child, child.context)
  _SetUpProcess at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/testing/run_browser_tests.py:349
    context.test_class.SetUpProcess()
  SetUpProcess at /b/swarming/w/ir/content/test/gpu/gpu_tests/pixel_integration_test.py:69
    cls.CustomizeBrowserArgs(cls._AddDefaultArgs([]))
  CustomizeBrowserArgs at /b/swarming/w/ir/content/test/gpu/gpu_tests/gpu_integration_test.py:54
    cls.SetBrowserOptions(cls._finder_options)
  SetBrowserOptions at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/testing/serially_executed_browser_test_case.py:71
    cls.platform = cls._browser_to_create.platform
AttributeError: 'NoneType' object has no attribute 'platform'

Locals:
  browser_options : [('android_blacklist_file', None), ('assert_gpu_compositing', None), ('browser_executable', None), ('browser_options', [('_browser_startup_timeout', 60), ('_extra_browser_args', set(['--force-color-profile=srgb', '--test-type=gpu', '--enable-gpu-benchmarking', '--js-flags=--expose-gc', '--enable-logging=stderr', '--ensure-forced-color-profile'])), ('assert_gpu_compositing', False), ('browser_type', 'android-chromium'), ('browser_user_agent_type', None), ('clear_sytem_cache_for_browser_and_profile_on_star ... r', '/b/swarming/w/ir/content/test/data/gpu/gpu_reference'), ('refimg_cloud_storage_bucket', 'chromium-gpu-archive/reference-images'), ('remote_platform_options', <telemetry.internal.platform.remote_platform_options.AndroidPlatformOptions object at 0x7f5484bf6390>), ('simpleperf_frequency', 1000), ('simpleperf_periods', []), ('simpleperf_target', ''), ('test_machine_name', 'Android FYI Release (NVIDIA Shield TV)'), ('upload_refimg_to_cloud_storage', True), ('verbosity', 1), ('webview_embedder_apk', None)] (truncated)
  cls             : <class 'gpu_tests.pixel_integration_test.PixelIntegrationTest'>
 
Cc: kbr@chromium.org

Comment 2 by mar...@chromium.org, May 17 2018

Components: -Infra>Platform>Swarming Tests>Telemetry
Swarming runs the tests, but Swarming itself is working; it's the test that is broken.

Comment 4 by kbr@chromium.org, May 17 2018

Cc: perezju@chromium.org bpastene@chromium.org jbudorick@chromium.org nednguyen@chromium.org ynovikov@chromium.org
Components: Internals>GPU>Testing Infra>Client>Android
Labels: -Pri-3 OS-Android Pri-2
Summary: GPU tests flaky on Android FYI Release (NVIDIA Shield TV) (was: pixel_test flaky on Android FYI Release (NVIDIA Shield TV))
The problem's happening in the Telemetry harness. For some reason it's occasionally failing to find a browser to run. It's not just the pixel tests which are affected; it looks like all of the flakes on this bot are caused by this issue. Here's a failure of maps_pixel_test which is the same problem:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/2766

and depth_capture_tests:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/2669

In the meantime, here's a shard of depth_capture_tests from the first build above which ran successfully:
https://chromium-swarm.appspot.com/task?id=3d8592e3959c7010&refresh=10&show_raw=1

Here's a log excerpt from the start of a successful run:

INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb devices
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb devices
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb -s 0323317064626 shell '( ( c=/data/local/tmp/cache_token;echo $EXTERNAL_STORAGE;cat $c 2>/dev/null||echo;echo "6741a718-599a-11e8-83e9-0242ac110008">$c &&getprop )>/data/local/tmp/temp_file-46ed2aa599346 2>&1 );echo %$?'
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb -s 0323317064626 pull /data/local/tmp/temp_file-46ed2aa599346 /b/swarming/w/itdZZL8F/tmpUbQmir/tmp_ReadFileWithPull
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb -s 0323317064626 shell '( ls /root );echo %$?'
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb -s 0323317064626 shell 'rm -f /data/local/tmp/temp_file-46ed2aa599346'


and here's the same excerpt from a failed run:

INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb devices
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb devices
INFO:devil.utils.cmd_helper:[host]> /b/swarming/w/ir/third_party/catapult/devil/bin/deps/linux2/x86_64/bin/adb kill-server
INFO:root:Try printing formatted exception: <type 'exceptions.AttributeError'> 'NoneType' object has no attribute 'platform' <traceback object at 0x7fbbf7564248>

Traceback (most recent call last):
  <module> at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/testing/run_browser_tests.py:359
    ret_code = RunTests(sys.argv[1:])
  RunTests at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/testing/run_browser_tests.py:328
    ret, _, _ = runner.run()
  run at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/runner.py:179
    ret, full_results = self._run_tests(result_set, test_set)
  _run_tests at /b/swarming/w/ir/third_party/catapult/third_party/typ/typ/runner.py:465
...

bpastene, jbudorick, this looks like some sort of flakiness in Chromium's Android testing or swarming infrastructure. It used to be happening fairly often on this bot though not really any more. Do you know what the problem might be here?

I think adb fails to find devices. Bad USB connection?

Comment 6 by kbr@chromium.org, May 18 2018

Maybe, especially if it's just this bot experiencing the problem.

Would this be the failure mode if, for example, the device were unplugged as the test was starting?

It looks like at least four bots have had runs fail with this problem.
Sounds like browser_finder.FindBrowser is returning None here:
https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/testing/serially_executed_browser_test_case.py?rcl=ce9b3742a10dc2f4c796a1b26e73155d0bfa674a&l=69

Indeed that sounds like adb could not find any devices.

Still, Telemetry should probably give a better error message when that happens in serially_executed_browser_test_case.py.
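For illustration, here is a minimal sketch of the kind of guard that would turn the bare AttributeError into a readable failure. The names BrowserNotFoundError and require_browser are hypothetical and are not part of Telemetry's API; the sketch only assumes that the finder returns None when no browser is available.

```python
class BrowserNotFoundError(Exception):
    """Hypothetical error type; not an existing Telemetry exception."""


def require_browser(browser_to_create, browser_type):
    """Return browser_to_create, or raise a descriptive error if it is None.

    browser_to_create stands in for whatever the browser finder returned.
    """
    if browser_to_create is None:
        raise BrowserNotFoundError(
            "Cannot find browser of type %s. On Android this may mean that "
            "adb listed no attached devices; check the USB connection."
            % browser_type)
    return browser_to_create


if __name__ == "__main__":
    try:
        require_browser(None, "android-chromium")
    except BrowserNotFoundError as e:
        print(e)
```

With a check like this ahead of the `.platform` access, the failure names the missing browser rather than a NoneType attribute.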
Summary: adb connection flaky during GPU tests on Android (was: GPU tests flaky on Android FYI Release (NVIDIA Shield TV))
BTW, this happens on other bots as well.
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20(Nexus%206P)/3369

I think what can be done here is to report this as an infra failure instead of a test failure, and to monitor it somehow so that we can replace USB cables or take some other action. Maybe it's possible to log some more info from adb or USB.
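As a rough sketch of what extra adb logging could look like, the snippet below parses the text printed by `adb devices` into a serial-to-state mapping that could be logged before each test. The function name parse_adb_devices is an assumption for illustration and is not part of devil or Telemetry; the serial number is the one from the successful run quoted earlier.

```python
def parse_adb_devices(output):
    """Map device serial -> state from the text printed by `adb devices`."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header line, and adb daemon notices ("* daemon ...").
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


if __name__ == "__main__":
    sample = "List of devices attached\n0323317064626\tdevice\n\n"
    print(parse_adb_devices(sample))  # {'0323317064626': 'device'}
    # An empty mapping is the "no devices attached" case.
    print(parse_adb_devices("List of devices attached\n\n"))  # {}
```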
From the syslogs on the machine at the time of one of those tests:

May 17 15:12:12 build36-a1 kernel: [38329.912679] usb 3-1.4.3: device descriptor read/64, error -110
May 17 15:12:17 build36-a1 kernel: [38335.080572] usb 3-1.4.3: device descriptor read/64, error -110
May 17 15:12:17 build36-a1 kernel: [38335.256292] usb 3-1.4.3: reset high-speed USB device number 22 using ehci-pci
May 17 15:12:22 build36-a1 kernel: [38340.320346] usb 3-1.4.3: device descriptor read/64, error -110
May 17 15:12:27 build36-a1 kernel: [38345.488236] usb 3-1.4.3: device descriptor read/64, error -110
May 17 15:12:27 build36-a1 kernel: [38345.663962] usb 3-1.4.3: reset high-speed USB device number 22 using ehci-pci
May 17 15:12:38 build36-a1 kernel: [38356.055650] usb 3-1.4.3: device not accepting address 22, error -110
May 17 15:12:38 build36-a1 kernel: [38356.127537] usb 3-1.4.3: reset high-speed USB device number 22 using ehci-pci
May 17 15:12:48 build36-a1 kernel: [38366.519228] usb 3-1.4.3: device not accepting address 22, error -110
May 17 15:12:48 build36-a1 kernel: [38366.521739] usb 3-1.4.3: USB disconnect, device number 22
May 17 15:12:48 build36-a1 kernel: [38366.607090] usb 3-1.4.3: new high-speed USB device number 101 using ehci-pci
May 17 15:12:53 build36-a1 kernel: [38371.671148] usb 3-1.4.3: device descriptor read/64, error -110
May 17 15:12:59 build36-a1 kernel: [38376.839035] usb 3-1.4.3: device descriptor read/64, error -110
May 17 15:12:59 build36-a1 kernel: [38377.014755] usb 3-1.4.3: new high-speed USB device number 102 using ehci-pci
May 17 15:13:04 build36-a1 kernel: [38382.078804] usb 3-1.4.3: device descriptor read/64, error -110
May 17 15:13:09 build36-a1 kernel: [38387.246692] usb 3-1.4.3: device descriptor read/64, error -110

Yeah, that's the USB subsystem preventing reads/writes to the device. Error -110 (ETIMEDOUT) is power-related, I believe, i.e. the system wasn't providing enough current to the device. Not sure how the NVIDIA Shields are set up, but if they're not on an external power supply, they probably should be.
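A minimal sketch of how such kernel log lines could be tallied per USB port for monitoring. The regular expression and the function name count_usb_errors are assumptions made for this example, written against the syslog excerpt above and not tied to any existing tool. On Linux, errno 110 is ETIMEDOUT.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   usb 3-1.4.3: device descriptor read/64, error -110
USB_ERROR_RE = re.compile(r"usb (?P<port>[\d.-]+): .*error (?P<errno>-\d+)")


def count_usb_errors(lines):
    """Count kernel USB error lines, keyed by (port, errno)."""
    counts = Counter()
    for line in lines:
        match = USB_ERROR_RE.search(line)
        if match:
            counts[(match.group("port"), int(match.group("errno")))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "May 17 15:12:12 build36-a1 kernel: [38329.912679] usb 3-1.4.3: device descriptor read/64, error -110",
        "May 17 15:12:17 build36-a1 kernel: [38335.256292] usb 3-1.4.3: reset high-speed USB device number 22 using ehci-pci",
        "May 17 15:12:38 build36-a1 kernel: [38356.055650] usb 3-1.4.3: device not accepting address 22, error -110",
    ]
    for (port, errno), n in count_usb_errors(sample).items():
        print("port %s: %d lines with error %d" % (port, n, errno))
```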
Components: Infra>Labs
Owner: perezju@chromium.org
Status: Assigned (was: Untriaged)
perezju@, could you address adding additional logging in the telemetry script for this? Should this be reported as an infra failure?

Comment 12 by kbr@chromium.org, May 18 2018

Cc: borenet@chromium.org jo...@chromium.org
johnw@: I assume that these Shield TV units are already on an external power supply and aren't powered over USB?

borenet@: do you have any similar issues with Skia's SHIELD devices that are hooked into Swarming via Raspberry Pis?
https://chromium-swarm.appspot.com/botlist?c=id&c=os&c=task&c=status&f=pool%3ASkia&f=device_type%3ANVIDIA%20Shield%20(foster)&l=100&s=id%3Aasc

Comment 13 by jo...@google.com, May 18 2018

Correct, they each have dedicated power-supplies that are hard cabled into the rack PDU.

Comment 14 by bore...@google.com, May 18 2018

Cc: kjlubick@chromium.org
I seem to remember having a lot of problems with Nexus Players being flaky, but they've been stable for quite a while.  +Kevin to see if he remembers anything special we needed to do.
We haven't had any issues with the NVIDIA Shields, and I believe we have the same model of shields.

Do I understand it correctly that multiple shields are hooked up to 1 host and divided up into multiple swarming hosts via docker?  If so, that is where we differ.

Perhaps the USB card in the host machine or the USB hub is bad.  I know in the past (when we were on a many devices to one host model), we had some issues with some usb devices interfering with others, even with hubs.

Comment 16 by jo...@google.com, May 18 2018

For the shields, we use the standard hardware config we use with other android devices (7 devices, connected to a 7-port USB hub, connected to the host machine). 

The hub itself appears to be ok, but it's easy enough to swap that out as a precaution. The cables are non-standard in that they are male USB A-A, and we don't normally use this type. The connections themselves seem to be solid, however. 

If we suspect this is due to some kind of hardware incompatibility, there are some things we can try:

Switching between USB 2.0 vs 3.0 onboard host ports
Installing a dedicated 2.0 PCI-e USB controller
Splitting the devices between 2 hubs
etc... 

I'll start with swapping out the hub.
As I've mentioned in #9, this happens not only on Shield TV, though it does seem to happen more often there.
Owner: nednguyen@chromium.org
Ned, can you look at this one? I think the need for a better error message comes from your recent refactors on shared state.
Owner: jdarpinian@chromium.org
#18: the gpu test harness doesn't use shared state, so this is unrelated to that bug.


It's unclear to me which log needs to be improved. Reassigning to the bug owner to clarify the next step for this bug.
Sorry, should have been more explicit. See comment #8 above.

Telemetry should fail at that point indicating that no browser was found and (if possible) exit with a return_code signaling an infra issue.
Blockedon: 845147
#20: got it. I filed issue 845147 to address that problem.
If not, maybe we can use Skia's setup for this bot, i.e. one device - one Raspberry Pi?
We are interested in whether this setup can work in our labs in general, and this looks like a good opportunity to experiment.
Of note, we (Skia) are working on re-arranging the RPIs and devices to get a denser layout, like: https://screenshot.googleplex.com/kWNNRBQBP8h

Comment 26 by jo...@google.com, May 23 2018

Hub has been swapped out.

Comment 28 by kbr@chromium.org, May 26 2018

We don't know for sure whether this is USB related.

Skia folks: how reliable are your Raspberry Pi-based SHIELD tablets?

This is for the Nvidia Shields (not the tablets, but the gaming "console")

In the last 2 months (Mar-25-2018 -> May 25-2018), there have been no BOT_DIED related to the RPI crashing during a task [1] and 5 COMPLETED_FAILED in which a test on an NVIDIA Shield failed [2].  Since 5 is not big, I went through and found that in 4/5 cases, the failure was an abort, likely due to a code change. 

So, 1 USB/device failure in the last 11,480 tasks.  Only 4 '9s' of reliability. :\

[1] https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=bot&c=modified_ts&c=duration&et=1527297540000&f=state%3ABOT_DIED&f=pool%3ASkia&f=device_type%3Afoster&l=50&s=created_ts%3Adesc&st=1522027080000

[2] https://chromium-swarm.appspot.com/tasklist?c=name&c=state&c=created_ts&c=bot&c=modified_ts&c=duration&et=1527297540000&f=state%3ACOMPLETED_FAILURE&f=pool%3ASkia&f=device_type%3Afoster&l=50&s=created_ts%3Adesc&st=1522027080000

Comment 30 by kbr@chromium.org, May 26 2018

kjlubick@: that's remarkable reliability. Excellent work by the entire Skia infra team. Thank you for that data.

johnw / bpastene / jbudorick: what should we try next to make these devices more reliable? If we can solve this in the context of the Labs' setup, it may improve the reliability of all of the Android devices which are hooked up to a single host.

Alternatively, since we don't have many of these Shield game consoles, if they're the most unreliable devices, then should we just take the Skia team's RPi-based solution off-the-shelf?

kbr@: What gave you the impression that these are "the most unreliable devices"? Looking at the past 100 builds on the bot, two tasks have failed due to usb/adb errors:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29?limit=100

Given 21 tasks per build, that's 2 of 2100; ie less than 0.1% of tasks are failing due to infra errors. IMO, that seems reasonable (but feel free to disagree). Personally, I'd take issue with an effort to revamp our layout simply to get one more 9 of reliability on a single bot. (Especially given the endless stream of higher priority projects me/my team are working on).

> We don't know for sure whether this is USB related.

I'm pretty confident that it is. Looking through system logs during the most recent failure, I see the same errors mentioned in #10:

May 29 01:27:16 build36-a1 kernel: [68286.842683] usb 3-1.2: reset high-speed USB device number 104 using ehci-pci
May 29 01:27:21 build36-a1 kernel: [68291.906738] usb 3-1.2: device descriptor read/64, error -110
May 29 01:27:26 build36-a1 kernel: [68297.074632] usb 3-1.2: device descriptor read/64, error -110
May 29 01:27:27 build36-a1 kernel: [68297.250353] usb 3-1.2: reset high-speed USB device number 104 using ehci-pci
May 29 01:27:32 build36-a1 kernel: [68302.314406] usb 3-1.2: device descriptor read/64, error -110
May 29 01:27:37 build36-a1 kernel: [68307.482298] usb 3-1.2: device descriptor read/64, error -110
May 29 01:27:37 build36-a1 kernel: [68307.658023] usb 3-1.2: reset high-speed USB device number 104 using ehci-pci
May 29 01:27:47 build36-a1 kernel: [68318.049719] usb 3-1.2: device not accepting address 104, error -110
May 29 01:27:48 build36-a1 kernel: [68318.121605] usb 3-1.2: reset high-speed USB device number 104 using ehci-pci
May 29 01:27:58 build36-a1 kernel: [68328.513308] usb 3-1.2: device not accepting address 104, error -110
May 29 01:27:58 build36-a1 kernel: [68328.515715] usb 3-1.2: USB disconnect, device number 104
May 29 01:27:58 build36-a1 kernel: [68328.601165] usb 3-1.2: new high-speed USB device number 109 using ehci-pci
May 29 01:28:03 build36-a1 kernel: [68333.665224] usb 3-1.2: device descriptor read/64, error -110
May 29 01:28:08 build36-a1 kernel: [68338.833109] usb 3-1.2: device descriptor read/64, error -110
May 29 01:28:08 build36-a1 kernel: [68339.008833] usb 3-1.2: new high-speed USB device number 110 using ehci-pci
May 29 01:28:14 build36-a1 kernel: [68344.072888] usb 3-1.2: device descriptor read/64, error -110
May 29 01:28:19 build36-a1 kernel: [68349.240778] usb 3-1.2: device descriptor read/64, error -110
May 29 01:28:19 build36-a1 kernel: [68349.416501] usb 3-1.2: new high-speed USB device number 111 using ehci-pci
May 29 01:28:29 build36-a1 kernel: [68359.808200] usb 3-1.2: device not accepting address 111, error -110
May 29 01:28:29 build36-a1 kernel: [68359.880086] usb 3-1.2: new high-speed USB device number 112 using ehci-pci
May 29 01:28:40 build36-a1 kernel: [68370.271782] usb 3-1.2: device not accepting address 112, error -110
Owner: bpastene@chromium.org
And labs has done as much as can be reasonably expected I think (thanks John), so I'll take the bug from here for any discussions, etc.

Comment 33 by jo...@google.com, May 30 2018

Thanks Ben. 

We can easily try swapping in an add-on USB card; we've seen that stabilize flaky USB for things like webcams. Shot in the dark, but it's easy to do.
Re reliability analysis in #31, only 2 out of 7 devices are alive now.
It's possible that the 2 remaining are the most reliable ones.
I'll see if I can get some per-device data.
So, I've got the last 3000 tasks on each bot, and this is the failure distribution:
build36-a1--device1  4
build36-a1--device2  6
build36-a1--device3 13
build36-a1--device4  5
build36-a1--device5  7
build36-a1--device6 33
build36-a1--device7 10

I've looked more closely at build36-a1--device3 and build36-a1--device6, and on each of them 6 of the failures are definitely USB failing in the beginning of the test. For other runs I'm not sure, possibly it fails in the middle of the test.
For build36-a1--device1 all 4 failures were because of USB.

So, let's say that failure probability is 5 in 3000 best case and 18 in 3000 worst case.
Between 0.16% and 0.6% chance of task failing due to USB.
We have 21 tasks per build, so that's between 3.4% and 11.9% chance of build failing.
1 in 30 builds failing will probably not alert a pixel wrangler, but 1 in 9 certainly will, which is not desirable.
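The arithmetic above can be reproduced with a short sketch. It assumes that task failures are independent of each other, which the thread does not establish; the function name is chosen for this example only.

```python
def build_failure_probability(task_failure_rate, tasks_per_build=21):
    """Chance that at least one task in a build fails.

    Assumes each task fails independently with the same probability.
    """
    return 1.0 - (1.0 - task_failure_rate) ** tasks_per_build


best = build_failure_probability(5.0 / 3000)    # 5 USB failures in 3000 tasks
worst = build_failure_probability(18.0 / 3000)  # 18 USB failures in 3000 tasks

if __name__ == "__main__":
    print("best case:  %.1f%%" % (100 * best))   # best case:  3.4%
    print("worst case: %.1f%%" % (100 * worst))  # worst case: 11.9%
```

These correspond to roughly 1 failing build in 29 and 1 in 8, in line with the "1 in 30" and "1 in 9" estimates.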

Comment 36 by kbr@chromium.org, May 30 2018

Sorry, I'm not trying to make useless work for anybody, and appreciate both the Labs and general Infra teams' efforts on deploying and maintaining all of these bots. Thanks Yuly also for gathering the per-device reliability numbers and doing the analysis.

Am I misunderstanding the situation with these devices? As far as I can tell, the following failures in the last 200 builds are all due to the devices becoming unreachable over USB:

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/3256
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/3176
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/3126
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/3116
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/3115
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/3106

In counterpoint, https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28Nexus%205X%29?limit=200 has 200 green builds in a row, which is excellent.

My only desire is to be able to better rely on the test results from this bot.

Components: -Infra>Client>Android Infra>Client>Chrome
Summary: adb connection flaky during GPU tests on Android FYI Release (NVIDIA Shield TV) (was: adb connection flaky during GPU tests on Android)
I'm GPU sheriff this week. It looks like the problems with devices disappearing just before running tests are still happening on the Shield TV bot:

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29?limit=200

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/4678
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/4674
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/4659
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/4656
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/4649

The angle_perftests failure in this build seems to report it most clearly:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/4649

...
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 2826, in _get_devices
    raise device_errors.NoDevicesError()
NoDevicesError: No devices attached.
...

The Telemetry-based tests all report:

BrowserFinderException: Cannot find browser of type android-chromium. 

Is there anything that can be done to improve the reliability of this bot? Thanks.

Owner: ynovikov@chromium.org
Per discussion with ynovikov@ during today's team meeting we're going to talk with the Skia team to see about experimenting with running these Swarming jobs on their Raspberry Pi setup. We're still seeing devices fall offline intermittently, for example:

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/5313
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Android%20FYI%20Release%20%28NVIDIA%20Shield%20TV%29/5247

Blockedon: 869557
Blockedon: 879378
https://chromium-review.googlesource.com/c/chromium/src/+/1192418 attempts running the tests on Skia devices, but that is gated on issue 879378.
Blocking: 918676
Cc: -nednguyen@chromium.org

Comment 43 by benhenry@google.com, Jan 16 (6 days ago)

Components: Test>Telemetry

Comment 44 by benhenry@google.com, Jan 16 (6 days ago)

Components: -Tests>Telemetry
