Issue 732205

Starred by 6 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug



Multiple large suites failing on the newly dockerized android bots

Project Member Reported by jbudorick@chromium.org, Jun 12 2017

Issue description

I switched all three of the tablet bots to swarming on Friday. L and M seem okay, but K is struggling.
 
I think the Lollipop Tablet Tester and Marshmallow Tablet Tester are struggling too. It appears that the shards are timing out with "no devices attached" errors on KitKat and Lollipop. Marshmallow is a bit happier in that it's only intermittently failing with "shard #0 timed out, took too much time to complete".


Example L failure:

C   16.812s prepare_device(0a084b87)  Timed out. Dumping threads.
C   16.813s prepare_device(0a084b87)  ********************************************************************************
C   16.813s prepare_device(0a084b87)  Stack dump for thread 'TimeoutThread-2-for-prepare_device(0a084b87)'
C   16.813s prepare_device(0a084b87)  ********************************************************************************
C   16.813s prepare_device(0a084b87)  File: "/usr/lib/python2.7/threading.py", line 783, in __bootstrap
C   16.813s prepare_device(0a084b87)    self.__bootstrap_inner()
C   16.814s prepare_device(0a084b87)  File: "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
C   16.814s prepare_device(0a084b87)    self.run()
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 81, in run
C   16.814s prepare_device(0a084b87)    self._ret = self._func(*self._args, **self._kwargs)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 152, in <lambda>
C   16.814s prepare_device(0a084b87)    child_thread = reraiser_thread.ReraiserThread(lambda: func(*args, **kwargs),
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
C   16.814s prepare_device(0a084b87)    return f(*args, **kwargs)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 640, in WaitUntilFullyBooted
C   16.814s prepare_device(0a084b87)    timeout_retry.WaitFor(pm_ready)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 100, in WaitFor
C   16.814s prepare_device(0a084b87)    result = condition()
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 624, in pm_ready
C   16.814s prepare_device(0a084b87)    return self._GetApplicationPathsInternal('android', skip_cache=True)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 540, in _GetApplicationPathsInternal
C   16.814s prepare_device(0a084b87)    ['pm', 'path', package], check_return=should_check_return)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 51, in timeout_retry_wrapper
C   16.814s prepare_device(0a084b87)    return impl()
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
C   16.814s prepare_device(0a084b87)    return f(*args, **kwargs)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 974, in RunShellCommand
C   16.814s prepare_device(0a084b87)    output = handle_large_output(cmd, large_output)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 943, in handle_large_output
C   16.814s prepare_device(0a084b87)    return handle_large_command(cmd)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 925, in handle_large_command
C   16.814s prepare_device(0a084b87)    return handle_check_return(cmd)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 916, in handle_check_return
C   16.814s prepare_device(0a084b87)    return run(cmd)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 912, in run
C   16.814s prepare_device(0a084b87)    return self.adb.Shell(cmd)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/sdk/adb_wrapper.py", line 489, in Shell
C   16.814s prepare_device(0a084b87)    output = self._RunDeviceAdbCmd(args, timeout, retries, check_error=False)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/sdk/adb_wrapper.py", line 286, in _RunDeviceAdbCmd
C   16.814s prepare_device(0a084b87)    check_error=check_error)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 51, in timeout_retry_wrapper
C   16.814s prepare_device(0a084b87)    return impl()
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
C   16.814s prepare_device(0a084b87)    return f(*args, **kwargs)
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/android/sdk/adb_wrapper.py", line 250, in _RunAdbCmd
C   16.814s prepare_device(0a084b87)    timeout_retry.CurrentTimeoutThreadGroup().GetRemainingTime())
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/cmd_helper.py", line 372, in GetCmdStatusAndOutputWithTimeout
C   16.814s prepare_device(0a084b87)    for data in _IterProcessStdout(process, timeout=timeout):
C   16.814s prepare_device(0a084b87)  File: "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/cmd_helper.py", line 248, in _IterProcessStdoutFcntl
C   16.814s prepare_device(0a084b87)    [child_fd], [], [], iter_aware_poll_interval)
C   16.814s prepare_device(0a084b87)  ********************************************************************************
I   16.815s TimeoutThread-3-for-prepare_device(0a084b87)  [host]> /b/swarming/w/ir/third_party/android_tools/sdk/platform-tools/adb -s 0a084b87 wait-for-device
I   16.850s TimeoutThread-3-for-prepare_device(0a084b87)  [host]> /b/swarming/w/ir/third_party/android_tools/sdk/platform-tools/adb -s 0a084b87 shell '( test -d /storage/emulated/legacy );echo %$?'
I   16.901s TimeoutThread-3-for-prepare_device(0a084b87)  condition 'sd_card_ready' met (0.1s)
I   16.901s TimeoutThread-3-for-prepare_device(0a084b87)  [host]> /b/swarming/w/ir/third_party/android_tools/sdk/platform-tools/adb -s 0a084b87 shell '( pm path android );echo %$?'
I   20.294s TimeoutThread-3-for-prepare_device(0a084b87)  condition 'pm_ready' not met (3.5s)
I   25.299s TimeoutThread-3-for-prepare_device(0a084b87)  [host]> /b/swarming/w/ir/third_party/android_tools/sdk/platform-tools/adb -s 0a084b87 shell '( pm path android );echo %$?'
I   25.769s TimeoutThread-3-for-prepare_device(0a084b87)  condition 'pm_ready' not met (9.0s)
C   25.785s prepare_device(0a084b87)  Timed out. Dumping threads.
E   25.785s prepare_device(0a084b87)  Shard timed out: prepare_device(0a084b87)
Traceback (most recent call last):
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 59, in wrapper
    return f(dev, *args, **kwargs)
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 141, in prepare_device
    d.WaitUntilFullyBooted(timeout=10)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 57, in timeout_retry_wrapper
    retry_if_func=retry_if_func)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 159, in Run
    error_log_func=error_log_func)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 186, in JoinAll
    self._JoinAll(watcher, timeout)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 158, in _JoinAll
    thread.ReraiseIfException()
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 81, in run
    self._ret = self._func(*self._args, **self._kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 152, in <lambda>
    child_thread = reraiser_thread.ReraiserThread(lambda: func(*args, **kwargs),
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
    return f(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 640, in WaitUntilFullyBooted
    timeout_retry.WaitFor(pm_ready)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 113, in WaitFor
    msg='Timed out waiting for %r' % condition_name)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 56, in GetRemainingTime
    raise reraiser_thread.TimeoutError(msg)
CommandTimeoutError: Timed out waiting for 'pm_ready', wait of 5.0 secs required but only 1.0 secs left
C   25.786s Main  Cannot upload logcat file: file doesn't exist.
E   25.786s Main  Error occurred.
Traceback (most recent call last):
  File "/b/swarming/w/ir/build/android/test_runner.py", line 926, in main
    return RunTestsCommand(args)
  File "/b/swarming/w/ir/build/android/test_runner.py", line 684, in RunTestsCommand
    return RunTestsInPlatformMode(args)
  File "/b/swarming/w/ir/build/android/test_runner.py", line 845, in RunTestsInPlatformMode
    str(iteration_count))
  File "/b/swarming/w/ir/build/android/pylib/base/environment.py", line 33, in __exit__
    self.TearDown()
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 234, in TearDown
    self.parallel_devices.pMap(tear_down_device)
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 197, in parallel_devices
    return parallelizer.SyncParallelizer(self.devices)
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 188, in devices
    raise device_errors.NoDevicesError()
NoDevicesError: No devices attached.
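
For reference, the CommandTimeoutError above follows from the time budget: prepare_device calls WaitUntilFullyBooted(timeout=10), the second pm_ready check comes back "not met" at 9.0s, and the next 5.0s wait no longer fits in the 1.0s that remains. A minimal sketch of that kind of budgeted polling loop is below. It is not devil's actual timeout_retry code; wait_for, WaitTimeoutError, and the clock/sleep parameters are hypothetical names, and the 5-second wait period is taken from the error message only.

```python
import time


class WaitTimeoutError(Exception):
    """Hypothetical error: the remaining budget cannot cover the next wait."""


def wait_for(condition, budget_secs, wait_period=5.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it is truthy or the time budget runs out.

    clock and sleep are injectable so the loop can be exercised without
    real delays.
    """
    deadline = clock() + budget_secs
    while True:
        result = condition()
        if result:
            return result
        remaining = deadline - clock()
        # Give up before sleeping if the next wait would overrun the budget.
        if remaining < wait_period:
            raise WaitTimeoutError(
                'wait of %.1f secs required but only %.1f secs left'
                % (wait_period, remaining))
        sleep(wait_period)
```

With a 10-second budget and a 5-second wait period, a slow first check leaves room for only one retry, which matches the two 'pm_ready' attempts in the log.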
Summary: Multiple suites failing on the tablet bots following the switch to single-device swarming (was: Multiple suites failing on KitKat Tablet Tester following the switch to single-device swarming)
Issue 732372 has been merged into this issue.
Project Member

Comment 4 by bugdroid1@chromium.org, Jun 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b7e12bae8572c1ed03d0959ed71df8e4efe2f8ee

commit b7e12bae8572c1ed03d0959ed71df8e4efe2f8ee
Author: John Budorick <jbudorick@chromium.org>
Date: Tue Jun 13 00:21:08 2017

[Android] Raise the timeouts of a few more tablet suites.

Bug: 732205
Change-Id: Ifa5bc36a3daaa5f6e4ad947953ce77f13c128e03
Reviewed-on: https://chromium-review.googlesource.com/532133
Reviewed-by: Michael Case <mikecase@chromium.org>
Commit-Queue: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#478840}
[modify] https://crrev.com/b7e12bae8572c1ed03d0959ed71df8e4efe2f8ee/testing/buildbot/chromium.android.json

It appears that this is still happening intermittently (unless there's a new bot issue), e.g. this Lollipop Tablet Tester failure: https://chromium-swarm.appspot.com/task?id=36bc18b5623a7710&refresh=10&show_raw=1

I  144.362s TimeoutThread-3-for-individual_device_tear_down(0a084b87)  [host]> /b/swarming/w/ir/third_party/android_tools/sdk/platform-tools/adb -s 0a084b87 shell '( am clear-debug-app );echo %$?'
E  144.826s individual_device_tear_down(0a084b87)  Shard failed: individual_device_tear_down(0a084b87)
Traceback (most recent call last):
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 59, in wrapper
    return f(dev, *args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_instrumentation_test_run.py", line 275, in individual_device_tear_down
    dev.RunShellCommand(['am', 'clear-debug-app'], check_return=True)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 57, in timeout_retry_wrapper
    retry_if_func=retry_if_func)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 159, in Run
    error_log_func=error_log_func)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 186, in JoinAll
    self._JoinAll(watcher, timeout)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 158, in _JoinAll
    thread.ReraiseIfException()
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 81, in run
    self._ret = self._func(*self._args, **self._kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 152, in <lambda>
    child_thread = reraiser_thread.ReraiserThread(lambda: func(*args, **kwargs),
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
    return f(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 974, in RunShellCommand
    output = handle_large_output(cmd, large_output)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 943, in handle_large_output
    return handle_large_command(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 925, in handle_large_command
    return handle_check_return(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 916, in handle_check_return
    return run(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 912, in run
    return self.adb.Shell(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/sdk/adb_wrapper.py", line 505, in Shell
    command, output, status=status, device_serial=self._device_serial)
AdbShellCommandFailedError: (device: 0a084b87) shell command run via adb failed on the device:
  command: am clear-debug-app
  exit status: 1
  output:
  - Error type 2
  - android.util.AndroidException: Can't connect to activity manager; is the system running?
  - 	at com.android.commands.am.Am.onRun(Am.java:299)
  - 	at com.android.internal.os.BaseCommand.run(BaseCommand.java:47)
  - 	at com.android.commands.am.Am.main(Am.java:97)
  - 	at com.android.internal.os.RuntimeInit.nativeFinishInit(Native Method)
  - 	at com.android.internal.os.RuntimeInit.main(RuntimeInit.java:249)

C  144.839s Main  Cannot upload logcat file: file doesn't exist.
E  144.841s Main  Error occurred.
Traceback (most recent call last):
  File "/b/swarming/w/ir/build/android/test_runner.py", line 926, in main
    return RunTestsCommand(args)
  File "/b/swarming/w/ir/build/android/test_runner.py", line 684, in RunTestsCommand
    return RunTestsInPlatformMode(args)
  File "/b/swarming/w/ir/build/android/test_runner.py", line 845, in RunTestsInPlatformMode
    str(iteration_count))
  File "/b/swarming/w/ir/build/android/pylib/base/environment.py", line 33, in __exit__
    self.TearDown()
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 234, in TearDown
    self.parallel_devices.pMap(tear_down_device)
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 197, in parallel_devices
    return parallelizer.SyncParallelizer(self.devices)
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 188, in devices
    raise device_errors.NoDevicesError()
NoDevicesError: No devices attached.
#5: build43-b1--device4 appears to be in a bad state now because of that task :( https://chromium-swarm.appspot.com/bot?id=build43-b1--device4&sort_stats=total%3Adesc
Issue 735755 has been merged into this issue.
Status: Started (was: Assigned)
Summary: Multiple large suites failing on the newly dockerized android bots (was: Multiple suites failing on the tablet bots following the switch to single-device swarming)
This is affecting L Phone, as well, though at a glance it appears to present more often w/ device crashes partway through a test run.

I'm working on an initial fix that will wait for crashed devices to come back before trying additional commands on them.
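
Roughly, the idea could look like the sketch below: catch the error that signals a crashed device, block until the device is usable again, then retry the command. This only illustrates the approach and is not the code that landed in devil; retry_on_device_crash, DeviceUnreachableError, and the wait_until_ready callback are hypothetical names.

```python
import functools


class DeviceUnreachableError(Exception):
    """Hypothetical error: the device crashed or dropped off adb."""


def retry_on_device_crash(wait_until_ready, max_attempts=2):
    """Retry a device function after waiting for a crashed device to return.

    wait_until_ready is a caller-supplied callable that blocks until the
    given device accepts commands again.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(device, *args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(device, *args, **kwargs)
                except DeviceUnreachableError:
                    if attempt == max_attempts:
                        raise
                    # Do not send further commands until the device is back.
                    wait_until_ready(device)
        return wrapper
    return decorator
```

Capping the attempts keeps a device that never recovers from holding the shard until the swarming task timeout.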
Project Member

Comment 9 by bugdroid1@chromium.org, Jun 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9a43a37ef8c0021b6c44876c6b4c9332eb8c2cdc

commit 9a43a37ef8c0021b6c44876c6b4c9332eb8c2cdc
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Thu Jun 29 18:25:22 2017

Roll src/third_party/catapult/ 5de48a026..ee25754a9 (2 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/5de48a026223..ee25754a916e

$ git log 5de48a026..ee25754a9 --date=short --no-merges --format='%ad %ae %s'
2017-06-29 xunjieli [wpr-go] update README to mention required packages
2017-06-29 jbudorick [devil] Add the ability to retry functions on device crash.

Created with:
  roll-dep src/third_party/catapult
BUG=732205


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: Ib714d4261f7920ac5e02e08a6dbeff26ae2a0722
Reviewed-on: https://chromium-review.googlesource.com/556074
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#483429}
[modify] https://crrev.com/9a43a37ef8c0021b6c44876c6b4c9332eb8c2cdc/DEPS

Project Member

Comment 10 by bugdroid1@chromium.org, Jun 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/60a4c98202b82c66d7418d2fc3bad1ff0d349726

commit 60a4c98202b82c66d7418d2fc3bad1ff0d349726
Author: John Budorick <jbudorick@chromium.org>
Date: Thu Jun 29 21:57:01 2017

[android] Handle device crashes in test listing & execution.

Bug: 732205
Change-Id: Ib5812fa42a6fc6478b68f3664af8507f3ea26548
Reviewed-on: https://chromium-review.googlesource.com/554378
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>
Commit-Queue: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#483506}
[modify] https://crrev.com/60a4c98202b82c66d7418d2fc3bad1ff0d349726/build/android/pylib/local/device/local_device_gtest_run.py
[modify] https://crrev.com/60a4c98202b82c66d7418d2fc3bad1ff0d349726/build/android/pylib/local/device/local_device_test_run.py
[modify] https://crrev.com/60a4c98202b82c66d7418d2fc3bad1ff0d349726/build/android/test_runner.pydeps

Issue 739509 has been merged into this issue.
Project Member

Comment 12 by bugdroid1@chromium.org, Jul 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/76eda5e63c8afe1f706f0537d0ae5e44ed78d877

commit 76eda5e63c8afe1f706f0537d0ae5e44ed78d877
Author: John Budorick <jbudorick@chromium.org>
Date: Fri Jul 07 04:19:51 2017

[android] Retry installation on system crash.

Bug: 732205
Change-Id: Icf02be16f360959d709f63150da44f5066b9638b
Reviewed-on: https://chromium-review.googlesource.com/562179
Reviewed-by: Michael Case <mikecase@chromium.org>
Commit-Queue: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#484829}
[modify] https://crrev.com/76eda5e63c8afe1f706f0537d0ae5e44ed78d877/build/android/pylib/local/device/local_device_gtest_run.py
[modify] https://crrev.com/76eda5e63c8afe1f706f0537d0ae5e44ed78d877/build/android/pylib/local/device/local_device_instrumentation_test_run.py
