
Issue 739899

Starred by 3 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug

Blocking:
issue 748145
issue 670879




Android test harness flakily fails to mkdir when preparing devices for tests

Project Member Reported by bpastene@chromium.org, Jul 6 2017

Issue description

Examples:
https://chromium-swarm.appspot.com/task?id=3731e0bab5df4b10
https://chromium-swarm.appspot.com/task?id=3731e05dad5e4210
https://chromium-swarm.appspot.com/task?id=37321722ada5d910
https://chromium-swarm.appspot.com/task?id=37320a4199090410

Always with a stack trace that looks like:
Traceback (most recent call last):
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 59, in wrapper
    return f(dev, *args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_gtest_run.py", line 317, in individual_device_set_up
    step()
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_gtest_run.py", line 293, in push_test_data
    delete_device_stale=True)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 57, in timeout_retry_wrapper
    retry_if_func=retry_if_func)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 159, in Run
    error_log_func=error_log_func)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 186, in JoinAll
    self._JoinAll(watcher, timeout)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 158, in _JoinAll
    thread.ReraiseIfException()
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 81, in run
    self._ret = self._func(*self._args, **self._kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 152, in <lambda>
    child_thread = reraiser_thread.ReraiserThread(lambda: func(*args, **kwargs),
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
    return f(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 1344, in PushChangedFiles
    self.RunShellCommand(['mkdir', '-p'] + missing_dirs, check_return=True)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 51, in timeout_retry_wrapper
    return impl()
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
    return f(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 1025, in RunShellCommand
    output = handle_large_output(cmd, large_output)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 994, in handle_large_output
    return handle_large_command(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 976, in handle_large_command
    return handle_check_return(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 967, in handle_check_return
    return run(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 963, in run
    return self.adb.Shell(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/sdk/adb_wrapper.py", line 505, in Shell
    command, output, status=status, device_serial=self._device_serial)
AdbShellCommandFailedError: (device: 070e3ce823c2c656) shell command run via adb failed on the device:
  command: mkdir -p /storage/emulated/legacy/chromium_tests_root
  exit status: 255
  output:
  - mkdir failed for /storage/emulated/legacy/chromium_tests_root/, Device or resource busy


These types of failures become more troublesome when there's only one available device. Maybe there should be a sleep or wait-for-device in between the rm and mkdir:
https://cs.chromium.org/chromium/src/build/android/pylib/local/device/local_device_gtest_run.py?rcl=b5820db5ef26d58bcdfe1ce6dcfb77130a8f879a&l=296
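
A rough sketch of the "sleep between the rm and mkdir" idea, not the actual devil code: retry `mkdir -p` a few times with a short pause whenever the device reports "Device or resource busy". The names `mkdir_with_retry` and `run_shell`, and the attempt and delay defaults, are made up for illustration. `run_shell` stands in for whatever issues the adb shell command and is assumed to return `(exit_status, output)`.

```python
import time


def mkdir_with_retry(run_shell, path, attempts=3, delay_s=1.0, sleep=time.sleep):
    """Run `mkdir -p path`, retrying while the device reports EBUSY.

    run_shell is a hypothetical callable (an assumption, not a devil API):
    it takes an argv list and returns (exit_status, output).
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        status, output = run_shell(['mkdir', '-p', path])
        if status == 0:
            return attempt
        busy = 'Device or resource busy' in output
        # Give up on any other error, or once the retry budget is spent.
        if not busy or attempt == attempts:
            raise RuntimeError('mkdir failed (status %d): %s' % (status, output))
        # Pause before the next attempt in case the storage layer is settling.
        sleep(delay_s)
```

Whether a pause is enough depends on what is keeping the directory busy. If some other process holds a handle open for longer than the retry window, this would only hide the flake some of the time.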
 
Blocking: 670879
It can fail at a number of mkdir call sites; https://chromium-swarm.appspot.com/task?id=37320a4199090410 is another one. I'll see if I can repro locally.
ah, fair enough.

You mentioned a wait-for-device; I wonder how sd_card_ready (https://codesearch.chromium.org/chromium/src/third_party/catapult/devil/devil/android/device_utils.py?rcl=6539cc70d9e1b8b9c67a3b7ff6a1a0da7a5a961d&l=665) would behave in this scenario.
Owner: bpastene@chromium.org
Status: Assigned (was: Available)
Found b/20096164, which sounds very similar. I might want to implement a similar workaround. A few more flakes:
https://chromium-swarm.appspot.com/task?id=3737ba828026a410
https://chromium-swarm.appspot.com/task?id=3735e7a8d43ba810
https://chromium-swarm.appspot.com/task?id=3732c3089f8ccc10
https://chromium-swarm.appspot.com/task?id=3732bae760d9d310
Project Member

Comment 6 by bugdroid1@chromium.org, Jul 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/49743d172b736ffd619215c32b6eb3e2582b3b80

commit 49743d172b736ffd619215c32b6eb3e2582b3b80
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Thu Jul 13 01:41:01 2017

Roll src/third_party/catapult/ 6c40c273a..1286055c7 (8 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/6c40c273a7fe..1286055c7baf

$ git log 6c40c273a..1286055c7 --date=short --no-merges --format='%ad %ae %s'
2017-07-12 adithyas Add table for Blink RuntimeCallStats
2017-07-12 bpastene devil: Add a device_utils method to mv-then-rm a device file.
2017-07-12 nednguyen Rename webpagereplay_go.py to webpagereplay_go_server.py
2017-07-12 nednguyen Add WPR go binary for Mac platform
2017-07-12 bpastene devil: When pushing data deps to device, don't mkdir redundantly.
2017-07-12 simonhatch Dashboard - Fix test-picker add button never un-greying
2017-07-12 simonhatch Dashboard - Add api_request_handler
2017-07-12 etienneb Fix symbolisation for address out-of-range

Created with:
  roll-dep src/third_party/catapult
BUG= 724543 ,739899,739899, 739783 


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I114d56bf15f445060eb48951a120fd99819e0024
Reviewed-on: https://chromium-review.googlesource.com/569138
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#486207}
[modify] https://crrev.com/49743d172b736ffd619215c32b6eb3e2582b3b80/DEPS

Cc: yolandyan@chromium.org
Project Member

Comment 8 by bugdroid1@chromium.org, Jul 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/127638074ab13e80c4ac671ac57440216e141f92

commit 127638074ab13e80c4ac671ac57440216e141f92
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Wed Jul 19 00:37:40 2017

Roll src/third_party/catapult/ 91e883303..a4770ef48 (3 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/91e883303891..a4770ef48621

$ git log 91e883303..a4770ef48 --date=short --no-merges --format='%ad %ae %s'
2017-07-18 simonhatch Pinpoint - Make Pinpoint's names match dashboards.
2017-07-18 bpastene devil: When moving a device path before deletion, capture cmd failures.
2017-07-18 dproy Added flags to generate js from arbitrary config for vulcanization

Created with:
  roll-dep src/third_party/catapult
BUG=739899



CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I7aad7fcaeb4b481399d6e43e7e3557012b0b62b4
Reviewed-on: https://chromium-review.googlesource.com/576710
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#487692}
[modify] https://crrev.com/127638074ab13e80c4ac671ac57440216e141f92/DEPS

Blocking: 748145
Project Member

Comment 10 by bugdroid1@chromium.org, Jul 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/cd53f6cf990d519979e90ee3c6d446b5a9910d53

commit cd53f6cf990d519979e90ee3c6d446b5a9910d53
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Sat Jul 29 02:07:25 2017

Roll src/third_party/catapult/ 8b13c6e36..5f2552985 (5 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/8b13c6e36285..5f255298555b

$ git log 8b13c6e36..5f2552985 --date=short --no-merges --format='%ad %ae %s'
2017-07-28 loloangela Fixed errors related to bad-continuation pt.36
2017-07-28 nednguyen cros_browser_backend.IsBrowserRunning return False immediately if cri instance is None
2017-07-28 benjhayden Draw flow events with the correct color.
2017-07-28 adithyas Update Blink RCS Total Time
2017-07-28 jbudorick [devil] Run lsof after mkdir exits with EBUSY.

Created with:
  roll-dep src/third_party/catapult
BUG=750243, 724543 ,739899



CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I134f7124097572b2b25f135660ee726e29847d66
Reviewed-on: https://chromium-review.googlesource.com/592344
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#490581}
[modify] https://crrev.com/cd53f6cf990d519979e90ee3c6d446b5a9910d53/DEPS
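
For anyone reading the lsof dumps after an EBUSY, here is a minimal sketch of picking out the lines that mention the busy directory. It is not the code from the devil change. `lines_mentioning_path` is a hypothetical helper, and it assumes only that lsof prints one open file per line with the path as a whitespace-separated field; the exact column layout varies between Android versions. Filtering on the host also avoids depending on grep's exit status, which is 1 when nothing matches.

```python
def lines_mentioning_path(lsof_output, path):
    """Return lsof output lines naming path or anything beneath it.

    Hypothetical helper: it matches whole whitespace-separated fields, so a
    sibling such as "<path>_other" is not reported as a match.
    """
    prefix = path.rstrip('/')
    matches = []
    for line in lsof_output.splitlines():
        for token in line.split():
            if token == prefix or token.startswith(prefix + '/'):
                matches.append(line)
                break
    return matches
```

An empty result is still useful information: it suggests nothing visible to lsof held the directory at the moment the dump was taken.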

Project Member

Comment 11 by bugdroid1@chromium.org, Aug 1 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/74e536005469d44c79e9de94beb09c039fbb826b

commit 74e536005469d44c79e9de94beb09c039fbb826b
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Aug 01 21:16:14 2017

android: Rename device data dir before removing it in gtests.

Might help with device flakes like:
https://chromium-swarm.appspot.com/task?id=37320a4199090410

Bug: 739899, 748145
Change-Id: I953d60aac73194243b5ae4fd9d875c7833977a1d
Reviewed-on: https://chromium-review.googlesource.com/574872
Commit-Queue: Benjamin Pastene <bpastene@chromium.org>
Reviewed-by: Michael Case <mikecase@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#491113}
[modify] https://crrev.com/74e536005469d44c79e9de94beb09c039fbb826b/build/android/pylib/local/device/local_device_gtest_run.py
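
A minimal sketch of the rename-before-remove idea, assuming nothing about how devil or local_device_gtest_run.py actually implement it. `remove_via_rename` and `run_shell` are hypothetical names; `run_shell` takes an argv list and is assumed to raise if the command exits non-zero. The point is that the original path is free as soon as the `mv` returns, so a later `mkdir -p` of the same name no longer depends on the deletion having finished.

```python
import posixpath
import uuid


def remove_via_rename(run_shell, device_path):
    """Rename device_path to a unique sibling name, then delete that sibling.

    run_shell is a hypothetical callable (an assumption, not a devil API)
    that runs an argv list on the device and raises on failure.
    Returns the temporary path that was deleted.
    """
    parent, name = posixpath.split(device_path.rstrip('/'))
    # A random suffix keeps repeated cleanups from colliding with each other.
    doomed = posixpath.join(parent, '%s.deleted-%s' % (name, uuid.uuid4().hex))
    run_shell(['mv', device_path, doomed])
    run_shell(['rm', '-rf', doomed])
    return doomed
```

If the `mv` itself fails, for instance because the directory does not exist yet, the caller has to decide whether that is an error; this sketch just lets the exception propagate.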

Still happening:
https://chromium-swarm.appspot.com/task?id=37bcefa47007a010
https://chromium-swarm.appspot.com/task?id=37bcefbf8f53d910

This seems to be the most common source of device flake atm.
Apparently this is fixed in later versions of Android: https://android.googlesource.com/platform/system/core/+/c535312

Upgrading all of our devices to a later version that includes that fix is a bit heavy-handed. Fortunately, I think the initial process that opens the handles is the swarming_bot's pre-test device cleanup:
https://chrome-internal.googlesource.com/infradata/config/+/5fdfb2cf540fd718cf715ccf510b631d5f363429/configs/chromium-swarm-dev/scripts/android.py#459

Currently that's run before and after every test, which is a bit redundant/wasteful. I'll change it to only clean up after tests, which will hopefully unblock the test runner from mkdir'ing on the sdcard all it wants.
Project Member

Comment 14 by bugdroid1@chromium.org, Aug 8 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/d2aefb6d321d8f24eba47bbaf185214d1f164c63

commit d2aefb6d321d8f24eba47bbaf185214d1f164c63
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Aug 08 18:43:55 2017

Project Member

Comment 15 by bugdroid1@chromium.org, Aug 8 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/182ff74d29ba7f082d8311fc74821ee9faa2a080

commit 182ff74d29ba7f082d8311fc74821ee9faa2a080
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Aug 08 21:34:46 2017

Project Member

Comment 16 by bugdroid1@chromium.org, Sep 25 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9d3d471e40a1d3a2017c0e39b68a26ebeb63a6ae

commit 9d3d471e40a1d3a2017c0e39b68a26ebeb63a6ae
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Mon Sep 25 22:31:42 2017

Roll src/third_party/catapult/ 2d5148d57..f7cc2170e (4 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/2d5148d57e6e..f7cc2170e1ab

$ git log 2d5148d57..f7cc2170e --date=short --no-merges --format='%ad %ae %s'
2017-09-25 rnephew [Telemetry]  Let --also-run-disabled-tests override StoryExpectations.DisableBenchmark
2017-09-25 rnephew [Telemetry] Remove Enabled/Disabled decorators from benchmark.py
2017-09-25 bpastene devil: Log output when forwarder has trouble exiting.
2017-09-25 bpastene devil: Don't check exit code when grep'ing through lsof.

Created with:
  roll-dep src/third_party/catapult
BUG=713222,767956,739899



CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I6b80c0705eb3457b30d9cb36480bb46a01b9644d
Reviewed-on: https://chromium-review.googlesource.com/682772
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#504190}
[modify] https://crrev.com/9d3d471e40a1d3a2017c0e39b68a26ebeb63a6ae/DEPS

Components: Infra>Client>Chrome
Labels: OS-Android
Moving Infra>Client>Android -> Infra>Client>Chrome+OS=Android
Components: -Infra>Client>Android
