Android test harness flakily fails to mkdir when preparing devices for tests
Issue description

Examples:
https://chromium-swarm.appspot.com/task?id=3731e0bab5df4b10
https://chromium-swarm.appspot.com/task?id=3731e05dad5e4210
https://chromium-swarm.appspot.com/task?id=37321722ada5d910
https://chromium-swarm.appspot.com/task?id=37320a4199090410

Always with a stack trace that looks like:

Traceback (most recent call last):
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_environment.py", line 59, in wrapper
    return f(dev, *args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 52, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_gtest_run.py", line 317, in individual_device_set_up
    step()
  File "/b/swarming/w/ir/build/android/pylib/local/device/local_device_gtest_run.py", line 293, in push_test_data
    delete_device_stale=True)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 57, in timeout_retry_wrapper
    retry_if_func=retry_if_func)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 159, in Run
    error_log_func=error_log_func)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 186, in JoinAll
    self._JoinAll(watcher, timeout)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 158, in _JoinAll
    thread.ReraiseIfException()
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 81, in run
    self._ret = self._func(*self._args, **self._kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py", line 152, in <lambda>
    child_thread = reraiser_thread.ReraiserThread(lambda: func(*args, **kwargs),
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
    return f(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 1344, in PushChangedFiles
    self.RunShellCommand(['mkdir', '-p'] + missing_dirs, check_return=True)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 51, in timeout_retry_wrapper
    return impl()
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py", line 47, in impl
    return f(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 1025, in RunShellCommand
    output = handle_large_output(cmd, large_output)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 994, in handle_large_output
    return handle_large_command(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 976, in handle_large_command
    return handle_check_return(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 967, in handle_check_return
    return run(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/device_utils.py", line 963, in run
    return self.adb.Shell(cmd)
  File "/b/swarming/w/ir/third_party/catapult/devil/devil/android/sdk/adb_wrapper.py", line 505, in Shell
    command, output, status=status, device_serial=self._device_serial)
AdbShellCommandFailedError: (device: 070e3ce823c2c656) shell command run via adb failed on the device:
  command: mkdir -p /storage/emulated/legacy/chromium_tests_root
  exit status: 255
  output:
    - mkdir failed for /storage/emulated/legacy/chromium_tests_root/, Device or resource busy

These types of failures become more troublesome when there's only one available device. Maybe there should be a sleep or wait-for-device in between the rm and mkdir:
https://cs.chromium.org/chromium/src/build/android/pylib/local/device/local_device_gtest_run.py?rcl=b5820db5ef26d58bcdfe1ce6dcfb77130a8f879a&l=296
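As a rough illustration of the "sleep in between" idea from the description, the following is a minimal sketch and not devil's actual code. It assumes a hypothetical `run_shell` callable that runs a command on the device and raises a hypothetical `ShellCommandError` (a stand-in for `AdbShellCommandFailedError`) carrying the command output. It retries `mkdir -p` only when the output contains the "Device or resource busy" message seen in the trace.

```python
import time


class ShellCommandError(Exception):
    """Hypothetical stand-in for devil's AdbShellCommandFailedError."""

    def __init__(self, output):
        super().__init__(output)
        self.output = output


def mkdir_with_retry(run_shell, path, attempts=3, delay_s=1.0,
                     sleep=time.sleep):
    """Run `mkdir -p path`, sleeping and retrying while the device is busy.

    run_shell is an assumed callable that takes an argv list, returns the
    command output, and raises ShellCommandError on a non-zero exit status.
    """
    for attempt in range(1, attempts + 1):
        try:
            return run_shell(['mkdir', '-p', path])
        except ShellCommandError as e:
            busy = 'Device or resource busy' in e.output
            # Re-raise anything that is not EBUSY, and the final EBUSY too.
            if not busy or attempt == attempts:
                raise
            sleep(delay_s)
```

Whether a fixed delay is enough depends on what is holding the directory open, so this would only paper over the flake rather than remove its cause.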
Jul 6 2017
I'm not sure it's the rm & the mkdir; the stack trace is coming up from https://codesearch.chromium.org/chromium/src/third_party/catapult/devil/devil/android/device_utils.py?rcl=6539cc70d9e1b8b9c67a3b7ff6a1a0da7a5a961d&l=1344
Jul 6 2017
It can fail at a number of mkdir call sites. https://chromium-swarm.appspot.com/task?id=37320a4199090410 being another one. I'll see if I can't repro locally.
Jul 6 2017
ah, fair enough. You mentioned a wait-for-device; I wonder how sd_card_ready (https://codesearch.chromium.org/chromium/src/third_party/catapult/devil/devil/android/device_utils.py?rcl=6539cc70d9e1b8b9c67a3b7ff6a1a0da7a5a961d&l=665) would behave in this scenario.
Jul 8 2017
Found b/20096164, which sounds very similar. I might want to implement a similar workaround.

A few more flakes:
https://chromium-swarm.appspot.com/task?id=3737ba828026a410
https://chromium-swarm.appspot.com/task?id=3735e7a8d43ba810
https://chromium-swarm.appspot.com/task?id=3732c3089f8ccc10
https://chromium-swarm.appspot.com/task?id=3732bae760d9d310
Jul 13 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/49743d172b736ffd619215c32b6eb3e2582b3b80

commit 49743d172b736ffd619215c32b6eb3e2582b3b80
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Thu Jul 13 01:41:01 2017

Roll src/third_party/catapult/ 6c40c273a..1286055c7 (8 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/6c40c273a7fe..1286055c7baf

$ git log 6c40c273a..1286055c7 --date=short --no-merges --format='%ad %ae %s'
2017-07-12 adithyas Add table for Blink RuntimeCallStats
2017-07-12 bpastene devil: Add a device_utils method to mv-then-rm a device file.
2017-07-12 nednguyen Rename webpagereplay_go.py to webpagereplay_go_server.py
2017-07-12 nednguyen Add WPR go binary for Mac platform
2017-07-12 bpastene devil: When pushing data deps to device, don't mkdir redundantly.
2017-07-12 simonhatch Dashboard - Fix test-picker add button never un-greying
2017-07-12 simonhatch Dashboard - Add api_request_handler
2017-07-12 etienneb Fix symbolisation for address out-of-range

Created with:
  roll-dep src/third_party/catapult
BUG= 724543 ,739899,739899, 739783

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I114d56bf15f445060eb48951a120fd99819e0024
Reviewed-on: https://chromium-review.googlesource.com/569138
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#486207}

[modify] https://crrev.com/49743d172b736ffd619215c32b6eb3e2582b3b80/DEPS
Jul 14 2017
Jul 19 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/127638074ab13e80c4ac671ac57440216e141f92

commit 127638074ab13e80c4ac671ac57440216e141f92
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Wed Jul 19 00:37:40 2017

Roll src/third_party/catapult/ 91e883303..a4770ef48 (3 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/91e883303891..a4770ef48621

$ git log 91e883303..a4770ef48 --date=short --no-merges --format='%ad %ae %s'
2017-07-18 simonhatch Pinpoint - Make Pinpoint's names match dashboards.
2017-07-18 bpastene devil: When moving a device path before deletion, capture cmd failures.
2017-07-18 dproy Added flags to generate js from arbitrary config for vulcanization

Created with:
  roll-dep src/third_party/catapult
BUG=739899

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I7aad7fcaeb4b481399d6e43e7e3557012b0b62b4
Reviewed-on: https://chromium-review.googlesource.com/576710
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#487692}

[modify] https://crrev.com/127638074ab13e80c4ac671ac57440216e141f92/DEPS
Jul 24 2017
Jul 29 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/cd53f6cf990d519979e90ee3c6d446b5a9910d53

commit cd53f6cf990d519979e90ee3c6d446b5a9910d53
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Sat Jul 29 02:07:25 2017

Roll src/third_party/catapult/ 8b13c6e36..5f2552985 (5 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/8b13c6e36285..5f255298555b

$ git log 8b13c6e36..5f2552985 --date=short --no-merges --format='%ad %ae %s'
2017-07-28 loloangela Fixed errors related to bad-continuation pt.36
2017-07-28 nednguyen cros_browser_backend.IsBrowserRunning return False immediately if cri instance is None
2017-07-28 benjhayden Draw flow events with the correct color.
2017-07-28 adithyas Update Blink RCS Total Time
2017-07-28 jbudorick [devil] Run lsof after mkdir exits with EBUSY.

Created with:
  roll-dep src/third_party/catapult
BUG=750243, 724543 ,739899

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I134f7124097572b2b25f135660ee726e29847d66
Reviewed-on: https://chromium-review.googlesource.com/592344
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#490581}

[modify] https://crrev.com/cd53f6cf990d519979e90ee3c6d446b5a9910d53/DEPS
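The "[devil] Run lsof after mkdir exits with EBUSY" change in this roll is a diagnostic step. The sketch below shows the general shape of that idea and is not the devil implementation: `run_shell`, `ShellCommandError`, and `mkdir_reporting_busy_handles` are hypothetical names, and the filtering of `lsof` output is done on the host as a simplifying assumption.

```python
class ShellCommandError(Exception):
    """Hypothetical stand-in for devil's AdbShellCommandFailedError."""

    def __init__(self, output):
        super().__init__(output)
        self.output = output


def mkdir_reporting_busy_handles(run_shell, path, log=print):
    """Run `mkdir -p path`; on EBUSY, log lsof lines that mention path.

    run_shell is an assumed callable that takes an argv list, returns the
    command output as a string, and raises ShellCommandError on failure.
    The original error is always re-raised after logging.
    """
    try:
        return run_shell(['mkdir', '-p', path])
    except ShellCommandError as e:
        if 'Device or resource busy' in e.output:
            # List open files on the device and keep only entries under path.
            for line in run_shell(['lsof']).splitlines():
                if path in line:
                    log(line)
        raise
```

Logging the open handles does not fix the failure, but it can show which process is holding the directory when the flake occurs.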
Aug 1 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/74e536005469d44c79e9de94beb09c039fbb826b

commit 74e536005469d44c79e9de94beb09c039fbb826b
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Aug 01 21:16:14 2017

android: Rename device data dir before removing it in gtests.

Might help with device flakes like:
https://chromium-swarm.appspot.com/task?id=37320a4199090410

Bug: 739899, 748145
Change-Id: I953d60aac73194243b5ae4fd9d875c7833977a1d
Reviewed-on: https://chromium-review.googlesource.com/574872
Commit-Queue: Benjamin Pastene <bpastene@chromium.org>
Reviewed-by: Michael Case <mikecase@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#491113}

[modify] https://crrev.com/74e536005469d44c79e9de94beb09c039fbb826b/build/android/pylib/local/device/local_device_gtest_run.py
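The rename-before-remove approach named in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the code that landed: `run_shell` is a hypothetical callable that runs an argv list on the device, and the `.stale-<hex>` suffix is an invented naming scheme.

```python
import uuid


def move_then_remove(run_shell, path):
    """Rename path to a unique sibling name, then delete the renamed copy.

    run_shell is an assumed callable that takes an argv list and runs it on
    the device. Returns the temporary name that was used.
    """
    # A unique suffix avoids colliding with leftovers from earlier runs.
    trash = '%s.stale-%s' % (path.rstrip('/'), uuid.uuid4().hex)
    run_shell(['mv', path, trash])
    run_shell(['rm', '-rf', trash])
    return trash
```

The intent is that the original path is vacated by the `mv` before the slower `rm -rf` runs, so a following `mkdir -p` on that path is less likely to race with the deletion. As the commit message says, this only "might help".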
Aug 2 2017
Still happening:
https://chromium-swarm.appspot.com/task?id=37bcefa47007a010
https://chromium-swarm.appspot.com/task?id=37bcefbf8f53d910

This seems to be the most common source of device flake atm.
Aug 2 2017
Apparently this is fixed in later versions of Android:
https://android.googlesource.com/platform/system/core/+/c535312

Upgrading all of our devices to a later version that includes that fix is a bit heavy-handed. Fortunately, I think the initial process that opens the handles is the swarming_bot's pre-test device cleanup:
https://chrome-internal.googlesource.com/infradata/config/+/5fdfb2cf540fd718cf715ccf510b631d5f363429/configs/chromium-swarm-dev/scripts/android.py#459

Currently that cleanup runs both before and after every test, which is a bit redundant/wasteful. I'll change it to only clean up after tests, which will hopefully unblock the test runner from mkdir'ing on the sdcard all it wants.
Aug 8 2017
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infradata/config/+/d2aefb6d321d8f24eba47bbaf185214d1f164c63

commit d2aefb6d321d8f24eba47bbaf185214d1f164c63
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Aug 08 18:43:55 2017
Aug 8 2017
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infradata/config/+/182ff74d29ba7f082d8311fc74821ee9faa2a080

commit 182ff74d29ba7f082d8311fc74821ee9faa2a080
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Aug 08 21:34:46 2017
Sep 25 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/9d3d471e40a1d3a2017c0e39b68a26ebeb63a6ae

commit 9d3d471e40a1d3a2017c0e39b68a26ebeb63a6ae
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Mon Sep 25 22:31:42 2017

Roll src/third_party/catapult/ 2d5148d57..f7cc2170e (4 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/2d5148d57e6e..f7cc2170e1ab

$ git log 2d5148d57..f7cc2170e --date=short --no-merges --format='%ad %ae %s'
2017-09-25 rnephew [Telemetry] Let --also-run-disabled-tests override StoryExpectations.DisableBenchmark
2017-09-25 rnephew [Telemetry] Remove Enabled/Disabled decorators from benchmark.py
2017-09-25 bpastene devil: Log output when forwarder has trouble exiting.
2017-09-25 bpastene devil: Don't check exit code when grep'ing through lsof.

Created with:
  roll-dep src/third_party/catapult
BUG=713222,767956,739899

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I6b80c0705eb3457b30d9cb36480bb46a01b9644d
Reviewed-on: https://chromium-review.googlesource.com/682772
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#504190}

[modify] https://crrev.com/9d3d471e40a1d3a2017c0e39b68a26ebeb63a6ae/DEPS
Dec 4 2017
Moving Infra>Client>Android -> Infra>Client>Chrome+OS=Android
Dec 4 2017
Comment 1 by bpastene@chromium.org, Jul 6 2017