Reduce false rejects of android-marshmallow-arm64-rel to 0. |
|||||
Issue descriptionIn the period from 9/23-9/25, there have only been 2 false rejects: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/android-marshmallow-arm64-rel/92575 https://ci.chromium.org/p/chromium/builders/luci.chromium.try/android-marshmallow-arm64-rel/92559 Both had the same signature: (1) test shard fails at very beginning due to "NoDevicesError: No devices attached." (2) retries disabled due to "test results TESTS FAILED; retries without patch disabled (build config changes detected)" (1) implies that the swarming bot self-quarantine script is not working as well as it could -- no tests were even run by the shard -- it just immediately detected that no devices were available and failed. (2) shows why 'retry with patch' was never run. In this case, config changes should not prevent 'retry with patch' from running, only 'retry without patch'.
,
Sep 25
Yeah, (2) seems fine. Re (1): note that swarming uses a fork of python-adb that we maintain to communicate with devices, while all of our tasks use regular ol' adb. It's possible that either one can interact with a device while the other can't, or that the handoff between the two gets fumbled on occasion.
,
Sep 25
+ maruel, bpastene maruel -- do you have context for c#2? Why do we have a fork of python-adb?
,
Sep 25
I forked python-adb a long time ago as the upstream version wasn't fault-tolerant, and I needed to support parallel channels. I tried to upstream back a bit but it was too much work for the value, so I gave up. adb (the executable) was notoriously crappy, even worse than the python implementation, so we decided to use the python driver instead a few years ago.
,
Sep 25
> "upstream version wasn't fault-tolerant" Please clarify? > I needed to support parallel channels What is this for? I'm asking because we currently use adb for the tasks themselves, and having this divergence is potentially a cause of inconsistencies in quarantining.
,
Sep 25
Upstream is https://github.com/google/python-adb
,
Sep 25
and parallel channels is the capability of running commands and streaming data simultaneously. Or running two commands simultaneously. It's an internal details.
,
Sep 25
How strongly would you feel about using regular adb to do the quarantine check from swarming get_state()?
,
Sep 25
Can you now run adb without starting a daemon? And is it possible to shut it down nicely? That was the big problem before, i.e. it would hang randomly.
,
Sep 25
> How strongly would you feel about using regular adb to do the quarantine check from swarming get_state()? I'm all for it. It's been something that I've wanted to do for ages but never had the time. Though, properly making the switch would be a tricky and non-trivial infrastructure change, so I'd like to see a doc or detailed design before it gets implemented.
,
Sep 25
To be clear, I have zero stake in this, I don't mind. Other internal adb users would need to be looped it first. I think they would appreciate too. Anyhow, this is a blocker to switch the code to Go, so that's one additional value out of this.
,
Sep 27
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/28a20f7af4857c42ab604f70ee32ce632f99d5d1 commit 28a20f7af4857c42ab604f70ee32ce632f99d5d1 Author: erikchen <erikchen@chromium.org> Date: Thu Sep 27 22:25:33 2018 Remove option for dont_deapply_patch. It's not used anywhere. Bug: 889181 Change-Id: I3ee579b2a9301914a1e8a5a4a37151054e7f17ad Reviewed-on: https://chromium-review.googlesource.com/1249449 Reviewed-by: Stephen Martinis <martiniss@chromium.org> Commit-Queue: Erik Chen <erikchen@chromium.org> [modify] https://crrev.com/28a20f7af4857c42ab604f70ee32ce632f99d5d1/scripts/slave/README.recipes.md [delete] https://crrev.com/2a3dc385fcf80d455667d7e92072d8bf69f4a838/scripts/slave/recipes/chromium_trybot.expected/dont_deapply_patch.json [delete] https://crrev.com/2a3dc385fcf80d455667d7e92072d8bf69f4a838/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.expected/disable_deapply_patch_recipes.json [modify] https://crrev.com/28a20f7af4857c42ab604f70ee32ce632f99d5d1/scripts/slave/recipe_modules/chromium_tests/api.py [modify] https://crrev.com/28a20f7af4857c42ab604f70ee32ce632f99d5d1/scripts/slave/recipes/chromium_trybot.py [modify] https://crrev.com/28a20f7af4857c42ab604f70ee32ce632f99d5d1/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.py
,
Sep 28
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/24cc1abdc36f3697c0e2d0935555204a6577a036 commit 24cc1abdc36f3697c0e2d0935555204a6577a036 Author: erikchen <erikchen@chromium.org> Date: Fri Sep 28 16:02:45 2018 Allow 'retry with patch' steps for recipe config changes. Recipe config changes prevent 'retry without patch' steps from running. However, there's no reason for them to prevent 'retry with patch' steps from running. Change-Id: I2c86d5492d4e49dc2e8d92339830563418243242 Bug: 889181 Reviewed-on: https://chromium-review.googlesource.com/1249643 Commit-Queue: Erik Chen <erikchen@chromium.org> Reviewed-by: Stephen Martinis <martiniss@chromium.org> [modify] https://crrev.com/24cc1abdc36f3697c0e2d0935555204a6577a036/scripts/slave/README.recipes.md [modify] https://crrev.com/24cc1abdc36f3697c0e2d0935555204a6577a036/scripts/slave/recipe_modules/chromium_tests/api.py [modify] https://crrev.com/24cc1abdc36f3697c0e2d0935555204a6577a036/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.expected/disable_deapply_patch_affected_files.json [modify] https://crrev.com/24cc1abdc36f3697c0e2d0935555204a6577a036/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.expected/nonzero_exit_code_no_gtest_output.json
,
Oct 4
,
Dec 4
,
Dec 14
|
|||||
►
Sign in to add a comment |
|||||
Comment 1 by martiniss@chromium.org
, Sep 25