Android Tests bot occasionally fails telemetry_perf_unittests BattOrCases |
|||||||||||||||||
Issue descriptionhttps://uberchromegw.corp.google.com/i/chromium.linux/builders/Android%20Tests/builds/27146/steps/telemetry_perf_unittests%20on%20Android/logs/stdio [27/177] benchmarks.benchmark_smoke_unittest.BenchmarkSmokeTest.BattOr.BattOrCases failed unexpectedly 8.4077s
,
Jun 8 2016
,
Jun 8 2016
in the log: seeing both 'device not found' and exceptions on Timeout threads, however that happens for other tests as well, not only for the 'BattOrCases', the latter is listed as failing in the end though. AdbCommandFailedError: (device: 06074a7c006236b6) adb shell '( pm path org.chromium.chrome );echo %$?': failed with exit status 255 and output: - error: device '06074a7c006236b6' not found [26/177] benchmarks.benchmark_smoke_unittest.BenchmarkSmokeTest.page_cycler.typical_25 queued [27/177] benchmarks.benchmark_smoke_unittest.BenchmarkSmokeTest.BattOr.BattOrCases failed unexpectedly 8.4077s:
,
Jun 8 2016
looks like this has been added recently (a week or so ago): https://codereview.chromium.org/1984473002/ going to disable on android for now, adding the author of the above patch +cc: rnephew@chromium.org
,
Jun 8 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e7e948c11a4f4d927ce6c92a5205d6dbd4660345 commit e7e948c11a4f4d927ce6c92a5205d6dbd4660345 Author: timvolodine <timvolodine@chromium.org> Date: Wed Jun 08 16:08:39 2016 [Android] Disable flaky BattOrCases telemetry_perf_unittest Flakes on Android Tests bot. https://uberchromegw.corp.google.com/i/chromium.linux/builders/Android%20Tests BUG= 618330 CQ_EXTRA_TRYBOTS=tryserver.chromium.perf:android_s5_perf_cq;tryserver.chromium.perf:winx64_10_perf_cq;tryserver.chromium.perf:mac_retina_perf_cq NOTRY=true TBR=nednguyen@google.com,boliu@chromium.org Review-Url: https://codereview.chromium.org/2049113002 Cr-Commit-Position: refs/heads/master@{#398583} [modify] https://crrev.com/e7e948c11a4f4d927ce6c92a5205d6dbd4660345/tools/perf/benchmarks/battor.py
,
Jun 10 2016
cc-ing stip since we recently changed telemetry_perf_unittests.
,
Jun 10 2016
I'd guess this is probably not due to the swarming transition.
,
Jun 10 2016
Im wondering if it has something to do with this: WARNING:root:Dependency battor_agent_binary not configured for platform linux_x86_64. Skipping prefetch. But I would expect it to fail much differently than it is manifesting here in this case; either way this needs to be fixed but I dont expect it to be the route cause. I wonder if the real cause of the failure is being swallowed somewhere though.
,
Jun 10 2016
Sorry - just saw this. This is definitely due to some failure in the power-infra code, not the swarming transition.
We have some code that detects whether a power monitoring device ("BattOr") is present, and only enables our benchmark if it is. It looks like this code is itself crashing, causing problems.
,
Jun 10 2016
Assigning this to rnephew@, who's agreed to look into what we could be doing better here
,
Jun 10 2016
,
Jun 10 2016
Anyone have any input on why it would not be retrying this test? (Deleting unimportant lines) Retrying failed tests (attempt #1 of 3)... ... [1/22] benchmarks.benchmark_smoke_unittest.BenchmarkSmokeTest.BattOr.BattOrCases was skipped 2.8147s ... All 21 other tests retry. The cause of this issue is that telemetry is getting a device not responding error before it gets to the point where it determines if a battor is present and should continue with the run. Then no retries happen because in later attempts it detects that the test should be skipped. The problem is not with the battor benchmark itself, as the device offline error occurs during the FindBrowser step of a telemetry run. The fix here is either to change how it is determined when this test runs (it is currently determined here for the battor test): https://cs.chromium.org/chromium/src/tools/perf/benchmarks/battor.py?dr=CSs&q=ShouldDisable+battor&sq=package:chromium&l=22 or make changes to how failures are detected to detect this edge case. Adding Ned on guidance for option one.
,
Jun 10 2016
Ignore the first line of the last comment, I was writing it while I investigated and didn't clean up that line. I know why its not retrying, as the rest of the comment says.
,
Jun 13 2016
Thanks for your analysis, Randy. The unittest benchmark_smoke_unittest.BenchmarkSmokeTest.BattOr.BattOrCases though is generated dynamically through https://cs.chromium.org/chromium/src/tools/perf/benchmarks/benchmark_smoke_unittest.py?rcl=0&l=76 So this seems like a problem with typ does not cache the test case that was dynamically generated?
,
Jun 20 2016
/bump Is this still happening? Any update on where we're at in fixing it?
,
Jun 20 2016
It was disabled, and I found out WHY it was being flaky. I haven't done any work on fixing it because I am not sure that is the correct fix for it.
,
Jul 27 2016
We are bringing the android fyi bot online now, and it would be nice if we could re-enable these tests on android. Ned said the problem was with typ not caching the test cases; could someone with more knowledge on typ take this over? If no one is available that is familiar with typ I can take it back and work on it; it would just probably be faster if someone familiar took a look.
,
Jul 28 2016
I can try to at least take a quick look at this tomorrow, and I should definitely have some time to look at this next week if need be.
,
Jul 28 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3bfff97f65f489046098d9b4fd60ecb99c7faa11 commit 3bfff97f65f489046098d9b4fd60ecb99c7faa11 Author: rnephew <rnephew@chromium.org> Date: Thu Jul 28 17:32:06 2016 [Android] Reenable BattOr tests for android. BUG=624962, 618330 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.perf:android_s5_perf_cq;master.tryserver.chromium.perf:linux_perf_cq;master.tryserver.chromium.perf:mac_retina_perf_cq;master.tryserver.chromium.perf:winx64_10_perf_cq Review-Url: https://codereview.chromium.org/2187183002 Cr-Commit-Position: refs/heads/master@{#408421} [modify] https://crrev.com/3bfff97f65f489046098d9b4fd60ecb99c7faa11/tools/perf/benchmarks/battor.py [modify] https://crrev.com/3bfff97f65f489046098d9b4fd60ecb99c7faa11/tools/perf/benchmarks/benchmark_smoke_unittest.py
,
Jul 31 2016
Okay, I looked over this bug again, and I'm now realizing I don't actually know what is supposedly wrong with typ. @nednguyen: why did you say that "this seems like a problem with typ does not cache the test case that was dynamically generated"? From me looking at the code now, the "was skipped" result shows up when typ invokes a test, and it calls the TestCase.skip() method. So, it sounds like we retried the test, but the test itself decided it should be skipped. What would you expect to have happened?
,
Jul 31 2016
Hi Dirk, The test benchmarks.benchmark_smoke_unittest.BenchmarkSmokeTest.BattOr.BattOrCase is supposed to be skipped when device does not has battor (See https://cs.chromium.org/chromium/src/tools/perf/benchmarks/battor.py?rcl=0&l=27 & https://cs.chromium.org/chromium/src/tools/perf/benchmarks/benchmark_smoke_unittest.py?rcl=0&l=75) According to Randy's comment in #12, the test BattOr.BattOrCase is first run (which probable means the logic of determining whether the device has battor may be flaky), but failed. Then when typ retried the test, the test is skipped. So to summarize: BattOr.BattOrCases run --> Failed BattOr.BattOrCases run again in retries --> Skipped All other failed test passed when retried. Typ consider the whole suite as failed. Based on this, the failure is caused by 2 bugs: 1) The check whether the device has battor device attached is flaky --> lead to flaky "skip vs not skip" state of the test case. 2) Even though the test case already skipped in retries, Typ still determines the test suite as failed. I think the fix here include both 1) & 2)
,
Jul 31 2016
Ah, I see. So, the test failed once, and never actually passed. Okay, let me think about how to handle this.
,
Jul 31 2016
Okay, after having thought about this for a while, it's not actually obvious to me that typ is doing the wrong thing. After all, the test did fail once, and never actually passed. However, I don't really feel that strongly about this, and there is a sense in which I can see how we should treat "failed then was skipped" like "failed then passed" and so in the sense of "the test didn't always fail" we might consider it to pass. So, I've posted a CL that changes things to accomodate that: https://codereview.chromium.org/2201663002/ It still feels a little weird to me, so I'm not sure I want to land it. Ned, what do you think?
,
Aug 1 2016
If I recall correctly, it is failing _before_ it gets to the part of code that detects a battor because of device flakiness. At least thats what I wrote in comment #12.
,
Aug 2 2016
Dirk: I think the interpretation "failed then was skipped" yields similar outcome to "failed then passed" is the elegant way to explain this decision. I think the weirdness come from the fact that the skip signal is issued dynamically during the test time instead of before test run time. #24: Sorry for missing this. But this shows that 1) in #21 is still a problem that need to be fixed, whereas 2) is probably not a problem. This makes it even a stronger case for making the fix in #23. Assign to Dirk since he's initiated the fix.
,
Aug 2 2016
to be clear, my fix addresses 2), not 1).
,
Aug 2 2016
Sorry for the mixed up, dpranke's fix addresses 2, and not 1) in #21. Also 1) is an incorrect assessment. The test first failed because of the test android device failure, not because the battor discovering logic is flaky.
,
Aug 30 2016
,
Aug 30 2016
,
Aug 30 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/71c9ade4e91e16846779311f1ae800ed550c8b69 commit 71c9ade4e91e16846779311f1ae800ed550c8b69 Author: dpranke <dpranke@chromium.org> Date: Tue Aug 30 19:25:32 2016 Roll typ to v0.9.7. This picks up a change that will help address flaky telemetry failures. This includes the following changes (v0.9.5..v0.9.7): 79fe79db Change typ's interpretation of a test that first fails and is then skipped. 101acd31 Bump version to 0.9.6, clean up Python 3 failures 2f8787b8 rework run script to remove python3 log d5023636 rework sharding tests TBR=rnephew@chromium.org, nednguyen@google.com BUG= 618330 Review-Url: https://codereview.chromium.org/2289303002 Cr-Commit-Position: refs/heads/master@{#415392} [modify] https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69/third_party/typ/README.chromium [add] https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69/third_party/typ/codereview.settings [modify] https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69/third_party/typ/run [modify] https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69/third_party/typ/typ/json_results.py [modify] https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69/third_party/typ/typ/runner.py [modify] https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69/third_party/typ/typ/tests/main_test.py [modify] https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69/third_party/typ/typ/tests/pool_test.py [modify] https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69/third_party/typ/typ/version.py
,
Aug 30 2016
Okay, the typ fix has landed, so I'm punting this back to you telemetry folks :).
,
Aug 30 2016
I'm not sure there is anything to do from the telemetry side now. We have the test blacklisted from running as part of a smoke test; and there has been active discussion on those taking too long for the CQ.
,
Aug 31 2016
Dirk's fix should be enough if we re-enable the benchmark.
,
Sep 6 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1023620c13385de9fcea55ec35f66dc22b454ff8 commit 1023620c13385de9fcea55ec35f66dc22b454ff8 Author: dpranke <dpranke@chromium.org> Date: Tue Sep 06 01:51:33 2016 Revert of Roll typ to v0.9.7. (patchset #1 id:1 of https://codereview.chromium.org/2289303002/ ) Reason for revert: Let's see if reverting this helps w/ crbug.com/643320. Original issue's description: > Roll typ to v0.9.7. > > This picks up a change that will help address flaky telemetry failures. > > This includes the following changes (v0.9.5..v0.9.7): > > 79fe79db Change typ's interpretation of a test that first fails > and is then skipped. > 101acd31 Bump version to 0.9.6, clean up Python 3 failures > 2f8787b8 rework run script to remove python3 log > d5023636 rework sharding tests > > TBR=rnephew@chromium.org, nednguyen@google.com > BUG= 618330 > > Committed: https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69 > Cr-Commit-Position: refs/heads/master@{#415392} TBR=nednguyen@google.com,rnephew@chromium.org # Not skipping CQ checks because original CL landed more than 1 days ago. BUG= 618330 , 643320 Review-Url: https://codereview.chromium.org/2317473002 Cr-Commit-Position: refs/heads/master@{#416573} [modify] https://crrev.com/1023620c13385de9fcea55ec35f66dc22b454ff8/third_party/typ/README.chromium [delete] https://crrev.com/8211426fca927c616b819505cc564a2de12c7715/third_party/typ/codereview.settings [modify] https://crrev.com/1023620c13385de9fcea55ec35f66dc22b454ff8/third_party/typ/run [modify] https://crrev.com/1023620c13385de9fcea55ec35f66dc22b454ff8/third_party/typ/typ/json_results.py [modify] https://crrev.com/1023620c13385de9fcea55ec35f66dc22b454ff8/third_party/typ/typ/runner.py [modify] https://crrev.com/1023620c13385de9fcea55ec35f66dc22b454ff8/third_party/typ/typ/tests/main_test.py [modify] https://crrev.com/1023620c13385de9fcea55ec35f66dc22b454ff8/third_party/typ/typ/tests/pool_test.py [modify] https://crrev.com/1023620c13385de9fcea55ec35f66dc22b454ff8/third_party/typ/typ/version.py
,
Sep 8 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/00550736d827b1637ee38757d40f46a382953c79 commit 00550736d827b1637ee38757d40f46a382953c79 Author: nednguyen <nednguyen@google.com> Date: Thu Sep 08 22:09:01 2016 Reland of ll typ to v0.9.7. (patchset #1 id:1 of https://codereview.chromium.org/2317473002/ ) Reason for revert: This is probably not the root cause of crbug.com/643320 Original issue's description: > Revert of Roll typ to v0.9.7. (patchset #1 id:1 of https://codereview.chromium.org/2289303002/ ) > > Reason for revert: > Let's see if reverting this helps w/ crbug.com/643320. > > Original issue's description: > > Roll typ to v0.9.7. > > > > This picks up a change that will help address flaky telemetry failures. > > > > This includes the following changes (v0.9.5..v0.9.7): > > > > 79fe79db Change typ's interpretation of a test that first fails > > and is then skipped. > > 101acd31 Bump version to 0.9.6, clean up Python 3 failures > > 2f8787b8 rework run script to remove python3 log > > d5023636 rework sharding tests > > > > TBR=rnephew@chromium.org, nednguyen@google.com > > BUG= 618330 > > > > Committed: https://crrev.com/71c9ade4e91e16846779311f1ae800ed550c8b69 > > Cr-Commit-Position: refs/heads/master@{#415392} > > TBR=nednguyen@google.com,rnephew@chromium.org > # Not skipping CQ checks because original CL landed more than 1 days ago. > BUG= 618330 , 643320 > > Committed: https://crrev.com/1023620c13385de9fcea55ec35f66dc22b454ff8 > Cr-Commit-Position: refs/heads/master@{#416573} TBR=rnephew@chromium.org,nednguyen@chromium.org,dpranke@chromium.org # Not skipping CQ checks because original CL landed more than 1 days ago. BUG= 618330 , 643320 Review-Url: https://codereview.chromium.org/2328753002 Cr-Commit-Position: refs/heads/master@{#417415} [modify] https://crrev.com/00550736d827b1637ee38757d40f46a382953c79/third_party/typ/README.chromium [add] https://crrev.com/00550736d827b1637ee38757d40f46a382953c79/third_party/typ/codereview.settings [modify] https://crrev.com/00550736d827b1637ee38757d40f46a382953c79/third_party/typ/run [modify] https://crrev.com/00550736d827b1637ee38757d40f46a382953c79/third_party/typ/typ/json_results.py [modify] https://crrev.com/00550736d827b1637ee38757d40f46a382953c79/third_party/typ/typ/runner.py [modify] https://crrev.com/00550736d827b1637ee38757d40f46a382953c79/third_party/typ/typ/tests/main_test.py [modify] https://crrev.com/00550736d827b1637ee38757d40f46a382953c79/third_party/typ/typ/tests/pool_test.py [modify] https://crrev.com/00550736d827b1637ee38757d40f46a382953c79/third_party/typ/typ/version.py |
|||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||
Comment 1 by timvolod...@chromium.org
, Jun 8 2016