telemetry_perf_unittests failing on chromium.linux/Linux Tests (dbg)(1)
Issue description
Filed by sheriff-o-matic@appspot.gserviceaccount.com on behalf of waffles@google.com

telemetry_perf_unittests failing on chromium.linux/Linux Tests (dbg)(1)

Builders failed on:
- Linux Tests (dbg)(1): https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29
,
Oct 10
Fixed by reverting https://chromium-review.googlesource.com/c/chromium/src/+/1264657 , though I'm not sure that change was really the root cause; it may just have made an existing issue symptomatic.
,
Oct 12
telemetry_perf_unittests is still failing. Reopening this bug. Ned, could you help here?
,
Oct 12
,
Oct 12
I suspect that this is due to the addition of a new system health story, 'load:news:cnn:2018', which caused timeouts. For now, I will remove 'load:news:cnn' from smoke testing to reduce the load.
,
Oct 12
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/649252b3e7a77b0c276e705a6a76e9139f63d624

commit 649252b3e7a77b0c276e705a6a76e9139f63d624
Author: Ned Nguyen <nednguyen@google.com>
Date: Fri Oct 12 21:14:05 2018

Disable loads:news:cnn story smoke test

I suspect https://chromium-review.googlesource.com/c/1264642 adding load:news:cnn:2018 has made we run out of test timing budget on CQ (16 minutes per shard), and causing failure of telemetry_perf_unittest

Since 'load:news:cnn:2018' is to replace 'load:news:cnn', remove smoke testing 'load:news:cnn' to reduce test load

TBR=ulan@chromium.org
NOTRY=true # CQ flake
Bug: 893615
Change-Id: I8488aecf97cb25ff8ec9832c79bce7b10bc93c8d
Reviewed-on: https://chromium-review.googlesource.com/c/1277692
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#599354}

[modify] https://crrev.com/649252b3e7a77b0c276e705a6a76e9139f63d624/tools/perf/benchmarks/system_health_smoke_test.py
,
Oct 16
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/adf1f0297f4733ac55c72f659f3d387af471705d

commit adf1f0297f4733ac55c72f659f3d387af471705d
Author: Annie Sullivan <sullivan@chromium.org>
Date: Tue Oct 16 01:14:15 2018

Disable smoke testing for load:tools:stackoverflow story

This is deprecated and will be replaced by load:tools:stackoverflow:2018 story. Disabling now because we are hitting CQ capacity.

Bug: 893615
Change-Id: I29b839a0d8bfd007f36a1d09ca1e52a4620735cf
TBR: nednguyen@google.com
Reviewed-on: https://chromium-review.googlesource.com/c/1280617
Commit-Queue: Annie Sullivan <sullivan@chromium.org>
Reviewed-by: Annie Sullivan <sullivan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#599806}

[modify] https://crrev.com/adf1f0297f4733ac55c72f659f3d387af471705d/tools/perf/benchmarks/system_health_smoke_test.py
,
Oct 16
Thanks Annie for #7! Seems like we're good now :D
,
Oct 18
The tests are still failing regularly:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/75137
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/75136
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/75135

Could you take another look?
,
Oct 19
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/1436aab9d34916c3e985b795ab86dce8ccd4c039

commit 1436aab9d34916c3e985b795ab86dce8ccd4c039
Author: Ned Nguyen <nednguyen@google.com>
Date: Fri Oct 19 00:23:18 2018

Disable smoke testing of legacy system health stories which already have 2018 version

This also add check to ensure that legacy system health stories which already have a newer version must be disabled within smoke testing.

Finally, this CL also shorten the names used in _DISABLED to simplify the job of disabling a system health story.

Bug: 878390, 893615
Change-Id: Ife447c9e5935c960ea7e1818adfcaef0ccaeb59a
Reviewed-on: https://chromium-review.googlesource.com/c/1289686
Reviewed-by: Caleb Rouleau <crouleau@chromium.org>
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#600987}

[modify] https://crrev.com/1436aab9d34916c3e985b795ab86dce8ccd4c039/tools/perf/benchmarks/system_health_smoke_test.py
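
For readers following along, the check described in this commit message might look roughly like the sketch below. This is not the code from system_health_smoke_test.py. The function names, the year-suffix convention used to spot a newer version, and the shape of the disabled set are all assumptions made for illustration.

```python
import re

# Assumption: a "newer version" of a story is the same name plus a ":YYYY"
# suffix, e.g. 'load:news:cnn' -> 'load:news:cnn:2018'.
_YEAR_SUFFIX = re.compile(r':\d{4}$')


def legacy_stories_with_newer_version(story_names):
    """Return story names that lack a year suffix but have a suffixed twin."""
    names = set(story_names)
    versioned_bases = {
        _YEAR_SUFFIX.sub('', name)
        for name in names
        if _YEAR_SUFFIX.search(name)
    }
    return sorted(name for name in names if name in versioned_bases)


def find_undisabled_legacy_stories(story_names, disabled):
    """Return legacy stories that should be disabled in smoke tests but are not.

    `disabled` is a hypothetical set of story names, standing in for the
    _DISABLED collection mentioned in the commit message.
    """
    return [
        name
        for name in legacy_stories_with_newer_version(story_names)
        if name not in disabled
    ]


if __name__ == '__main__':
    stories = [
        'load:news:cnn',
        'load:news:cnn:2018',
        'load:tools:stackoverflow',
        'load:tools:stackoverflow:2018',
    ]
    # Prints the legacy stories that still need to be disabled.
    print(find_undisabled_legacy_stories(stories, {'load:news:cnn'}))
```

A check of this kind can run as a unit test and fail whenever the returned list is non-empty, so a legacy story cannot quietly stay in the smoke test set once its replacement lands.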
,
Oct 19
There was another timeout: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/75158
,
Oct 19
The root cause seems to be the multitab test taking too long (448.6656s). If the test sharding is unlucky, we end up with a shard that takes more than 16 minutes.
,
Oct 19
There are two additional problems here:
1) The timeouts are flaky, which means the test sharding logic is not deterministic.
2) The sharding is not very optimal when one of the shards times out.
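
To illustrate the two sharding concerns above, here is a minimal sketch of a deterministic, duration-aware assignment (greedy longest-first). It is not how the telemetry test runner actually shards. The function name and the per-test durations other than the 448.6656s multitab figure are made up for the example, and only the 16-minute budget comes from this thread.

```python
import heapq

# Per-shard limit mentioned earlier in this thread (16 minutes).
SHARD_BUDGET_SECONDS = 16 * 60


def shard_by_duration(durations, num_shards):
    """Assign tests to shards, longest first, always to the lightest shard.

    `durations` maps test name -> expected seconds (hypothetical input).
    Ties are broken by test name, so the same input always yields the same
    assignment.
    """
    # Heap of (total seconds so far, shard index).
    heap = [(0.0, index) for index in range(num_shards)]
    heapq.heapify(heap)
    shards = [[] for _ in range(num_shards)]

    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in ordered:
        total, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (total + seconds, index))

    totals = [0.0] * num_shards
    for total, index in heap:
        totals[index] = total
    return shards, totals


if __name__ == '__main__':
    # Only the multitab duration is taken from the thread; the rest are
    # illustrative placeholders.
    example = {'multitab': 448.6656, 'a': 300.0, 'b': 300.0, 'c': 200.0}
    shards, totals = shard_by_duration(example, num_shards=2)
    for index, (tests, total) in enumerate(zip(shards, totals)):
        status = 'over budget' if total > SHARD_BUDGET_SECONDS else 'ok'
        print(index, tests, round(total, 1), status)
```

Placing the longest tests first keeps a single slow test such as multitab from landing on an already heavy shard, and the name-based tie-break makes the result reproducible from run to run. It does rely on having reasonably stable duration estimates, which this thread suggests may not always be the case.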
,
Oct 25
,
Oct 25
This is still happening a ton. Are y'all looking into this?
,
Oct 26
nednguyen@: Do you have a plan for getting this fixed? Should the test be disabled meanwhile?
,
Oct 26
I am out of office until Tuesday. Most of the CQ runtime increase is due to the addition of new system health stories. Annie: can you take this bug? I think the easiest fix is to add more bots. We may want to check with John whether that's OK for the Linux platform.
,
Oct 26
Issue 899237 has been merged into this issue.
,
Oct 26
I think we should instead just start disabling tests. They can then wait until you're back, Ned.
,
Oct 26
Disabling system_health.memory_desktop/multitab:misc:typical24:2018 in https://chromium-review.googlesource.com/c/chromium/src/+/1301959
,
Oct 26
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/18a8e7cb1a8b459f1f684ec7d6ea87f0a9714a3f

commit 18a8e7cb1a8b459f1f684ec7d6ea87f0a9714a3f
Author: John Budorick <jbudorick@chromium.org>
Date: Fri Oct 26 17:31:23 2018

Disable multitab system_health smoke test.

7 minute execution time is causing flaky timeouts in telemetry_perf_unittests.

TBR=sullivan@chromium.org,nednguyen@chromium.org
Bug: 893615
Change-Id: I73f3982453b418be59d6b7ed7067c1918c9d3d34
Reviewed-on: https://chromium-review.googlesource.com/c/1301959
Commit-Queue: John Budorick <jbudorick@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#603132}

[modify] https://crrev.com/18a8e7cb1a8b459f1f684ec7d6ea87f0a9714a3f/tools/perf/benchmarks/system_health_smoke_test.py
,
Oct 26
Does disabling the story with the single 7 minute execution time fix the flakiness? If so, let's wait till Ned gets back to discuss further.
,
Oct 26
#23: tbd; https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/75296 is currently running and is the first build w/ #22.
,
Oct 26
Looks like one of the shards still timed out, though it did so after running all 26 tests and then rerunning a few of them? https://chromium-swarm.appspot.com/task?id=40ca91a912d43c10&refresh=10&show_raw=1&wide_logs=true
,
Oct 26
,
Oct 29
Still happening.
,
Oct 30
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/067ae99a6de8b334fba0d2d3224b1f54ee2dc3c0

commit 067ae99a6de8b334fba0d2d3224b1f54ee2dc3c0
Author: Tsuyoshi Horo <horo@chromium.org>
Date: Tue Oct 30 07:56:24 2018

Disable TabStackTraceTest.testBadBreakpadFileIgnored

TBR=nednguyen@chromium.org
Bug: 893615, 820282
Change-Id: Ie63980e2347a11c14855e40f75c8904420e9ffaa
Reviewed-on: https://chromium-review.googlesource.com/c/1306960
Reviewed-by: Tsuyoshi Horo <horo@chromium.org>
Commit-Queue: Tsuyoshi Horo <horo@chromium.org>
Cr-Commit-Position: refs/heads/master@{#603827}

[modify] https://crrev.com/067ae99a6de8b334fba0d2d3224b1f54ee2dc3c0/tools/perf/core/stacktrace_unittest.py
,
Oct 30
,
Jan 14
Reassigning to Caleb for triage--not sure if this is still a problem?
,
Jan 14
,
Jan 15
Looks fixed now.
,
Jan 15
Comment 1 by waff...@chromium.org
, Oct 9