
Issue 893615

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jan 15
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




telemetry_perf_unittests failing on chromium.linux/Linux Tests (dbg)(1)

Project Member Reported by sheriff-...@appspot.gserviceaccount.com, Oct 9

Issue description

Filed by sheriff-o-matic@appspot.gserviceaccount.com on behalf of waffles@google.com

telemetry_perf_unittests failing on chromium.linux/Linux Tests (dbg)(1)

Builders failed on: 
- Linux Tests (dbg)(1): 
  https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29


 
Cc: -waffles@google.com waff...@chromium.org grunell@chromium.org
Labels: -Sheriff-Chromium
Status: Fixed (was: Available)
Fixed by reverting https://chromium-review.googlesource.com/c/chromium/src/+/1264657 , though I'm not sure that change was really the root cause rather than something that just made an existing issue symptomatic.
Cc: nedngu...@google.com
Status: Available (was: Fixed)
telemetry_perf_unittests is still failing. Reopening this bug.

Ned, could you help here?
Cc: crouleau@chromium.org
Owner: nedngu...@google.com
I suspect that this is due to the addition of a new system health story, 'load:news:cnn:2018', which caused timeouts.

For now, I will remove 'load:news:cnn' from smoke testing to reduce the load.
Project Member

Comment 6 by bugdroid1@chromium.org, Oct 12

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/649252b3e7a77b0c276e705a6a76e9139f63d624

commit 649252b3e7a77b0c276e705a6a76e9139f63d624
Author: Ned Nguyen <nednguyen@google.com>
Date: Fri Oct 12 21:14:05 2018

Disable loads:news:cnn story smoke test

I suspect that https://chromium-review.googlesource.com/c/1264642, which added
load:news:cnn:2018, has made us run out of test timing budget on CQ (16 minutes
per shard), causing telemetry_perf_unittests to fail.

Since 'load:news:cnn:2018' is meant to replace 'load:news:cnn', remove smoke
testing of 'load:news:cnn' to reduce test load.

TBR=ulan@chromium.org
NOTRY=true # CQ flake

Bug:  893615 
Change-Id: I8488aecf97cb25ff8ec9832c79bce7b10bc93c8d
Reviewed-on: https://chromium-review.googlesource.com/c/1277692
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#599354}
[modify] https://crrev.com/649252b3e7a77b0c276e705a6a76e9139f63d624/tools/perf/benchmarks/system_health_smoke_test.py
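
For readers unfamiliar with this kind of change, here is a minimal sketch of how a disabled list can keep a story out of smoke testing. It is an illustration under stated assumptions, not the contents of system_health_smoke_test.py: the names DISABLED_SMOKE_STORIES and stories_to_smoke_test are hypothetical.

```python
# Hypothetical sketch: names below are illustrative, not the real Chromium code.
DISABLED_SMOKE_STORIES = frozenset({
    'load:news:cnn',  # superseded by load:news:cnn:2018
})


def stories_to_smoke_test(all_stories, disabled=DISABLED_SMOKE_STORIES):
    """Return the story names that should still get a smoke test, in order."""
    return [name for name in all_stories if name not in disabled]


if __name__ == '__main__':
    stories = ['load:news:cnn', 'load:news:cnn:2018']
    # Only the 2018 story remains, so the shard has one fewer page to load.
    print(stories_to_smoke_test(stories))
```

Each entry removed this way saves one story's runtime from the per-shard budget, which is why the commit targets a story that already has a replacement.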

Project Member

Comment 7 by bugdroid1@chromium.org, Oct 16

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/adf1f0297f4733ac55c72f659f3d387af471705d

commit adf1f0297f4733ac55c72f659f3d387af471705d
Author: Annie Sullivan <sullivan@chromium.org>
Date: Tue Oct 16 01:14:15 2018

Disable smoke testing for load:tools:stackoverflow story

This is deprecated and will be replaced by load:tools:stackoverflow:2018
story. Disabling now because we are hitting CQ capacity.

Bug:  893615 
Change-Id: I29b839a0d8bfd007f36a1d09ca1e52a4620735cf
TBR: nednguyen@google.com
Reviewed-on: https://chromium-review.googlesource.com/c/1280617
Commit-Queue: Annie Sullivan <sullivan@chromium.org>
Reviewed-by: Annie Sullivan <sullivan@chromium.org>
Cr-Commit-Position: refs/heads/master@{#599806}
[modify] https://crrev.com/adf1f0297f4733ac55c72f659f3d387af471705d/tools/perf/benchmarks/system_health_smoke_test.py

Status: Fixed (was: Available)
Thanks Annie for #7! Seems like we're good now :D
Project Member

Comment 11 by bugdroid1@chromium.org, Oct 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1436aab9d34916c3e985b795ab86dce8ccd4c039

commit 1436aab9d34916c3e985b795ab86dce8ccd4c039
Author: Ned Nguyen <nednguyen@google.com>
Date: Fri Oct 19 00:23:18 2018

Disable smoke testing of legacy system health stories which already have 2018 version

This also adds a check to ensure that legacy system health stories which already
have a newer version are disabled within smoke testing.

Finally, this CL also shortens the names used in _DISABLED to simplify the job of
disabling a system health story.

Bug: 878390,  893615 
Change-Id: Ife447c9e5935c960ea7e1818adfcaef0ccaeb59a
Reviewed-on: https://chromium-review.googlesource.com/c/1289686
Reviewed-by: Caleb Rouleau <crouleau@chromium.org>
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#600987}
[modify] https://crrev.com/1436aab9d34916c3e985b795ab86dce8ccd4c039/tools/perf/benchmarks/system_health_smoke_test.py
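
The check described in this commit message could look roughly like the sketch below. It assumes that a newer story is always named '<legacy name>:2018'; the function names and the shape of the disabled set are hypothetical and are not taken from the actual CL.

```python
# Hypothetical sketch of the "legacy stories must be disabled" check.
VERSION_SUFFIX = ':2018'  # assumption: newer stories are named '<legacy>:2018'


def legacy_stories_with_newer_version(story_names):
    """Return legacy story names for which a ':2018' variant also exists."""
    names = set(story_names)
    return sorted(
        name for name in names
        if not name.endswith(VERSION_SUFFIX) and name + VERSION_SUFFIX in names)


def check_legacy_stories_disabled(story_names, disabled):
    """Raise AssertionError if a superseded story is still smoke tested."""
    missing = [name for name in legacy_stories_with_newer_version(story_names)
               if name not in disabled]
    if missing:
        raise AssertionError(
            'Legacy stories with a newer version must be disabled in smoke '
            'testing: ' + ', '.join(missing))


if __name__ == '__main__':
    stories = ['load:news:cnn', 'load:news:cnn:2018', 'load:tools:stackoverflow']
    check_legacy_stories_disabled(stories, disabled={'load:news:cnn'})
```

Running a check like this as part of the test suite means a newly added 2018 story cannot land without its legacy counterpart being removed from smoke testing.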

The root cause seems to be the multitab test taking too long (448.6656s). If the test sharding is unlucky, we end up with a shard that takes more than 16 minutes.
There are two additional problems here:
1) Timeouts are flaky, which means the logic of sharding tests is not deterministic.
2) We don't get well-balanced sharding when one of the shards has timed out.
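
To illustrate how one long test can push a shard over the 16 minute budget, here is a small sketch. Only the 448.6656s duration and the 16 minute budget come from this bug; the other test names and durations are made up, and the greedy longest-first strategy is a generic balancing heuristic, not a description of how the telemetry test runner actually shards.

```python
import heapq

SHARD_BUDGET_SECONDS = 16 * 60  # per-shard budget mentioned in this bug


def shard_totals(shards, durations):
    """Return the total runtime in seconds of each shard."""
    return [sum(durations[name] for name in shard) for shard in shards]


def greedy_shards(durations, num_shards):
    """Assign each test, longest first, to the currently lightest shard."""
    heap = [(0.0, index) for index in range(num_shards)]
    heapq.heapify(heap)
    shards = [[] for _ in range(num_shards)]
    for name in sorted(durations, key=lambda n: (-durations[n], n)):
        total, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (total + durations[name], index))
    return shards


# Hypothetical durations, except for the multitab figure quoted above.
DURATIONS = {
    'multitab': 448.6656,
    'story_a': 300.0, 'story_b': 300.0,
    'story_c': 250.0, 'story_d': 250.0, 'story_e': 200.0,
}

if __name__ == '__main__':
    unlucky = [['multitab', 'story_a', 'story_b'],
               ['story_c', 'story_d', 'story_e']]
    for label, shards in (('unlucky', unlucky),
                          ('greedy', greedy_shards(DURATIONS, 2))):
        worst = max(shard_totals(shards, DURATIONS))
        print(label, round(worst, 1), worst <= SHARD_BUDGET_SECONDS)
```

With these numbers the unlucky split puts more than 16 minutes of work on one shard, while the balanced split fits. Balancing only helps if the recorded durations are stable, which is exactly what flaky timeouts undermine.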
Cc: danakj@chromium.org dalecur...@chromium.org
Issue 898681 has been merged into this issue.
Components: Tests>Telemetry
Labels: Type-Bug
This is still happening a ton. Are y'all looking into this?
nednguyen@: Do you have a plan for getting this fixed? Should the test be disabled meanwhile?
Cc: jbudorick@chromium.org
Owner: sullivan@chromium.org
I am out of office until Tuesday. Most of the CQ runtime increase is due to the addition of new system health stories.

Annie: can you take this bug? I think the easiest fix is to add more bots. You may want to check with John whether that's OK for the Linux platform.
Issue 899237 has been merged into this issue.
I think we should instead just start disabling tests. They can then wait until you're back, Ned.
Disabling system_health.memory_desktop/multitab:misc:typical24:2018 in https://chromium-review.googlesource.com/c/chromium/src/+/1301959
Project Member

Comment 22 by bugdroid1@chromium.org, Oct 26

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/18a8e7cb1a8b459f1f684ec7d6ea87f0a9714a3f

commit 18a8e7cb1a8b459f1f684ec7d6ea87f0a9714a3f
Author: John Budorick <jbudorick@chromium.org>
Date: Fri Oct 26 17:31:23 2018

Disable multitab system_health smoke test.

7 minute execution time is causing flaky timeouts
in telemetry_perf_unittests.

TBR=sullivan@chromium.org,nednguyen@chromium.org

Bug:  893615 
Change-Id: I73f3982453b418be59d6b7ed7067c1918c9d3d34
Reviewed-on: https://chromium-review.googlesource.com/c/1301959
Commit-Queue: John Budorick <jbudorick@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#603132}
[modify] https://crrev.com/18a8e7cb1a8b459f1f684ec7d6ea87f0a9714a3f/tools/perf/benchmarks/system_health_smoke_test.py

Does disabling the story with the single 7 minute execution time fix the flakiness? If so, let's wait till Ned gets back to discuss further.
#23: tbd; https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests%20%28dbg%29%281%29/75296 is currently running and is the first build w/ #22.
Looks like one of the shards still timed out, though it did so after running all 26 tests and then rerunning a few of them? https://chromium-swarm.appspot.com/task?id=40ca91a912d43c10&refresh=10&show_raw=1&wide_logs=true
Cc: -danakj@chromium.org
Still happening.
Project Member

Comment 28 by bugdroid1@chromium.org, Oct 30

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/067ae99a6de8b334fba0d2d3224b1f54ee2dc3c0

commit 067ae99a6de8b334fba0d2d3224b1f54ee2dc3c0
Author: Tsuyoshi Horo <horo@chromium.org>
Date: Tue Oct 30 07:56:24 2018

Disable TabStackTraceTest.testBadBreakpadFileIgnored

TBR=nednguyen@chromium.org

Bug:  893615 ,820282
Change-Id: Ie63980e2347a11c14855e40f75c8904420e9ffaa
Reviewed-on: https://chromium-review.googlesource.com/c/1306960
Reviewed-by: Tsuyoshi Horo <horo@chromium.org>
Commit-Queue: Tsuyoshi Horo <horo@chromium.org>
Cr-Commit-Position: refs/heads/master@{#603827}
[modify] https://crrev.com/067ae99a6de8b334fba0d2d3224b1f54ee2dc3c0/tools/perf/core/stacktrace_unittest.py

Cc: -grunell@chromium.org
Owner: crouleau@chromium.org
Reassigning to Caleb for triage--not sure if this is still a problem?
Cc: -nedngu...@google.com
Looks fixed now.
Status: Fixed (was: Assigned)
