System Health benchmark failures on different bots for different reasons
Issue description

There appear to be a lot of different things causing various system health benchmarks (mobile/desktop, common/memory) to fail on different bots. Starting with this catch-all bug to organize and fix things.

Some of the issues that have been seen so far:
- Benchmarks timing out.
- Data not making it to the perf dashboard.
- Individual stories flaking.
- Individual stories failing.
- Individual stories crashing.
- Individual stories crashing and causing the benchmark to be aborted.
- Logs are huge and very spammy, making it all very hard to diagnose.
,
Nov 15 2016
I have a CL, https://codereview.chromium.org/2504653002/, which cleans up some verbose logging at the end of each telemetry run by switching it from warning/critical/info to debug, so that on local runs we can still enable the logging if we need more information. It's not a very thorough job of cleaning the logs, just the low-hanging fruit that will probably get the least pushback for being removed from each telemetry run. For more logging to be removed, I think we should have a deeper discussion. I don't mind starting that discussion if no one else is already doing so or would be better suited for it.
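To illustrate the idea of demoting end-of-run messages to debug level, here is a minimal sketch using Python's standard `logging` module. It is not the code from the CL: the function names, the logger names, and the message text are all hypothetical and only show why the output disappears on the bots but can still be enabled locally.

```python
import io
import logging


def report_run_summary(logger, story_count):
    """Hypothetical end-of-run summary, logged at DEBUG instead of WARNING."""
    logger.debug("Finished %d stories; detailed teardown info follows.", story_count)


def capture(level):
    """Run the summary under the given log level and return the emitted text."""
    stream = io.StringIO()
    # A separate logger per level keeps the two runs independent (assumption
    # made for this sketch only).
    logger = logging.getLogger("telemetry_sketch_%s" % level)
    logger.handlers = []
    logger.propagate = False
    logger.addHandler(logging.StreamHandler(stream))
    logger.setLevel(level)
    report_run_summary(logger, 3)
    return stream.getvalue()


bot_output = capture(logging.WARNING)   # typical bot setting: message suppressed
local_output = capture(logging.DEBUG)   # local run with verbose logging enabled
```

With the logger at WARNING the summary is dropped entirely, while a local run at DEBUG still sees it.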
,
Nov 15 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/67aaeb3a6e67033a8187de1e863b97ba92d85ed9

commit 67aaeb3a6e67033a8187de1e863b97ba92d85ed9
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Tue Nov 15 21:58:26 2016

Roll src/third_party/catapult/ 54dd0fc86..8621fc142 (4 commits).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/54dd0fc86f01..8621fc142fbb

$ git log 54dd0fc86..8621fc142 --date=short --no-merges --format='%ad %ae %s'
2016-11-15 charliea Kill per frame power metrics
2016-11-15 rnephew [Telemetry] Decrease end of test run logging.
2016-11-15 benjhayden Make valueset2html take MreResults.
2016-11-15 benjhayden Override overview line charts dataRange.

BUG= 664505

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=catapult-sheriff@chromium.org
Review-Url: https://codereview.chromium.org/2497893004
Cr-Commit-Position: refs/heads/master@{#432265}

[modify] https://crrev.com/67aaeb3a6e67033a8187de1e863b97ba92d85ed9/DEPS
,
Dec 6 2016
Issue 668247 has been merged into this issue.
,
Dec 6 2016
The bulk of the remaining failures appear to be from gmail pages.

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4816
[ FAILED ] 2 tests, listed below:
[ FAILED ] long_running:tools:gmail-foreground
[ FAILED ] long_running:tools:gmail-background

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29/builds/4511
[ FAILED ] 1 test, listed below:
[ FAILED ] long_running:tools:gmail-background

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5X%20Perf%20%281%29/builds/3943
[ FAILED ] 1 test, listed below:
[ FAILED ] load:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5X%20Perf%20%282%29/builds/2458
[ FAILED ] 1 test, listed below:
[ FAILED ] long_running:tools:gmail-foreground

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus6%20Perf%20%281%29/builds/4519
[ FAILED ] 1 test, listed below:
[ FAILED ] background:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus6%20Perf%20%282%29/builds/4393
[ FAILED ] 1 test, listed below:
[ FAILED ] load:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus7v2%20Perf%20%281%29/builds/4332
[ FAILED ] 1 test, listed below:
[ FAILED ] background:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus7v2%20Perf%20%282%29/builds/3445
[ FAILED ] 1 test, listed below:
[ FAILED ] background:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus9%20Perf%20%281%29/builds/3798
[ FAILED ] 2 tests, listed below:
[ FAILED ] background:tools:gmail
[ FAILED ] load:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20One%20Perf%20%281%29/builds/4565
[ FAILED ] 1 test, listed below:
[ FAILED ] background:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20One%20Perf%20%282%29/builds/5105
[ FAILED ] 1 test, listed below:
[ FAILED ] background:tools:gmail

This is not an exhaustive list of all the failures in system health mobile.
,
Dec 6 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/ef848ecd23490de2fe546991e8a857ae7b5e8930

commit ef848ecd23490de2fe546991e8a857ae7b5e8930
Author: rnephew <rnephew@chromium.org>
Date: Tue Dec 06 21:26:48 2016

[System Health][Android] Disable failing gmail tests.

Tests are extremely flaky and make up a bulk of the system health mobile failures.

BUG= 664505
Review-Url: https://codereview.chromium.org/2552993003
Cr-Commit-Position: refs/heads/master@{#436734}

[modify] https://crrev.com/ef848ecd23490de2fe546991e8a857ae7b5e8930/tools/perf/page_sets/system_health/background_stories.py
[modify] https://crrev.com/ef848ecd23490de2fe546991e8a857ae7b5e8930/tools/perf/page_sets/system_health/loading_stories.py
[modify] https://crrev.com/ef848ecd23490de2fe546991e8a857ae7b5e8930/tools/perf/page_sets/system_health/long_running_stories.py
,
Dec 6 2016
I did a deeper dive into just the Nexus 5 failures and they look like this over the past 30 runs (skipping runs that failed due to infra issues):

TL;DR:
[ FAILED ] background:tools:gmail - 10
[ FAILED ] blank:about:blank - 1
[ FAILED ] browse:news:nytimes - 6
[ FAILED ] browse:news:qq - 3
[ FAILED ] load:tools:gmail - 9
[ FAILED ] long_running:tools:gmail-background - 8
[ FAILED ] long_running:tools:gmail-foreground - 8

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4817
[ FAILED ] browse:news:nytimes

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4816
[ FAILED ] long_running:tools:gmail-foreground
[ FAILED ] long_running:tools:gmail-background

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4815
[ FAILED ] long_running:tools:gmail-background

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4814
[ FAILED ] browse:news:nytimes

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4813
[ FAILED ] background:tools:gmail
[ FAILED ] long_running:tools:gmail-foreground
[ FAILED ] background:tools:gmail
[ FAILED ] load:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4812
[ FAILED ] long_running:tools:gmail-background

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4811
[ FAILED ] load:tools:gmail
[ FAILED ] browse:news:nytimes

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4810
[ FAILED ] long_running:tools:gmail-background
[ FAILED ] browse:news:qq

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4809
[ FAILED ] load:tools:gmail
[ FAILED ] browse:news:qq

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4808
[ FAILED ] long_running:tools:gmail-foreground

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4807
[ FAILED ] long_running:tools:gmail-foreground
[ FAILED ] browse:news:qq

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4806
[ FAILED ] background:tools:gmail
[ FAILED ] load:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4805
Passed

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4804
[ FAILED ] background:tools:gmail
[ FAILED ] long_running:tools:gmail-foreground
[ FAILED ] background:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4803
[ FAILED ] long_running:tools:gmail-foreground
[ FAILED ] long_running:tools:gmail-foreground
[ FAILED ] browse:news:nytimes

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4802
[ FAILED ] background:tools:gmail
[ FAILED ] long_running:tools:gmail-background

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4801
[ FAILED ] background:tools:gmail
[ FAILED ] browse:news:nytimes

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4800
[ FAILED ] background:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4799
[ FAILED ] background:tools:gmail
[ FAILED ] load:tools:gmail
[ FAILED ] long_running:tools:gmail-foreground

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4798
pass

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4797
[ FAILED ] load:tools:gmail
[ FAILED ] blank:about:blank

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4796
[ FAILED ] load:tools:gmail
[ FAILED ] long_running:tools:gmail-background
[ FAILED ] browse:news:nytimes

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4794
[ FAILED ] background:tools:gmail

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4791
[ FAILED ] load:tools:gmail
[ FAILED ] long_running:tools:gmail-background

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4790
[ FAILED ] load:tools:gmail
[ FAILED ] long_running:tools:gmail-background

https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%281%29/builds/4789
pass
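A per-story tally like the TL;DR in this comment can be produced mechanically from the build logs. The following is a minimal sketch, assuming each build's log is available as a plain string containing `[ FAILED ] <story>` lines. The function name and the sample data layout are hypothetical, and fetching logs from the waterfall is not shown. It counts every `[ FAILED ]` line, so a story listed twice in one build counts twice.

```python
import collections
import re

# Matches "[ FAILED ] <token>"; the spacing inside the brackets may vary.
FAILED_RE = re.compile(r"\[\s*FAILED\s*\]\s+(\S+)")


def tally_failures(build_logs):
    """Count how many times each story appears on a [ FAILED ] line."""
    counts = collections.Counter()
    for log_text in build_logs:
        for name in FAILED_RE.findall(log_text):
            # Summary lines such as "[ FAILED ] 2 tests, listed below:" yield
            # a bare number here; story names always contain a colon.
            if ":" in name:
                counts[name] += 1
    return counts


# Sample input taken from two of the builds listed in this thread.
sample_logs = [
    "[ FAILED ] 2 tests, listed below:\n"
    "[ FAILED ] long_running:tools:gmail-foreground\n"
    "[ FAILED ] long_running:tools:gmail-background\n",
    "[ FAILED ] 1 test, listed below:\n"
    "[ FAILED ] long_running:tools:gmail-background\n",
]
counts = tally_failures(sample_logs)
for story, count in sorted(counts.items()):
    print("[ FAILED ] %s - %d" % (story, count))
```

Sorting by story name keeps the output stable between runs, which makes it easier to compare tallies taken on different days.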
,
Dec 13 2016
Did some data collection of the past 20 runs on all android platforms: https://docs.google.com/a/google.com/spreadsheets/d/1VuwbtHXD9dscFl9I3vsghuRZWjDBwmxRdt1hAWNRuFs/edit?usp=sharing
,
Dec 13 2016
After some hacking and scripting I've got the list of all stories that have failed in a bot for at least two different builds in the past 20 builds (or whatever I could grab from logdog).

[A number indicates number of failures in a single build, a '-' means that the story succeeded all 3 page set repeats, and a '?' means that the story didn't run there.]

## browse:search:google
win-7      33333333333333333-33
win-7-x64  333333333333333333-3
Conclusion: Failing consistently, should be disabled on windows.

## load:tools:gmail
android-nexus7v2                    1---1-1---1------13
android-nexus9                      ---------1---1---111
android-one                         ------11---------11-
health-plan-clankium-low-end-phone  -11---11-------1-1--
health-plan-clankium-phone          ---12---1-11------11
Conclusion: Very flaky, just got (actually) disabled on Android.

Sadly, all the others are stories flaking on just one or two bots, but not in all others. Next step will be to dig through the logs of those failures, and try to figure out if there are any common causes.

## background:news:nytimes
android-nexus5 [ref]  --1--1?--1-1--11

## background:search:google
android-one [ref]  ----?-11??-?--??----

## browse:media:imgur
android-nexus6 [ref]  -----?-1?-11--1-----

## browse:media:youtube
android-nexus7v2  1-----------1--1---

## browse:news:flipboard
android-nexus6        ?--------?--?---1--1
android-nexus6 [ref]  -----?-??-??-1-1----

## browse:news:hackernews
android-one [ref]  ----?---??1?-1??----

## browse:news:nytimes
android-nexus5        -1-?1--1--1-1---
android-nexus6 [ref]  -----1--1----?------

## browse:news:qq
android-nexus5  -?-?--1?1-?--1--

## browse:social:twitter
android-nexus6  ---------1--1-------

## load:media:imgur
android-nexus9 [ref]  1?--?----??----1----

## load:news:bbc
android-nexus6 [ref]  -1--1?--?----?------

## load:news:nytimes
android-nexus9 [ref]  -1-111---?--1-------

## load:news:sohu
android-one [ref]  ----1---11-11-11----

## load:social:instagram
android-nexus9 [ref]  -?--?--11?1---------

Some of the (?) also seem to come from a test failing and the whole of the benchmark bailing out. We'll also need to figure out what is going on there.
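The history strings in this comment follow a simple encoding, which can be sketched as below. This is not the script used to generate the report: the function names are hypothetical, and the input is assumed to be a list with one entry per build, holding the number of failed repeats, or `None` when the story did not run.

```python
def history_string(per_build_failures):
    """Encode one bot/story history: digit = failures, '-' = passed, '?' = not run."""
    symbols = []
    for failures in per_build_failures:
        if failures is None:
            symbols.append("?")       # story did not run in this build
        elif failures == 0:
            symbols.append("-")       # all page set repeats succeeded
        else:
            symbols.append(str(failures))
    return "".join(symbols)


def failed_in_at_least(history, min_builds=2):
    """True if the story failed in at least `min_builds` different builds."""
    return sum(1 for symbol in history if symbol.isdigit()) >= min_builds


flaky = history_string([0, 0, 1, None, 1])
mostly_fine = history_string([0, 0, 0, 0, 1])
reportable = [h for h in (flaky, mostly_fine) if failed_in_at_least(h)]
```

Since each story runs 3 page set repeats, the failure count per build fits in a single character, which keeps the columns aligned across bots.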
,
Dec 13 2016
Hmm. I wonder if there is a way to transform the data into a chrome trace for better visualization.
,
Dec 15 2016
Issue 666807 has been merged into this issue.
,
Dec 16 2016
Issue 675002 has been merged into this issue.
,
Dec 21 2016
Just did a new run of my script, and things are starting to look a lot better. I see the following patterns:

# Failures on android-nexus9
Something was probably broken on this bot, but appears to have recovered now.
android-nexus9  background:news:nytimes   ------------33
android-nexus9  background:search:google  ------------12
android-nexus9  search:portal:google      ------------33

# Flakiness on ref builds
Probably expected since the ref build doesn't have the latest Chrome fixes? When is the next roll of the ref build expected to happen?
android-one [ref]     browse:news:hackernews  1--1?-??---?
android-one [ref]     load:news:sohu          ---11-11---1
android-nexus6 [ref]  browse:media:imgur      ---1---?-1-
android-nexus6 [ref]  browse:news:flipboard   -1----1?--1
android-nexus6 [ref]  load:news:bbc           --1----1---
android-nexus9 [ref]  load:media:flickr       -----1-1------
android-nexus9 [ref]  load:news:nytimes       1---1------1--

# Flaky browse:news:nytimes on android-nexus5
And we're down to a single story on a single bot which perhaps merits further investigation. Filed issue 676315 to track this.
android-nexus5  browse:news:nytimes  1--11-1111-
,
Dec 21 2016
I'm guessing some of Emily's group's work had something to do with bots working and being green. I'm not sure about ref build rolls, but I'm also interested.
,
Dec 22 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/9c0d7ac829bce311c18fa420fdc97917227c0018

commit 9c0d7ac829bce311c18fa420fdc97917227c0018
Author: perezju <perezju@chromium.org>
Date: Thu Dec 22 15:17:11 2016

[system health] Clean up bugs of disabled stories

Some of the disable bugs currently point to "catch all" tracking bugs. Instead, make individual disabled stories point to bugs tracking their specific fix and re-enable.

BUG= 664505
Review-Url: https://codereview.chromium.org/2598693002
Cr-Commit-Position: refs/heads/master@{#440422}

[modify] https://crrev.com/9c0d7ac829bce311c18fa420fdc97917227c0018/tools/perf/page_sets/system_health/background_stories.py
[modify] https://crrev.com/9c0d7ac829bce311c18fa420fdc97917227c0018/tools/perf/page_sets/system_health/browsing_stories.py
[modify] https://crrev.com/9c0d7ac829bce311c18fa420fdc97917227c0018/tools/perf/page_sets/system_health/loading_stories.py
[modify] https://crrev.com/9c0d7ac829bce311c18fa420fdc97917227c0018/tools/perf/page_sets/system_health/long_running_stories.py
,
Feb 6 2017
Hi. I've noticed that all media stories (play:media) have no memory dumps on mac: https://chromeperf.appspot.com/report?sid=74030f6d5104c014563cb69b347462a7d5fb6800911618ab55eceed0a1f37b4f
,
Feb 6 2017
+erikchen, is there a bug for #50?
,
Feb 6 2017
I've filed issue 688995 for that one. Sorry I forgot to update here.
,
Apr 5 2017
Wanted to give a quick update on the situation here.
date              test_runs  failed  infra  missing  story_runs  failed  missing
2017-03-29 09:12      1,000   13.2%   0.5%     6.3%     107,770    0.2%     2.9%
2017-03-30 15:34      1,000   14.2%   0.5%     5.1%     108,455    0.1%     2.9%
2017-03-31 09:55      1,000   16.3%   0.6%     4.0%     109,185    0.2%     2.8%
2017-04-03 09:17      1,000    9.0%   0.1%     1.7%     110,566    0.1%     1.1%
2017-04-04 09:38      1,000    4.4%   0.0%     1.3%     110,536    0.0%     0.6%
2017-04-05 10:18      1,000    2.9%   0.3%     1.0%     110,598    0.0%     0.8%
(Stats for latest 20 builds on all system health config/benchmarks, for non-reference builds.)
I think that, among other things, a lot of that recent drop in failures is due to the fix for issue 691654. So, big shout out again to John for that one!
Mostly we're back to manageable levels. I took over the remaining "blocked on" bugs on Android, to work on fixing/re-enabling the remaining disabled stories.
Several of the remaining disabled stories are blocked on issue 679768 (chrome:tracing fails when the trace is too large).
The rest of the disabled stories are a few scattered desktop ones.
,
Aug 4 2017
This issue was created > 6 months ago. The perf waterfall has changed significantly since then. If this bug is still relevant, please re-open.
,
Aug 4 2017
For what it's worth, I keep looking at the benchmark status (something similar to #57 above) at least once a week, filing bugs when necessary. But I haven't updated this one in particular, and I agree it's probably fine to close. Things are (somewhat) more stable now, but more work has yet to be done. Hoping that having this sort of data from the flakiness dashboard will also give more visibility to problems when they occur and help us keep the waterfall healthier.
Comment 1 by nedngu...@google.com
, Nov 11 2016