19% regression in page_cycler_v2.basic_oopif at 419897:421741
Issue description: See the link to graphs below.
,
Oct 4 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8999731313882796352
,
Oct 5 2016
===== BISECT JOB RESULTS =====
Status: failed

===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N  Good?
chromium@419896  383.378  45.4649  8  good
chromium@421741  445.31   3.98835  8  bad

Bisect job ran on: linux_perf_bisect
Bug ID: 652671

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --also-run-disabled-tests page_cycler_v2.basic_oopif
Test Metric: timeToFirstMeaningfulPaint_avg/pcv1-cold/http___www.cnn.com
Relative Change: 9.73%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/linux_perf_bisect/builds/6742
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8999731313882796352

Not what you expected? We'll investigate and get back to you!
https://chromeperf.appspot.com/bad_bisect?try_job_id=5847304752332800

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Oct 11 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8999076979766099328
,
Oct 11 2016
No ref build, only one bot, very likely not a real regression. If the bisect results don't find anything, we should probably close this.
,
Oct 11 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8999065770792583104
,
Oct 12 2016
===== BISECT JOB RESULTS =====
Status: failed

===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N  Good?
chromium@419896  401.798  167.795  8  good
chromium@421741  449.059  8.67676  8  bad

Bisect job ran on: linux_perf_bisect
Bug ID: 652671

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests page_cycler_v2.basic_oopif
Test Metric: timeToFirstMeaningfulPaint_avg/pcv1-cold/http___www.cnn.com
Relative Change: 3.36%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/linux_perf_bisect/builds/6775
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8999065770792583104

Not what you expected? We'll investigate and get back to you!
https://chromeperf.appspot.com/bad_bisect?try_job_id=5308737177255936

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Oct 12 2016
===== BISECT JOB RESULTS =====
Status: failed

===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N   Good?
chromium@419896  388.8    90.3545  12  good
chromium@421741  449.323  6.5641   18  bad

Bisect job ran on: linux_perf_bisect
Bug ID: 652671

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests page_cycler_v2.basic_oopif
Test Metric: timeToFirstMeaningfulPaint_avg/pcv1-cold/http___www.cnn.com
Relative Change: 4.20%
Score: 0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/linux_perf_bisect/builds/6774
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8999076979766099328

Not what you expected? We'll investigate and get back to you!
https://chromeperf.appspot.com/bad_bisect?try_job_id=5830547115343872

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Dec 2 2016
I don't get it. The bisects are clearly showing a regression, but they won't continue to bisect. Simon - this is an old one, but maybe you can help out.
,
Dec 2 2016
Kinda looks like it actually is trying to bisect, but every revision is failing with errors like (honestly only checked a few, not all of them):
  "http://www.cnn.com": {
    "description": "time to first contentful paint",
    "grouping_keys": {
      "cache_temperature": "pcv1-cold"
    },
    "important": false,
    "improvement_direction": "down",
    "name": "timeToFirstContentfulPaint_max",
    "none_value_reason": "Merging values containing a None value results in a None value. None values: [ScalarValue(http://www.cnn.com, timeToFirstContentfulPaint_max, ms, None, important=False, description=time to first contentful paint, tir_label=pcv1-cold, improvement_direction=down, grouping_keys={'cache_temperature': 'pcv1-cold'}, ScalarValue(http://www.cnn.com, timeToFirstContentfulPaint_max, ms, None, important=False, description=time to first contentful paint, tir_label=pcv1-cold, improvement_direction=down, grouping_keys={'cache_temperature': 'pcv1-cold'}]",
    "page_id": 0,
    "std": null,
    "tir_label": "pcv1-cold",
    "type": "list_of_scalar_values",
    "units": "ms",
    "values": null
  },
Eventually the bisect times out at the 24hr mark.
The revision range was also pretty big, with the previous point logged 6 days prior, so the range was 419897 - 421741. Not sure why no data was being logged during that period: looking at the log of page_cycler_v2.py I didn't see any disables on linux, and poking around on crbug I wasn't able to find anything either.
The posted results are misleading since they don't mention all the failed builds, which needs to be fixed (crbug.com/668540).
,
Dec 6 2016
Ok, so if this is WAI, do we just need to make the benchmark better?
,
Dec 6 2016
Kouhei: any idea why FCP is sometimes None here?
Ben: I would try to bisect on a single story instead of the average value, since a single story is less likely to hit the summarization problem.
,
Dec 6 2016
Oops, sorry for my second comment in #12. We are already bisecting only on the cnn page but still have the None value problem, because we merge the data for x number of runs on the same page.
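The merging behavior described here can be illustrated with a minimal sketch. This is not Telemetry's actual code: `merge_runs` and `merge_runs_skipping_none` are hypothetical names, and the only thing taken from the thread is the rule quoted in `none_value_reason`, that merging values containing a None value results in a None value.

```python
from statistics import mean
from typing import Optional, Sequence


def merge_runs(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average the per-run values for one page (hypothetical helper).

    Mirrors the rule from the error message: if any run produced None,
    the merged value is None, even when the other runs succeeded.
    """
    if not values or any(v is None for v in values):
        return None
    return mean(values)


def merge_runs_skipping_none(values: Sequence[Optional[float]]) -> Optional[float]:
    """Alternative (an assumption, not existing behavior): ignore None runs."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None
```

With runs of `[400.0, None, 420.0]`, the first function returns None while the second returns the mean of the two runs that did report a value, which is why a single bad run is enough to lose the data point for a revision.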
,
Dec 6 2016
I'm still wondering how we can prevent a 24h timeout for this benchmark. Is there a way to fail earlier?
,
Dec 6 2016
Simon - there are a couple of things with this bug:
1. Ned noticed that the bisect run didn't include --story-filter=.
2. Even if that's not the issue, and there's a very large bisect range, it would be better for bisect to say "ok, I can't do a full run on every revision in that range, but what I can do is give the user more information by running this on a couple CLs in the full range, thus decreasing their bisect range in case they want to run this again."
I assume pinpoint helps with #2 - or it could just be automated with some searching algorithm instead of running every CL in the bisect range...
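One way to picture the "searching algorithm" idea is a binary search that tolerates revisions with no usable result and stops after a fixed number of probes, returning whatever narrower range it has found. This is only a sketch under stated assumptions: `narrow_range`, `classify`, and `max_probes` are hypothetical names, and nothing here reflects how the bisect bots or pinpoint are actually implemented.

```python
from typing import Callable, Iterator, Optional, Tuple


def _candidates(good: int, bad: int) -> Iterator[int]:
    """Yield revisions strictly between good and bad, midpoint first,
    then alternating outward from the midpoint."""
    mid = (good + bad) // 2
    yield mid
    for offset in range(1, bad - good):
        for rev in (mid - offset, mid + offset):
            if good < rev < bad:
                yield rev


def narrow_range(
    good: int,
    bad: int,
    classify: Callable[[int], Optional[bool]],
    max_probes: int = 10,
) -> Tuple[int, int]:
    """Return a (good, bad) pair no wider than the input range.

    classify(rev) is a hypothetical callback: True if the revision is good,
    False if it is bad, None if it could not be measured at all.
    """
    probes = 0
    while bad - good > 1:
        for rev in _candidates(good, bad):
            if probes >= max_probes:
                return good, bad  # out of budget: report the range so far
            probes += 1
            verdict = classify(rev)
            if verdict is not None:
                break
        else:
            return good, bad  # nothing in between could be measured
        if verdict:
            good = rev
        else:
            bad = rev
    return good, bad
```

Even when the budget runs out before a single culprit is found, the caller gets back a smaller range to rerun on, which is the outcome the comment asks for. Note that walking outward from the midpoint can itself use many probes when a long run of revisions is unmeasurable.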
,
Dec 6 2016
,
Dec 7 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8993918344481041168
,
Dec 7 2016
Ok, tried rekicking it with --story-filter to see if that helps. There's a bug out to report the failed builds, which would make this a lot less confusing: crbug.com/668540. There might be something smarter we could do: if X attempts to run a CL around a revision fail, bail out and bisect from there instead. At least it might narrow down the range, although eventually it'll still end up testing all those commits. Quick note re: #c19, I was looking at the wrong metric (max vs avg); timeToFirstMeaningfulPaint_avg doesn't even seem to show up in the chartjson for those failing runs.
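The "bail out after X attempts" idea could look roughly like the sketch below: keep measuring revisions, but stop as soon as a configurable number of consecutive revisions yield no value, instead of continuing until a timeout. `measure_with_budget`, `measure`, and `max_consecutive_failures` are hypothetical names used for illustration; this is not the bisect recipe's real interface.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def measure_with_budget(
    revisions: Iterable[int],
    measure: Callable[[int], Optional[float]],
    max_consecutive_failures: int = 5,
) -> Tuple[List[Tuple[int, Optional[float]]], bool]:
    """Measure revisions in order, giving up early on a failing streak.

    measure(rev) is a hypothetical callback returning the metric value,
    or None when the run produced no usable value.
    Returns (results_so_far, bailed_out).
    """
    results: List[Tuple[int, Optional[float]]] = []
    streak = 0
    for rev in revisions:
        value = measure(rev)
        results.append((rev, value))
        if value is None:
            streak += 1
            if streak >= max_consecutive_failures:
                return results, True  # stop early, keep partial results
        else:
            streak = 0  # any successful run resets the streak
    return results, False
```

Returning the partial results together with a flag, rather than raising, keeps the failed revisions visible to the caller, which fits the request in crbug.com/668540 to report failed builds.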
,
Dec 8 2016
===== BISECT JOB RESULTS =====
Status: failed

===== TESTED REVISIONS =====
Revision         Mean     Std Dev  N   Good?
chromium@419896  380.974  296.329  27  good
chromium@420784  N/A      N/A      0   unknown
chromium@420785  N/A      N/A      0   unknown
chromium@420786  N/A      N/A      0   unknown
chromium@420787  N/A      N/A      0   unknown
chromium@420788  N/A      N/A      0   unknown
chromium@420789  N/A      N/A      0   unknown
chromium@420790  N/A      N/A      0   unknown
chromium@420791  N/A      N/A      0   unknown
chromium@420792  N/A      N/A      0   unknown
chromium@420793  N/A      N/A      0   unknown
chromium@420794  N/A      N/A      0   unknown
chromium@420795  N/A      N/A      0   unknown
chromium@420796  N/A      N/A      0   unknown
chromium@420797  N/A      N/A      0   unknown
chromium@420798  N/A      N/A      0   unknown
chromium@420799  N/A      N/A      0   unknown
chromium@420800  N/A      N/A      0   unknown
chromium@420801  N/A      N/A      0   unknown
chromium@420802  N/A      N/A      0   unknown
chromium@420803  N/A      N/A      0   unknown
chromium@420804  N/A      N/A      0   unknown
chromium@420805  N/A      N/A      0   unknown
chromium@420806  N/A      N/A      0   unknown
chromium@420807  N/A      N/A      0   unknown
chromium@420808  N/A      N/A      0   unknown
chromium@420809  N/A      N/A      0   unknown
chromium@420810  N/A      N/A      0   unknown
chromium@420811  N/A      N/A      0   unknown
chromium@420812  N/A      N/A      0   unknown
chromium@420813  N/A      N/A      0   unknown
chromium@420814  N/A      N/A      0   unknown
chromium@420815  N/A      N/A      0   unknown
chromium@420816  N/A      N/A      0   unknown
chromium@420817  N/A      N/A      0   unknown
chromium@420818  N/A      N/A      0   unknown
chromium@420819  N/A      N/A      0   unknown
chromium@420820  N/A      N/A      0   unknown
chromium@420821  N/A      N/A      0   unknown
chromium@420822  N/A      N/A      0   unknown
chromium@420823  N/A      N/A      0   unknown
chromium@420824  N/A      N/A      0   unknown
chromium@420825  N/A      N/A      0   unknown
chromium@420826  N/A      N/A      0   unknown
chromium@420827  N/A      N/A      0   unknown
chromium@420828  N/A      N/A      0   unknown
chromium@420829  N/A      N/A      0   unknown
chromium@420830  N/A      N/A      0   unknown
chromium@420831  N/A      N/A      0   unknown
chromium@420832  N/A      N/A      0   unknown
chromium@420833  N/A      N/A      0   unknown
chromium@420834  N/A      N/A      0   unknown
chromium@420835  N/A      N/A      0   unknown
chromium@420836  N/A      N/A      0   unknown
chromium@420837  N/A      N/A      0   unknown
chromium@420838  N/A      N/A      0   unknown
chromium@420839  N/A      N/A      0   unknown
chromium@420840  N/A      N/A      0   unknown
chromium@420841  N/A      N/A      0   unknown
chromium@420842  N/A      N/A      0   unknown
chromium@420843  N/A      N/A      0   unknown
chromium@420844  N/A      N/A      0   unknown
chromium@420845  N/A      N/A      0   unknown
chromium@420846  N/A      N/A      0   unknown
chromium@420847  N/A      N/A      0   unknown
chromium@420848  N/A      N/A      0   unknown
chromium@420849  N/A      N/A      0   unknown
chromium@420850  N/A      N/A      0   unknown
chromium@420851  N/A      N/A      0   unknown
chromium@420852  N/A      N/A      0   unknown
chromium@420853  N/A      N/A      0   unknown
chromium@420854  N/A      N/A      0   unknown
chromium@421741  446.817  27.7214  18  bad

Bisect job ran on: linux_perf_bisect
Bug ID: 652671

Test Command: src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=http...www.cnn.com page_cycler_v2.basic_oopif
Test Metric: timeToFirstMeaningfulPaint_avg/pcv1-cold/http___www.cnn.com
Relative Change: 17.28%

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/linux_perf_bisect/builds/6906
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8993918344481041168

Not what you expected? We'll investigate and get back to you!
https://chromeperf.appspot.com/bad_bisect?try_job_id=6438380873711616

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Apr 11 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8982620275475757424
,
Aug 16 2017
Looks like we're not going to track this down.
Comment 1 by tdres...@chromium.org, Oct 4 2016