349.3% regression in tab_switching.tough_energy_cases at 454421:454537
Issue description
See the link to graphs below.
,
Mar 7 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8985783018131627456
,
Mar 7 2017
=== Auto-CCing suspected CL author dalecurtis@chromium.org ===
Hi dalecurtis@chromium.org, the bisect results pointed to your CL, please take a look at the results.
=== BISECT JOB RESULTS ===
Perf regression found with culprit
Suspected Commit
  Author  : dalecurtis
  Commit  : f4a6dbf44108f1855faf5a78d4fb06c349ebd70d
  Date    : Fri Mar 03 00:27:26 2017
  Subject : Replace FFmpegDemuxer thread per element with base::TaskScheduler.
Bisect Details
  Configuration: mac_10_12_perf_bisect
  Benchmark    : tab_switching.tough_energy_cases
  Metric       : idle_wakeups_total/idle_wakeups_total
  Change       : 213.46% | 948.833333333 -> 2974.16666667
Revision          Result                N
chromium@454420   948.833 +- 49.5866    6   good
chromium@454450   954.667 +- 48.0971    6   good
chromium@454451   1009.17 +- 294.918    6   good
chromium@454452   2916.5 +- 281.818     6   bad   <--
chromium@454454   2963.33 +- 240.494    6   bad
chromium@454458   2962.17 +- 112.164    6   bad
chromium@454465   2964.83 +- 25.1959    6   bad
chromium@454479   2662.33 +- 1616.82    6   bad
chromium@454537   2974.17 +- 187.944    6   bad
To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests tab_switching.tough_energy_cases
Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8985783018131627456
Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5883629705625600
Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq for more information addressing perf regression bugs. For feedback, file a bug with component Speed>Bisection. Thank you!
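For anyone double-checking the bisect numbers by hand, here is a minimal sketch that recomputes the reported change and locates the first bad revision from the per-revision means listed above. The `Sample` class, the helper names, and the midpoint threshold are assumptions made for illustration only; this is not how the bisect tool itself classifies revisions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    revision: int  # chromium@ revision number
    mean: float    # mean idle_wakeups_total over N=6 runs
    std: float     # the "+-" value from the bisect table


# Values copied from the bisect results in this issue.
SAMPLES = [
    Sample(454420, 948.833, 49.5866),
    Sample(454450, 954.667, 48.0971),
    Sample(454451, 1009.17, 294.918),
    Sample(454452, 2916.5, 281.818),
    Sample(454454, 2963.33, 240.494),
    Sample(454458, 2962.17, 112.164),
    Sample(454465, 2964.83, 25.1959),
    Sample(454479, 2662.33, 1616.82),
    Sample(454537, 2974.17, 187.944),
]


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def first_bad(samples: List[Sample], threshold: float) -> Optional[int]:
    """Return the first revision whose mean exceeds `threshold`, if any."""
    for s in samples:
        if s.mean > threshold:
            return s.revision
    return None


# Endpoints as printed on the "Change" line of the bisect output.
good = 948.833333333
bad = 2974.16666667

change = percent_change(good, bad)
# Hypothetical classification rule: halfway between the two endpoints.
threshold = (good + bad) / 2
culprit = first_bad(SAMPLES, threshold)

print(f"change: {change:.2f}%")
print(f"first bad revision: chromium@{culprit}")
```

With the endpoints from the "Change" line this reproduces the 213.46% figure, and the midpoint rule lands on chromium@454452, the revision the bisect marked with `<--`.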
,
Mar 7 2017
,
Mar 7 2017
,
Mar 7 2017
Hmmm, for some reason the multitab story doesn't seem to be running for system health. nednguyen@, any idea why? It also doesn't seem to be running for the trace import duration metric, so this doesn't seem exclusive to power. https://chromeperf.appspot.com/report?sid=b2c43b00f191ea7204e3823c82287f6a498ec53a08c30e988ccdd6747045eb49&rev=454474
,
Mar 7 2017
I synced to dcurtis@'s CL and created a battor.tough_energy_cases benchmark locally. I then started a perf try job for that benchmark at the CL (https://codereview.chromium.org/2735633007/). I then reverted the change, uploaded a new CL, and started another perf try job for the same benchmark at that CL (https://codereview.chromium.org/2738503005). I *think* this should allow us to see BattOr results for tab_switching.tough_energy_cases.
**NOTE** On further investigation, I'm not actually sure how much this is going to turn up. It looks like the tab switching tests don't cleanly separate the story (which would live in the page set) from the measurement: instead, the tab_switching measurement itself is responsible for waiting, activating tabs, etc. I'm still going to wait and see what this turns up.
,
Mar 7 2017
Hmm, this is surprising. +gab in case it's TaskScheduler related. One possibility is that this page set is getting further along than it did previously because the page is able to start loading sooner and actually decode some video. We've seen a couple of other metrics around time to seek and playback go down because of that change, which indicates a faster startup. If the above is true, I'd expect local runs on a beefy computer not to have changed much, with the bots only hitting it because of their slower general performance.
,
Mar 7 2017
,
Mar 7 2017
charliea@ looks like those runs completed, but I don't see a link to the results html in the buildbot listings. Can you link into the results?
,
Mar 7 2017
Trace before Dale's change: https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/trace-file-id_13-2017-03-07_09-26-50-21913.html
Trace after Dale's change: https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/trace-file-id_13-2017-03-07_09-30-14-40697.html
I only ran with a pageset repeat of 1, but the initial results seem to suggest that dcurtis's change LOWERED power, not raised it. I want to kick off another run with a higher repeat count to make sure that we're not seeing some aberration, though.
,
Mar 7 2017
,
Mar 7 2017
FWIW, our own internal test lab isn't showing any power improvements or regressions: https://av-analysis.corp.google.com/#/custom-page/2/VideoStack/11
charliea@ were you going to kick off another run? I don't have a BattOr, though I may be able to borrow Caleb's.
,
Mar 8 2017
Despite idle wakeups increasing, the energy_consumption_mwh is unchanged: https://chromeperf.appspot.com/report?sid=cdea5c7187aa7830ec1a72fcca5c2608acd223202029fb0c35f94af5f133d850&rev=454474
Is there another real world metric we expect idle_wakeups to influence?
,
Mar 8 2017
Bruce, Erik: this is the case of idle_wakeups regressing but we are seeing energy_consumption unchanged.
,
Mar 8 2017
I guess it means that base::TaskScheduler is more power efficient than FFmpegDemuxer despite doing way more context switches. It would be great to understand why. I may need to try this test on Windows to see. Thanks for sharing.
,
Mar 8 2017
It looks like the cpu_utilization metric also hasn't really changed, so I'm really surprised that this didn't regress energy_consumption. Maybe we just have a very poor understanding of the relationship between idle_wakeups and power?
,
Mar 8 2017
Looks like the run from yesterday failed (due to bug 699581). I just kicked off two more runs (one with the change, one without) with --story-repeat=5, which should give us a little more data to work with.
,
Mar 8 2017
The usual assumption is that the energy cost of a context switch depends heavily on when it occurs. If a context switch wakes up a CPU from a deep nap then the cost is high. If the CPU was bouncing between different tasks anyway then the cost will be much lower. Avoiding context switches is probably always a good idea because their cost is variable (depending on machine load and CPU configuration, I would guess) but this is a good reminder that they are not always measurably expensive.
,
Mar 10 2017
charliea@ how did those more recent runs look? I still don't know how you got those trace URLs from the Buildbot log :)
,
Mar 27 2017
Seems like we have smoke but no fire here. Anyone mind if we just close this WontFix?
Comment 1 by kraynov@chromium.org
, Mar 7 2017