9.2%-15.1% regression in blink_perf.canvas at 426024:426105
Issue description: See the link to graphs below.
Oct 20 2016
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8998264322439556736
Oct 20 2016
=== Auto-CCing suspected CL author chrisha@chromium.org ===
Hi chrisha@chromium.org, the bisect results pointed to your CL below as possibly causing a regression. Please have a look at this info and see whether your CL could be related.

===== BISECT JOB RESULTS =====
Status: completed

===== SUSPECTED CL(s) =====
Subject: Make module enumeration a low priority background task.
Author: chrisha
Commit description: This makes the default mode of module enumeration one that is performed slowly and continuously in the blocking pool post startup. This addresses the CPU/battery regression it recently introduced. This also moves definitions for some preexisting privately defined histograms to the public repository.
BUG=644258
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng
Review-Url: https://codereview.chromium.org/2420133002
Cr-Commit-Position: refs/heads/master@{#426063}
Commit: 6949a31c3ccf416a0e3423bb336e61165fcf5388
Date: Tue Oct 18 21:52:58 2016

===== TESTED REVISIONS =====
Revision          Mean     Std Dev    N   Good?
chromium@426055   18.8247  0.668965   18  good
chromium@426062   18.7096  0.185573   8   good
chromium@426063   18.47    0.124402   8   bad  <--
chromium@426064   18.4759  0.082128   8   bad
chromium@426066   18.4101  0.0878321  5   bad
chromium@426069   18.4002  0.197983   18  bad

Bisect job ran on: winx64ati_perf_bisect
Bug ID: 657899
Test Command: .\src\out\Release_x64\performance_browser_tests.exe --test-launcher-print-test-stdio=always --enable-gpu
Test Metric: TabCapturePerformance_novsync_webrtc/Capture
Relative Change: 2.22%
Score: 99.0
Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/winx64ati_perf_bisect/builds/1622
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8998264322439556736

Not what you expected? We'll investigate and get back to you!
https://chromeperf.appspot.com/bad_bisect?try_job_id=5850457409323008
Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq for more information about addressing perf regression bugs. For feedback, file a bug with component Tests>AutoBisect. Thank you!
Oct 24 2016
This is very likely a legitimate regression. The CL was motivated as a fix to another regression, which was seeing increased power/CPU usage at startup. The fix spread the low-priority work out over a number of minutes, which has presumably shifted it so that it now overlaps with the work performed by these benchmarks. I'm not sure what the right response is here. The work being done is unavoidable and is a non-trivial amount of computation. It amounts to something that runs once per browser launch, during the first few minutes after startup. I could delay it even further, or stretch it out even more, but the work will still have to be done. Thoughts? (See bug #644258 for much more context.)
Oct 24 2016
+jschuh, nduca, nednguyen, charliea

We have hundreds of test cases that start the browser and measure some specific type of performance metric. The vast majority assume the browser has already started; they are not testing startup. So we'd really like a way to turn off things like this for the test.

jschuh, nduca: overall I see things like this interrupting our perf tests (another example is bug 588745). We can add flags to turn them off one by one, but I'm pretty worried that there doesn't seem to be a sane way to measure their total effect, especially if we chunk work off to happen after startup so that it is hidden from the startup benchmark. What should we be doing here?
Oct 24 2016
To #5: I don't understand your first paragraph. It seems to me that what's happening here is that chrisha@ pushed an action that used to execute during startup to after startup, hence it causes a regression in this benchmark, which is about what happens after startup.
Oct 25 2016
The action is a "perform this once during a browser process lifetime" type of thing. The decision was to put it after startup using PostAfterStartupTask, because it would be pointless for it to slow down the startup experience. This means it typically runs some number of seconds after startup.

This overlaps with the time period when the "battor.power_cases:about_blank:energy_per_frame_avg on win-high-dpi" perf test runs, and was causing an 8% regression. To solve that I split the work up into many small units (~100) that are executed in the blocking pool, spaced seconds apart. The total amount of work didn't change, but spacing it out avoids an intense burst of CPU/disk/battery activity. I was actually considering doing that even before the perf test regression, as I think it's less painful for the user overall.

Spreading it out is now tickling another perf regression, and I'm not sure how to proceed. This is one of many things with the same behaviour, and I think sullivan@ is rightly questioning how we should uniformly handle jobs of this type: isolate them so they don't affect benchmarks, but also quantify them so we don't let them get out of hand.
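The splitting scheme described above can be sketched roughly as follows. This is a minimal stand-alone Python sketch, not Chromium's actual blocking-pool API: the function name, the unit count, and the spacing value are all illustrative.

```python
import time

def run_spread_out(units, spacing_s, do_unit):
    """Run a one-time job as many small units with idle gaps between
    them. The total amount of work is unchanged, but it never shows up
    as one intense burst of CPU/disk activity (illustrative stand-in
    for posting ~100 small tasks to a blocking pool, spaced apart)."""
    for unit in units:
        do_unit(unit)
        time.sleep(spacing_s)  # yield between units instead of running flat out

# Example: 5 tiny units with a 10 ms gap; in the CL under discussion it
# would be on the order of 100 units spaced seconds apart.
processed = []
run_spread_out(range(5), 0.01, processed.append)
print(processed)
```

The trade-off this illustrates is exactly the one in the comment: spreading the units out eliminates the startup burst, but stretches the job across a window that other benchmarks may now overlap with.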
Oct 26 2016
IMO we need *some* perf test that tracks how the browser performs under normal circumstances (without any flag tampering). The idea that we might, for example, start enumerating these modules twice and not have any performance tests to capture that is terrifying to me. However, I don't think that we necessarily need all or most performance tests to capture this, at least for features like this that we're reasonably sure will have an isolated impact. I wonder if the right thing to do here is to flag disable the performance-impacting feature for all performance tests except one, which has the explicit purpose of regression testing the performance impact of that feature.
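The proposal above could be realized along these lines. This is a hedged sketch only: the switch name and benchmark names below are invented for illustration and are not real Chromium flags.

```python
def browser_args_for(benchmark, guard_benchmark="module_enumeration_cost"):
    """Build browser launch flags for a perf benchmark: disable the
    background feature everywhere except in one dedicated benchmark
    whose explicit purpose is to regression-test that feature's cost.
    The --disable-background-module-enumeration switch is hypothetical."""
    args = ["--enable-gpu"]
    if benchmark != guard_benchmark:
        args.append("--disable-background-module-enumeration")  # hypothetical switch
    return args

# Ordinary benchmarks run with the feature off; the guard benchmark
# leaves it on so its cost stays visible on exactly one graph.
print(browser_args_for("blink_perf.canvas"))
print(browser_args_for("module_enumeration_cost"))
```

The design choice here is to trade broad coverage for attributability: a regression in the feature's own cost shows up on one dedicated graph instead of as noise across hundreds of unrelated benchmarks.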
Jan 17 2017
Feb 1 2017
IMO this is either a Fixed or a WontFix. Fixed in the sense that the work has now been spread out in time so as to minimize the impact at any point in time. WontFix in the sense that the work actually does need to occur, so any regression that has occurred is a known and accepted new cost. I think it makes sense to leave this running in the existing benchmarks, but to simply move the bar to reflect the new status quo. charliea@: Opinions as to how to move this forward?
Aug 16 2017
Marking this fixed per #10.
Comment 1 by pras...@chromium.org, Oct 20 2016