Issue 657899

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

9.2%-15.1% regression in blink_perf.canvas at 426024:426105

Reported by pras...@chromium.org (Project Member), Oct 20 2016

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=657899

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICg7dPAvQoM,agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICg7eX5tAkM,agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICgnfbYsQkM


Bot(s) for this bug's original alert(s):

chromium-rel-win7-gpu-ati
chromium-rel-win7-gpu-nvidia
Comment 3 by 42576172...@developer.gserviceaccount.com (Project Member), Oct 20 2016

Cc: chrisha@chromium.org
Owner: chrisha@chromium.org

=== Auto-CCing suspected CL author chrisha@chromium.org ===

Hi chrisha@chromium.org, the bisect results pointed to your CL below as possibly
causing a regression. Please take a look at this info and see whether
your CL could be related.


===== BISECT JOB RESULTS =====
Status: completed


===== SUSPECTED CL(s) =====
Subject : Make module enumeration a low priority background task.
Author  : chrisha
Commit description:
  
This makes the default mode of module enumeration one that is performed slowly and continuously in the blocking pool after startup. This addresses the CPU/battery regression that the enumeration recently introduced.

This also moves definitions for some preexisting privately defined histograms to the public repository.

BUG=644258
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng

Review-Url: https://codereview.chromium.org/2420133002
Cr-Commit-Position: refs/heads/master@{#426063}
Commit  : 6949a31c3ccf416a0e3423bb336e61165fcf5388
Date    : Tue Oct 18 21:52:58 2016


===== TESTED REVISIONS =====
Revision         Mean     Std Dev    N   Good?
chromium@426055  18.8247  0.668965   18  good
chromium@426062  18.7096  0.185573   8   good
chromium@426063  18.47    0.124402   8   bad    <--
chromium@426064  18.4759  0.082128   8   bad
chromium@426066  18.4101  0.0878321  5   bad
chromium@426069  18.4002  0.197983   18  bad

Bisect job ran on: winx64ati_perf_bisect
Bug ID: 657899

Test Command: .\src\out\Release_x64\performance_browser_tests.exe --test-launcher-print-test-stdio=always --enable-gpu
Test Metric: TabCapturePerformance_novsync_webrtc/Capture
Relative Change: 2.22%
Score: 99.0

Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/winx64ati_perf_bisect/builds/1622
Job details: https://chromeperf.appspot.com/buildbucket_job_status/8998264322439556736


Not what you expected? We'll investigate and get back to you!
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5850457409323008

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information on addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
This is very likely a legitimate regression. The CL was motivated as a fix to another regression, which was seeing increased power/CPU usage at startup. The fix instead spread out the low priority work over a number of minutes. This has presumably shifted the work to overlapping with the work performed by these benchmarks.

I'm not sure what the right response is here. The work that is being done is unavoidable, and is a non-trivial amount of computation. It amounts to something that runs once per browser launch, during the first few minutes after startup. I could delay this even further, or stretch it out even more, but the work will still have to be done.

Thoughts? (See bug #644258 for much more context.)
Cc: nduca@chromium.org jsc...@chromium.org nednguyen@chromium.org charliea@chromium.org
+jschuh, nduca, nednguyen, charliea

We have hundreds of test cases that start the browser and measure some specific type of performance metric. For the vast majority, the assumption is that the browser has already started; we are not testing startup. So we'd really like a way to turn off things like this for the tests.

jschuh, nduca: overall I see things like this interrupting our perf tests (another example is bug 588745). We can add flags to turn them off one by one, but I'm pretty worried that there doesn't seem to be a sane way to measure their total effect, especially if we chunk off work to happen after startup so it's hidden from the startup benchmark. What should we be doing here?
To #5: I don't understand your first paragraph. It seems to me that what's happening here is chrisha@ pushed an action that used to execute during startup to after startup, hence causing a regression in this benchmark, which is about what happens after startup.
The action is a "perform this once during a browser process lifetime" type of thing. The decision was to put it after startup using PostAfterStartupTask because it would be pointless for it to slow down the startup experience.

This means it typically runs some number of seconds after startup. This overlaps with the time period when the "battor.power_cases:about_blank:energy_per_frame_avg on win-high-dpi" perf test runs, and was causing an 8% regression. To solve that, I split the work up into many small units (~100) that are executed in the blocking pool, spaced seconds apart. The total amount of work didn't change, but spacing it out avoids an intense burst of CPU/disk/battery activity. I was actually considering doing that even before the perf test regression, as I think it's less painful for the user overall.

Spreading it out is now tickling another perf regression, and I'm not sure how to proceed. This is one of many things with the same behaviour, and I think sullivan@ is rightly questioning how we should uniformly handle jobs of this type: isolating them so they don't affect benchmarks, while also quantifying them so we don't let them get out of hand.
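
For illustration, the pattern described above amounts to a task that performs one small unit of work per invocation and then waits before doing the next, so the total work is unchanged but never bursts. Below is a minimal standalone C++ sketch of that pattern; it is not the actual CL: a plain thread stands in for Chromium's blocking pool and PostAfterStartupTask, and EnumerateOneModule, kNumChunks, and kChunkDelay are illustrative names.

// Standalone sketch only, NOT the actual CL. A plain thread stands in
// for Chromium's blocking pool / PostAfterStartupTask, and the names
// EnumerateOneModule, kNumChunks, and kChunkDelay are illustrative.
#include <chrono>
#include <cstdio>
#include <thread>

namespace {

constexpr int kNumChunks = 100;  // ~100 small units of work.
constexpr std::chrono::milliseconds kChunkDelay{100};  // Real code: seconds.

// Stand-in for one small, bounded unit of module enumeration.
void EnumerateOneModule(int index) {
  std::printf("enumerated module chunk %d/%d\n", index + 1, kNumChunks);
}

}  // namespace

int main() {
  // Kick the work off "after startup". Each iteration performs one chunk
  // and then sleeps, so the same total work is done without a single
  // intense burst of CPU/disk/battery activity.
  std::thread worker([] {
    for (int i = 0; i < kNumChunks; ++i) {
      EnumerateOneModule(i);
      std::this_thread::sleep_for(kChunkDelay);
    }
  });
  worker.join();
  return 0;
}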
IMO we need *some* perf test that tracks how the browser performs under normal circumstances (without any flag tampering). The idea that we might, for example, start enumerating these modules twice and not have any performance tests to capture that is terrifying to me.

However, I don't think that we necessarily need all or most performance tests to capture this, at least for features like this that we're reasonably sure will have an isolated impact.

I wonder if the right thing to do here is to flag-disable the performance-impacting feature for all performance tests except one, which has the explicit purpose of regression-testing the performance impact of that feature.
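
As a sketch of that idea (hypothetical, not an actual Chromium switch: --disable-background-module-enumeration and the helpers below are made-up names; real code would consult base::CommandLine rather than raw argv):

// Standalone sketch of flag-disabling the feature for benchmarks. The
// switch --disable-background-module-enumeration is hypothetical; real
// Chromium code would check a switch via base::CommandLine, not argv.
#include <cstdio>
#include <cstring>

namespace {

bool HasSwitch(int argc, char** argv, const char* name) {
  for (int i = 1; i < argc; ++i) {
    if (std::strcmp(argv[i], name) == 0)
      return true;
  }
  return false;
}

// Stand-in for scheduling the chunked post-startup enumeration.
void ScheduleBackgroundModuleEnumeration() {
  std::printf("module enumeration scheduled\n");
}

}  // namespace

int main(int argc, char** argv) {
  // Every perf benchmark except one dedicated regression test would pass
  // the switch, so the feature's cost stays visible in exactly one place.
  if (HasSwitch(argc, argv, "--disable-background-module-enumeration")) {
    std::printf("module enumeration disabled for this run\n");
    return 0;
  }
  ScheduleBackgroundModuleEnumeration();
  return 0;
}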
Status: Assigned (was: Untriaged)
IMO this is either a Fixed or a WontFix. Fixed in the sense that the work has now been spread out so as to minimize the impact at any single point in time. WontFix in the sense that the work actually does need to occur, so any regression that has occurred is a known and accepted new cost.

I think it makes sense to leave this running in the existing benchmarks, but to simply move the bar to reflect the new status quo. charliea@: any opinions on how to move this forward?
Status: Fixed (was: Assigned)
Marking this fixed per #10.
