
5.7%-384.8% regression in media.desktop at 548411:548617

Project Member Reported by chcunningham@chromium.org, Apr 9 2018

Issue description

Lots of CPU time alerts for windows bots.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=830825

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=8b4983e21b362d17eb8d0ab4318775a61515db432e59836b54cfb8fd5fc95aa1


Bot(s) for this bug's original alert(s):

chromium-rel-win10
chromium-rel-win7-dual
chromium-rel-win7-gpu-ati
chromium-rel-win7-gpu-intel
chromium-rel-win7-gpu-nvidia
chromium-rel-win7-x64-dual
chromium-rel-win8-dual
Kicked off a few new bisects. The ref graphs held steady for these and the regression looks to be sustained and significant. Pinpoint should have no trouble...
Project Member

Comment 8 by 42576172...@developer.gserviceaccount.com, Apr 10 2018

Cc: pmonette@chromium.org rkaplow@chromium.org taku...@chromium.org
Owner: pmonette@chromium.org
Status: Assigned (was: Untriaged)
📍 Found significant differences after each of 2 commits.
https://pinpoint-dot-chromeperf.appspot.com/job/16de2f0cc40000

Disable test doubletap-to-jump-forwards-too-short on Win7 by takumif@chromium.org
https://chromium.googlesource.com/chromium/src/+/9e405e24c682904c03fcf620c388fb004b60ec0b

Adding fieldtrial config for the IncompatibleApplicationsWarning feature by pmonette@chromium.org
https://chromium.googlesource.com/chromium/src/+/7a86e22e588d7941bde54bdb4a8c58c23e502836

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Project Member

Comment 9 by 42576172...@developer.gserviceaccount.com, Apr 11 2018

Cc: bajones@chromium.org holte@chromium.org vollick@chromium.org mbarbe...@chromium.org olivierrobin@chromium.org klausw@chromium.org engedy@chromium.org agl@chromium.org billorr@chromium.org
Owner: klausw@chromium.org
📍 Found significant differences after each of 5 commits.
https://pinpoint-dot-chromeperf.appspot.com/job/14bbeb02c40000

webauthn: add test for oversized credential IDs. by agl@chromium.org
https://chromium.googlesource.com/chromium/src/+/ce1605967091027c08692ee570366d8b808a72fb

Adding fieldtrial config for the IncompatibleApplicationsWarning feature by pmonette@chromium.org
https://chromium.googlesource.com/chromium/src/+/7a86e22e588d7941bde54bdb4a8c58c23e502836

Make --force-enable-metrics to whitelist all entries (for local debugging) by rkaplow@chromium.org
https://chromium.googlesource.com/chromium/src/+/8b44ca49e25358bfb6a562706c6cf3bbaf56a52e

Add SubmitFrameMissing mojo call for WebVR/WebXR by klausw@chromium.org
https://chromium.googlesource.com/chromium/src/+/f5bbb138786c2236059c0fa28eaecf1e9f1d4a9e

Revert "Add SubmitFrameMissing mojo call for WebVR/WebXR" by klausw@chromium.org
https://chromium.googlesource.com/chromium/src/+/4029d625bb7290835ef25c56b12275f6f2294bcf

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Project Member

Comment 10 by 42576172...@developer.gserviceaccount.com, Apr 11 2018

Owner: pmonette@chromium.org
📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/14b1768cc40000

Adding fieldtrial config for the IncompatibleApplicationsWarning feature by pmonette@chromium.org
https://chromium.googlesource.com/chromium/src/+/7a86e22e588d7941bde54bdb4a8c58c23e502836

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Project Member

Comment 11 by 42576172...@developer.gserviceaccount.com, Apr 11 2018

Cc: chrishtr@chromium.org kenrb@chromium.org
Owner: klausw@chromium.org
📍 Found significant differences after each of 4 commits.
https://pinpoint-dot-chromeperf.appspot.com/job/13e37d94c40000

Adding fieldtrial config for the IncompatibleApplicationsWarning feature by pmonette@chromium.org
https://chromium.googlesource.com/chromium/src/+/7a86e22e588d7941bde54bdb4a8c58c23e502836

[OOPIF] Early-out before dereferencing a potentially null LayoutObject. by chrishtr@chromium.org
https://chromium.googlesource.com/chromium/src/+/1645730cb6774130186146fb9588a039e8bd3acd

Add SubmitFrameMissing mojo call for WebVR/WebXR by klausw@chromium.org
https://chromium.googlesource.com/chromium/src/+/f5bbb138786c2236059c0fa28eaecf1e9f1d4a9e

Revert "Add SubmitFrameMissing mojo call for WebVR/WebXR" by klausw@chromium.org
https://chromium.googlesource.com/chromium/src/+/4029d625bb7290835ef25c56b12275f6f2294bcf

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Hi new folks - PTAL. The initial alert has 200 graphs that show a clear and sustained regression.

The pinpoint graphs aren't very clear - certainly not as stark as the graphs in comment 1. If you're blamed and your change is not possibly a factor, please say so ASAP. 

I'm especially suspicious of blaming *both* "Add SubmitFrameMissing mojo call for WebVR/WebXR" and the Revert of that same CL. Klausw@, lmk if I'm missing something. 


Cc: dalecur...@chromium.org crouleau@chromium.org
Dale, Caleb. FYI
^^^ this investigates whether a revert of pmonette@'s CL fixes the issue. That CL has been more clearly blamed for a win8 cpu regression in  Issue 830829 .
Project Member

Comment 16 by 42576172...@developer.gserviceaccount.com, Apr 13 2018

📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/160cbefcc40000

Revert "Adding fieldtrial config for the IncompatibleApplicationsWarning feature" by chcunningham@chromium.org
https://chromium-review.googlesource.com/c/chromium/src/+/1011208/1

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Owner: ----
Status: Available (was: Assigned)
I really don't see how this could be related to my mojo SubmitFrameMissing CL which was fully reverted after just a few minutes due to a build break, so I interpreted the range containing both as a no-op. Even if it were in, that code is only active during WebVR/WebXR presentation.
Owner: pmonette@chromium.org
Thanks klausw@ - agree. 

Assigning to pmonette@ - Comment 16 shows a significant improvement if we revert pmonette@'s CL. 
Labels: -Pri-2 Pri-1
Bumping to P1 given the breadth of the regression.
Status: Assigned (was: Available)
Cc: primiano@chromium.org
 Issue 830693  has been merged into this issue.
Just a note that this only affects dev & canary, where the experiment is active. Given this, I don't think assigning priority 1 to this issue is warranted.
There is no plan to enable the experiment to larger channels before addressing this issue.
Labels: -Pri-1 Pri-2
> There is no plan to enable the experiment to larger channels before addressing this issue.

Ack - p2 seems fair. 
Sorry for not responding earlier, I had a lot of similar bugs assigned to me and this one slipped through the cracks.

Increased CPU time is totally expected from my feature. But the extra work is scheduled on a background thread and should not interfere with foreground work.

In addition, the extra work I am doing is a one-time occurrence that happens shortly after startup. A series of tasks is posted sequentially, one for each loaded DLL in the process (InspectModule() in module_info_win.h), and each task takes less than a second. That may seem long, but keep in mind that this runs at background priority. Our metrics show that the median number of loaded DLLs for users in the wild is 120.

So unless we see more perf regressions that don't involve CPU time, this is working as intended.
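The posting pattern described above (one sequential task per loaded DLL, run at background priority) can be sketched roughly like this. `SequencedBackgroundRunner` and the body of `InspectModule()` are illustrative stand-ins for Chromium's real task scheduler and inspection code, not the actual implementation:

```cpp
#include <functional>
#include <queue>
#include <string>
#include <vector>

// Stand-in for a background-priority sequenced task runner: tasks are
// queued and executed one at a time, so each DLL inspection only starts
// after the previous one has finished.
class SequencedBackgroundRunner {
 public:
  void PostTask(std::function<void()> task) { queue_.push(std::move(task)); }
  // In the real scheduler the queue is drained at background priority;
  // here we drain it synchronously for illustration.
  void RunAll() {
    while (!queue_.empty()) {
      queue_.front()();
      queue_.pop();
    }
  }

 private:
  std::queue<std::function<void()>> queue_;
};

// Hypothetical InspectModule(): the real one validates the DLL's
// certificate, which is the expensive part.
std::string InspectModule(const std::string& dll_path) {
  return "inspected:" + dll_path;
}

// Post one task per loaded DLL, mirroring the pattern described above.
std::vector<std::string> InspectAllModules(
    const std::vector<std::string>& loaded_dlls) {
  SequencedBackgroundRunner runner;
  std::vector<std::string> results;
  for (const auto& dll : loaded_dlls) {
    runner.PostTask([&results, dll] { results.push_back(InspectModule(dll)); });
  }
  runner.RunAll();
  return results;
}
```

With ~120 modules and under a second per task, this queue is what accounts for the sustained CPU-time bump the bots are seeing.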
Sorry for my slow reply. 

Caleb, Dale: IMO analysis in comment 25 seems reasonable. From the description here*, cpu_time_percentage_avg is expected to spike even if the additional cpu time is going to a background task. 

Any concerns?

*https://chromium.googlesource.com/chromium/src/+/8a955ff676e6eecefb1ddc3ce9dd9e1ba78b60cd/docs/speed/benchmark/harnesses/power_perf.md

Cc: brucedaw...@chromium.org tdres...@chromium.org
+Bruce, Tim: Does this type of work seem reasonable? Perhaps a performance review of Patrick's design doc is warranted, unless someone else is already owning that.

Note also that this feature is going to obscure other performance regressions. All our benchmarks start Chrome and then immediately start running a performance test. So your background code is going to run for all of that. That will change the performance characteristics of the test run: I assume your feature will eat up any free cycles that are available within the first 120 DLLs * .5 seconds per DLL = ~60 seconds. Unless there is some reason to assume that your feature will not do this, then we will need to come up with some plan to not invalidate all the performance tests that we run. Patrick, could you please take the lead on this?
Cc: charliea@chromium.org
Cc: sullivan@chromium.org
Cc: gab@chromium.org
+gab: maybe this is something Lucky Luke can help with? See #25 and #27.
Cc: nednguyen@chromium.org
Where is the design doc for this feature? 0.5 seconds / DLL is heavier than I was initially imagining.
The design doc is here but is light on details.

https://docs.google.com/document/d/1iCQyTuTRG-2buDso421jjlyGgyWuXRTPPfF7FBICurI/edit?usp=sharing

gab@ mentioned in the past that they were interested in having a way to not run background tasks during startup.

I guess an easy solution to not mess up our benchmarks would be to delay the inspection of the loaded DLLs until a few minutes after startup.
But it sure would be nice if background tasks would not mess up performance benchmarks.
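The delay idea above could also be combined with staggering: rather than posting all inspection tasks back to back, spread them evenly across a window that opens some time after startup. A minimal sketch, where the helper name and the constants are illustrative assumptions:

```cpp
#include <chrono>
#include <vector>

// Hypothetical scheduling helper: compute a post delay for each module so
// that inspections start `initial_delay` after startup and are spaced
// `spacing` apart, instead of running back to back.
std::vector<std::chrono::milliseconds> StaggeredDelays(
    int num_modules,
    std::chrono::milliseconds initial_delay,
    std::chrono::milliseconds spacing) {
  std::vector<std::chrono::milliseconds> delays;
  for (int i = 0; i < num_modules; ++i)
    delays.push_back(initial_delay + spacing * i);
  return delays;
}
```

For example, 120 modules with a two-minute initial delay and 500 ms spacing would keep all of the work clear of a benchmark's startup window, at the cost of taking about a minute longer to finish.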
Background tasks cause CPU contention and power draw, so it makes sense that they impact performance benchmarks.

Thanks for the link to the doc! Can you provide comments access for chromium.org accounts?
Done.

What has been the usual solution for this in the past? I'm probably not the first one to add background work at startup?

Do you need a flag that disables all DLL inspection just for tests?
If these metrics are just saying "more CPU/power is used during startup" I don't think this is a metric we should be tracking.

Startup is a place where we want to use as much CPU as possible. Running tasks at background priority just means we try to yield resources to more important work, but if there are available resources, background tasks *will* run, and this is WAI.

On startup we should aim to utilize 100% CPU so having CPU/power as a metric during startup doesn't make sense IMO.

If delayed/recurring background tasks are interfering with runtime metrics, that's another story. For that, Telemetry bots could use --disable-background-tasks to avoid noise at runtime (but definitely not for timing metrics, as that would defeat the purpose of testing that the scheduler does a good job of organizing work).

For startup, we should only have metrics for timings; if we get there fast, it's not relevant whether we did other things on the side with extra resources.
Re #37, I generally agree with you, but we should also keep track of increases in the total amount of work we're doing. It sounds like this is adding an additional 60 CPU seconds of work to startup, which is pretty hefty.
Hmmm.. That is indeed pretty hefty. So is the idea: we track it, spot everything, and WontFix the things that are reasonable (and this one potentially isn't)?

Comment 40 Deleted

60 seconds is when the tasks run at background priority. I don't think the tasks take 100% of a core when running.

When they are executed at USER_VISIBLE priority, it takes more like 10 to 15 seconds.
10-15 seconds is still a very significant resource tax (not that I'm pronouncing an opinion on whether that tradeoff is worth it).
> On startup we should aim to utilize 100% CPU...

but for as short a time as possible. I know what you mean - that starting up as quickly as possible should be the main goal - but using more CPU time does come with consequences. It wastes battery power, it takes away CPU resources from other programs running on the same computer, it makes the fans spin faster, etc. 10-15 s of CPU time is enough, whether it is spent at startup or later on, that we should evaluate the cost/benefit to make sure it is justified. That CPU time belongs to our customers.

Re #38:

> It sounds like this is adding an additional 60 CPU seconds of work to
> startup, which is pretty hefty.

The work by no means happens during startup. In fact, it's specifically *post* startup. We're already following best practices by using a background-priority sequence and posting the initial task via PostAfterStartupTask. However, we could definitely be more aggressive. We could stagger the DLL inspections, spreading them out in time. But one way or another, the work still needs to be done.

There are a couple ways forward here:

(1) We could introduce a cache of these results. Most modules that are loaded are the same from run to run. I'd expect this to drop the expected cost of this down to effectively zero. Note that a run with a clean profile would still incur this cost, so this might not help out testing/benchmarking.

(2) Explore alternative ways of getting this data. We are using system APIs that, for whatever reason, do a silly amount of work under the hood when we're only asking them to validate a certificate and give us the OU of the signer. There might be alternative APIs or third-party libraries for doing the same thing that are much more efficient.

As to the value of the feature itself I think that ship has sailed. The expected stability gains are on the order of 10-15%. Beyond that the entire effort is also about sending a message to the desktop developer ecosystem about what's kosher and what isn't.
I agree; I'm just saying that scheduling-wise this is WAI. Whether this heavy workload should be scheduled in the first place is another topic. And on that, I take back my claim that these metrics shouldn't exist -- I now see the point of monitoring the total amount of work we do, but I maintain that this isn't a scheduling problem.
It's great that this benchmark caught this change so that this discussion could take place. However, this benchmark is meant to measure media playback performance, so we don't want this DLL inspection code to obscure that. The reason this benchmark starts up Chrome for each new test is so that the tests are independent. A side effect of that is that any work that is done for a new startup of Chrome ends up taking place during the test run. In order to remove that noise, it would be great if Chris's team could provide +Ned with a flag to disable this DLL inspection for the purposes of benchmarking (or just disable the DLL inspection when the existing --enable-benchmarking flag is set).
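The `--enable-benchmarking` gating suggested above could be a one-line check before the inspection tasks are ever posted. The switch name is real, but the helper below is a hedged sketch rather than Chromium's actual command-line handling:

```cpp
#include <string>
#include <vector>

// Illustrative gate: skip scheduling DLL inspection entirely when the
// existing --enable-benchmarking switch is on the command line, so the
// background work cannot perturb benchmark runs.
bool ShouldInspectModules(const std::vector<std::string>& command_line) {
  for (const auto& arg : command_line) {
    if (arg == "--enable-benchmarking")
      return false;  // benchmarking run: keep the workload out of the way
  }
  return true;  // normal run: inspection proceeds as designed
}
```

The tradeoff, raised in the following comments, is that the Chrome being benchmarked then deviates from the Chrome users actually run.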
It's possible that preventing DLL inspection logic from running on the benchmarks is correct, but it's also pretty scary.

The more the Chrome that we test deviates from the Chrome users use in the wild, the more likely it is that performance issues will creep in.

Are we launching DLL inspection via Finch?

Is there any reason not to do (1) from comment #44?
Ned, would that help this case?
For (1) from comment #44, we can always start the benchmark with a prepopulated profile that contains the cached results, though I don't know enough about profiles to judge whether this approach is fragile.

The option of disabling the DLL inspection when the existing --enable-benchmarking flag is set is also good, but I share Tim's concern about the test deviating from what Chrome users are seeing. I would only go with this approach if (1) is unfeasible.
Components: Internals>Media
