New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 876063 link

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocking:
issue 884705



Sign in to add a comment

Sampling profiler periodic samples are biased towards session resume

Project Member Reported by gab@chromium.org, Aug 20

Issue description

Chrome Version: All
OS: All (that sample)

I was looking @ https://b204354250-dev-wittman-dot-uma-hrd.googleplex.com/p/chrome/callstacks/?sid=eee55403e48b55843a9c487a6abaa630

and realized that the most notified ObserverList according to periodic samples is one that gets invoked when the connection changes (which should be infrequent but does happen on resume).

I then looked specifically at OnResume() stacks : https://b204354250-dev-wittman-dot-uma-hrd.googleplex.com/p/chrome/callstacks/?sid=5715f51661a38041a26df979a9d4fed0

and these should be fast and infrequent events (so even 0.07% is more than expected).

I suspect what's happening is that, on resume, the delay to the next sample has often expired while the machine was sleeping, so it gets scheduled right away and always samples resumes (causing a biased sample).

I think the sample profiler should be a PowerMonitorObsever and not send samples collected in the same range as a resume (waiting a bit before sending a sample to make sure it doesn't get a racy OnResume() notification right after).
 
Description: Show this description
Cc: chengx@chromium.org
Thanks for the bug. I suspected we had bias from samples collected on resume, but hadn't had the chance look into it. PowerObserver seems like an effective solution.
FWIW, resume is also a space where our performance is poor.

Instead of discarding samples on resume I think we should treat it like a whole new profiling phase (akin to startup). It's just wrong to put them in the "periodic" bucket, but the samples are definitely not worthless.
This sounds like something we could be supporting with metadata state to indicate that we're resuming. Do we have a notion in the code of when a resume is complete? If not perhaps we could just tag samples for a dedicated period after a resume.
I don't think there's a notion of "completing"; it's just a bunch of fire-and-forget notifications.

But yeah, arbitrarily capturing the first 30s after a resume makes sense.
I think this issue extends to all remaining samples in the profile, rather than just the next sample. the profiler effectively tries to schedule samples at 300 equal spaced 100ms intervals from the start of profiling. If the computer is suspended, on resume the profiler will try to churn through the backlog of samples that it thinks should have happened in the past.

We probably should just skip over any sample times that are more than one interval in the past.
Cc: wittman@chromium.org erikc...@chromium.org
Issue 884705 has been merged into this issue.
Labels: -Pri-2 Pri-1
Raising priority, given that this is causing Chrome to become unusable for periods of time.
Labels: -Pri-1 Pri-2
I will be OOO and unlikely to get to this until mid-October. Given that the performance impact of this issue is only affecting canary and dev channel I think we can reasonably downgrade to Pri-2.
Blocking: 884705

Sign in to add a comment