
Issue 599630

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Apr 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 1
Type: Bug




Consider tracking scroll start metrics during load separately from post-load

Project Member Reported by rbyers@chromium.org, Mar 31 2016

Issue description

In https://bugs.chromium.org/p/chromium/issues/detail?id=599609#c4 klobag@ argues that anecdotal experience of the scroll start delay is much worse than the "5% of scrolls take longer than 100ms to start" measure we've been talking about publicly.  Perhaps this is because scrolling during load is much worse.

Should we perhaps try to track the metrics during load separately?  Obviously "during load" is a fuzzy concept, but we could at least use one of the approximate signals the loading team uses (even just the document 'load' event), or even just "within the first 5 seconds" or something like that.
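To make the proposal concrete, here is a rough sketch of how scroll start samples might be bucketed by loading state. It is not how UMA records anything: the `ScrollSample` record, its field names, and the fallback order (use the document 'load' event time when known, otherwise the "first 5 seconds" window) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: the "within the first 5 seconds" fallback suggested above.
LOAD_WINDOW_MS = 5000.0


@dataclass
class ScrollSample:
    """Hypothetical record for one scroll gesture (not a real UMA type)."""
    navigation_start_ms: float
    scroll_start_ms: float                 # when the scroll input arrived
    latency_ms: float                      # delay until scrolling actually started
    load_event_ms: Optional[float] = None  # document 'load' event time, if known


def is_during_load(sample: ScrollSample) -> bool:
    """Approximate 'during load' with the load event, else a fixed time window."""
    if sample.load_event_ms is not None:
        return sample.scroll_start_ms < sample.load_event_ms
    return sample.scroll_start_ms - sample.navigation_start_ms < LOAD_WINDOW_MS


def split_by_load_state(samples: List[ScrollSample]) -> Dict[str, List[float]]:
    """Group scroll start latencies into 'during_load' and 'post_load' buckets."""
    buckets: Dict[str, List[float]] = {"during_load": [], "post_load": []}
    for sample in samples:
        key = "during_load" if is_during_load(sample) else "post_load"
        buckets[key].append(sample.latency_ms)
    return buckets


def fraction_over(latencies: List[float], threshold_ms: float = 100.0) -> float:
    """Fraction of scrolls whose start delay exceeds threshold_ms."""
    if not latencies:
        return 0.0
    return sum(1 for value in latencies if value > threshold_ms) / len(latencies)
```

Comparing `fraction_over(...)` for the two buckets would show whether the "longer than 100ms" rate is noticeably higher while the page is loading.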

tdresser@ WDYT?
 
Cc: briander...@chromium.org
Labels: -OS-Linux OS-All
We've discussed generalizing this a bit: ideally we'd be able to break up all metrics by RAIL stage (Response, Animation, Idle, Load).

It's unclear to me what the value is here though. What decisions would this information give us additional insight into?

We already know that the main thread is much more congested during load, and is the source of a significant fraction of scroll jank. If we want to figure out proportionally how much of the time scroll jank is due to page load, I suspect that deep reports would be a better approach.

A metric looking at how busy the main thread is during page load vs. outside of page load would communicate approximately the same thing, but might be more broadly useful. I'm not sure what decisions would rely on that data either.
Agreed it's not clear what having this data buys us beyond just mild interest (for which a one-off deep report analysis is probably better).  This is why we've never bothered breaking this up before - we already know the problem is bad and we're working to solve it in a way that doesn't depend on loading state.

Grace, any thoughts?

I do think we may want to take some aspect of loading state into account for the intervention (issue 599609), but again we're doing lots of experimenting on analysis there where iterating on deep report data makes much more sense than UMA (though I imagine we'll add some UMA metrics once we have a concrete proposal).
I was looking for a number that can guide us in deciding whether we should apply a user intervention or not. Single digit variance always falls into the noise, while we believe there is a real problem in this case. That is why I was looking at loading state only. If we have a plan to break up metrics by RAIL, this request should come for free, right?
We determined that dividing metrics up based on RAIL is pretty difficult until we have trace based UMA metrics, which will be a long time.

"Single digit variance always falls into the noise"
Looking at the data per day, the 95th percentile of our scroll start delay metric for Nexus 5 has a range of at most 15% within a milestone, and some milestones have a range as low as 3%.

I think this is stable enough.

I don't think we'd get a more stable signal looking only at loading. We'd have fewer samples, and we'd probably introduce some noise with whatever heuristic we use to detect that we're loading.
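For reference, the stability check described above could be sketched roughly as follows. This is only an illustration: the thread does not say how the "range" was calculated, so defining it as (max - min) / min over the per-day 95th percentiles is an assumption, as are the function names and the nearest-rank percentile method.

```python
import math
from typing import Dict, Iterable, List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile (assumption: the real dashboards may interpolate)."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def daily_p95(samples_by_day: Dict[str, List[float]]) -> Dict[str, float]:
    """95th percentile of scroll start delay for each day."""
    return {day: percentile(values, 95) for day, values in samples_by_day.items()}


def relative_range(values: Iterable[float]) -> float:
    """Spread of the daily values, defined here (an assumption) as (max - min) / min."""
    items = list(values)
    lowest, highest = min(items), max(items)
    return (highest - lowest) / lowest
```

Calling `relative_range(daily_p95(samples_by_day).values())` for one milestone gives a single figure to compare against the 3% to 15% quoted above, and running the same calculation on a load-only subset would show whether that signal is any more stable.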
IMHO we have enough data to say we should definitely be doing an intervention here. Even the 99th percentile is very stable (e.g. very little variation day-to-day on stable channel for a given popular device).  And when we're talking about something as bad as a >300ms stall, even occurring in 1% of scrolls it is hugely disruptive to the overall experience.

So tl;dr, we've known for over a year that we have a serious problem here, and we've been executing on a long-term plan to address it while minimizing developer pain (shipping "touch-action" a couple of years back was the first big step).  And we're finally now (with passive event listeners shipping) at the point where an intervention is practical, and that's the top priority for input-dev.  I don't think more UMA stats on this would change our plan/priorities.

Comment 6 Deleted

I am fine with closing this as WontFix.
Status: WontFix (was: Assigned)
