Histograms UI_OnCommitProvisionalLoad.Link and UI_OnLoadComplete.Link are inconsistent
Issue description
There is a naive assumption that navigation commits are faster than completed navigations. There is "population" bias here, of course, because not all committed navigations finish loading. Nevertheless, the 95th percentile is roughly:
* 500 seconds for UI_OnCommitProvisionalLoad.Link
* 25 seconds for UI_OnLoadComplete.Link
That still looks inconsistent, even allowing for population bias. UI_OnCommitProvisionalLoad.Link is bimodal, and the second hump is 90 seconds away from the first. That is plenty of time for a user to do something else, which suggests that the recorded interval starts at a click on a link but ends at some unrelated navigation.
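To see how a modest share of mis-paired samples could produce numbers like these, here is a minimal Python sketch. The sample counts, the 10% mis-pairing rate, and the 0.5 s and 90 s values are made up for illustration; they are not taken from the actual histogram data.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile (pct in 1..100) of a non-empty list."""
    ordered = sorted(samples)
    rank = math.ceil(len(ordered) * pct / 100)
    return ordered[rank - 1]


# Hypothetical population: most clicks commit quickly...
correctly_paired = [0.5] * 900
# ...but some commits are paired with a click made about 90 s earlier.
mis_paired = [0.5 + 90.0] * 100

commit_times = correctly_paired + mis_paired

print(percentile(commit_times, 50))  # median stays in the first hump
print(percentile(commit_times, 95))  # 95th percentile lands in the second hump
```

Under these assumptions the median is unaffected, while any percentile above the 90th reports the stale interval rather than a real commit time.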
Mar 9 2016
Note that the start timestamp is issued here (HTMLAnchorElement::handleClick):
https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/core/html/HTMLAnchorElement.cpp&rcl=1457512255&l=337
and is transferred from Blink to content here (SendDidCommitProvisionalLoad):
https://code.google.com/p/chromium/codesearch#chromium/src/content/renderer/render_frame_impl.cc&rcl=1457454493&l=4610
The final metric is recorded in OnDidCommitProvisionalLoad. I'm still not convinced that the start timestamp is not skewed by the fact that we're using a platform timestamp. The only correct clock to initialize the value would be |monotonicallyIncreasingTime|. From the source I can't tell if that's always the case.
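The clock-base concern can be illustrated outside Chromium with a small Python sketch. It uses Python's time.monotonic() as a stand-in for a monotonic clock such as |monotonicallyIncreasingTime|; the Stamp type and the helper names are hypothetical and do not correspond to any Chromium API.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    seconds: float
    clock: str  # hypothetical label: "monotonic" or "wall"


def monotonic_now():
    # Monotonic clock: never goes backwards, arbitrary epoch.
    return Stamp(time.monotonic(), "monotonic")


def wall_now():
    # Wall clock: seconds since the Unix epoch, may jump when adjusted.
    return Stamp(time.time(), "wall")


def elapsed(start, end):
    """Seconds between two stamps; refuses to mix time bases."""
    if start.clock != end.clock:
        raise ValueError(f"cannot subtract {start.clock} from {end.clock}")
    return end.seconds - start.seconds


start = monotonic_now()
end = monotonic_now()
print(elapsed(start, end) >= 0)  # a monotonic interval is never negative
```

Tagging each timestamp with its clock is one possible way to make a mixed-base subtraction fail loudly instead of silently producing a huge or negative duration.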
Mar 29 2016
Mar 30 2016
One more data point: the bimodality of Navigation.UI_OnCommitProvisionalLoad.Intent is quite similar (distance between humps: ~200sec vs. 80sec on _.Link).
re #2: Even though there could be some cross-process time skew, it should not reach _seconds_ to _hundreds_ of seconds for large populations.
bmcquade, csharrison: the metrics show wrong numbers, and they are complex to get right. I am wondering whether the new infra you laid out with PageLoad.* (which looks less multimodal) would 'magically' make these metrics more reliable. WDYT?
Mar 30 2016
From what I remember from landing these metrics, the start timestamp does not actually come from HTMLAnchorElement::handleClick. In fact, it is given as part of the click event, which is built on the browser side by some Android code that handles touch input. This is why we considered at the time that process skew should be a non-issue, since both start and stop would be recorded in the browser process (even though it involves some translation to Blink internals in the middle). And as Egor mentioned, the intent version has the same issue, and both timestamps there are recorded in the browser process.
Mar 30 2016
#5 Ah, that makes sense, thanks for the clarification. That indeed makes me feel better, because I was under the impression that some platform timestamps use a different time base issued from hardware drivers. If we're setting the timestamp ourselves in Android code (presumably using the same monotonic clock as TimeTicks) then that shouldn't be an issue.
#4 The PageLoad.* metrics shaved off most of the bimodality by splitting page loads into "foreground only" and "non foreground only" groups. Pages that load in the background often exhibit different performance, especially wrt delayed painting for background tabs. I'm not sure if that could be an issue here, though the skew range (~hundreds of seconds) seems like it could possibly be time spent in the background. If you look at PageLoad.Timing2.NavigationToFirstPaint.Background, you see a similar distribution. I don't have any evidence in the code that this is the case though.
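As a rough illustration of the foreground/background split described above, here is a minimal Python sketch. It is not the PageLoad.* implementation; the sample values and the group names are invented for the example.

```python
from collections import defaultdict

# (seconds, foreground_only) pairs; all values are hypothetical.
samples = [
    (0.4, True),
    (0.6, True),
    (120.0, False),
    (0.5, True),
    (95.0, False),
    (0.7, True),
]


def split_by_visibility(samples):
    """Group durations by whether the load stayed in the foreground."""
    groups = defaultdict(list)
    for seconds, foreground_only in samples:
        key = "Foreground" if foreground_only else "Background"
        groups[key].append(seconds)
    return dict(groups)


groups = split_by_visibility(samples)
for name, values in sorted(groups.items()):
    # Each group on its own covers a much narrower range than the mix.
    print(name, min(values), max(values))
```

In this toy data the combined list spans 0.4 s to 120 s, while each group taken separately has a single cluster, which is the effect the split is meant to achieve.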
May 19 2016
Have there been any updates on this? Given that we think the metric just may be wrong, should we remove it?
May 19 2016
Should we fix it if we think it is wrong? BTW, I thought we still don't know what went wrong yet.
Jul 1 2016
Ping on this again. Let's either fix it or rip the metric out, we don't want to be collecting misleading information for the next person to find :\
Jul 1 2016
I do think there's some value in tracking the delay between user interaction time and navigation start, which these histograms help to track. Ideally we'd track back even further, to the time the user action first triggered an intent. Not sure if Android provides timestamps with that information as part of intent processing.
It's worth noting that we see a large number of entries in the last bucket of this histogram (>10 minutes), on all platforms.
I think it could be worth wiring up these user action timestamps to the PageLoad.* tracking to see if our filtering (e.g. of non-top-frame navs, navs for non-http/https URLs, and others) makes the strange distributions more believable. It's not totally clear to me how much work it would take to wire these up to PageLoad.*.
To start, I think it's worth marking these histograms as deprecated so they don't show up in the dashboard by default, but I'd also like to try to wire the data into PageLoad.* and see if it looks sane once it goes through our filtering.
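The kind of filtering mentioned above (top-frame only, http/https only) could look roughly like the following Python sketch. The record fields, the sample navigations, and the passes_filter helper are hypothetical; the real PageLoad.* filters apply additional conditions that are not described in this thread.

```python
from urllib.parse import urlsplit


def passes_filter(nav):
    """Keep only main-frame navigations to http/https URLs."""
    scheme = urlsplit(nav["url"]).scheme
    return nav["is_main_frame"] and scheme in ("http", "https")


# Hypothetical navigation records with click-to-commit durations.
navigations = [
    {"url": "https://example.com/", "is_main_frame": True, "seconds": 0.4},
    {"url": "https://example.com/ad", "is_main_frame": False, "seconds": 300.0},
    {"url": "chrome://newtab/", "is_main_frame": True, "seconds": 650.0},
]

kept = [nav["seconds"] for nav in navigations if passes_filter(nav)]
print(kept)
```

Whether the real outliers come from subframe or non-http navigations is exactly the open question; the sketch only shows how such a filter would prune them if they do.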
Sep 20 2017
Comment 1 by pasko@chromium.org, Mar 9 2016