Navigation Timing (performance.timing) intervals are below 0 for a high % of modern browsers
Issue description
Chrome Version : all versions
URLs (if applicable) : multiple URLs
Other browsers tested: all browsers
Add OK or FAIL, along with the version, after other browsers where you
have tested this issue:
Safari: OK
Firefox: FAIL
Edge: OK
What steps will reproduce the problem?
(1) Develop a pipeline that reports + sends back performance.timing entries
(2) Use the entry to compute certain intervals
(3) Observe that a high percentage of entries are reporting intervals that do not appear to make sense
What is the expected result?
The Navigation Timing interface lets you calculate intervals based upon a timeline. The following intervals should never be reported as below 0:
- server response time, calculated as performance.timing.responseStart - performance.timing.navigationStart
- navigation to fetch time, calculated as performance.timing.fetchStart - performance.timing.navigationStart
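The two intervals above can be computed, and negative values flagged, with a small helper. This is only a minimal sketch, not the pipeline used for this report: `computeIntervals` and the `negative` field are hypothetical names, and the function accepts any object shaped like `performance.timing` so that it can also be run outside a browser.

```javascript
// Hypothetical helper: computes the two intervals from a performance.timing-like
// object and lists which of them came out below 0.
function computeIntervals(timing) {
  const intervals = {
    serverResponseTime: timing.responseStart - timing.navigationStart,
    navigationToFetch: timing.fetchStart - timing.navigationStart,
  };
  // Names of the intervals that violate the expected ordering.
  const negative = Object.keys(intervals).filter((name) => intervals[name] < 0);
  return { intervals, negative };
}

// In a browser this would be called as computeIntervals(performance.timing).
// The sample below is made-up data with responseStart before navigationStart.
const sample = { navigationStart: 1000, fetchStart: 1005, responseStart: 990 };
console.log(computeIntervals(sample));
```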
What happens instead?
The Boq Web infrastructure uses performance data to compute intervals for latency metrics. We see that several browser permutations report these intervals as below 0 a non-negligible percentage of the time. I analyzed data from Friday 8/24/18 across tens of billions of reports to find these issues.
- server response time:
  - Older Chrome versions (18-21) on Windows Desktop report below 0 3-13% of the time
  - Newer Chrome versions on Windows Desktop report below 0 around 1% of the time
  - On Android, the rate depends on the OS version: just under 0.5% (7.1), 1.5% (7.0), and over 9% (5.0)
- navigation to fetch time:
  - Newer Chrome versions on Windows report below 0 between 2% and 5% of the time
  - Android 5.0 is the only Android version that reports below 0 more than 0.5% of the time, at about 1.5%
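Rates like the ones above can be derived by grouping reports per browser/OS bucket and counting how many intervals are below 0. The following is a rough sketch under assumptions: the report shape (`browser`, `os`, `interval`) and the name `negativeRateByBucket` are hypothetical, since the actual schema of the reports is not described here.

```javascript
// Hypothetical report shape: { browser: string, os: string, interval: number }.
function negativeRateByBucket(reports) {
  const buckets = new Map();
  for (const { browser, os, interval } of reports) {
    const key = `${browser} / ${os}`;
    const bucket = buckets.get(key) || { total: 0, negative: 0 };
    bucket.total += 1;
    if (interval < 0) bucket.negative += 1;
    buckets.set(key, bucket);
  }
  // Convert the counts into the percentage of reports below 0 per bucket.
  const rates = {};
  for (const [key, { total, negative }] of buckets) {
    rates[key] = (100 * negative) / total;
  }
  return rates;
}
```

Keeping the raw counts next to the percentage would also make it easier to tell a genuinely bad bucket from one with very few reports.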
We have not been able to reproduce this locally so far, and I was hoping that browser vendors would know about this issue or have tools to investigate it further.
Some info on why we think this is concerning:
- navigationStart should be the earliest timestamp in performance.timing. It is the event that kicks off all the other events.
  - For example, here is a diagram that matches my understanding of the ordering of these events, with navStart at the beginning:
    https://nicj.net/navigationtiming-in-practice/
- Latency metrics measured server-side would never have issues like SRT < 0 (SRT, server response time, is a metric intended to catch slow servers).
- Website owners who are considering migrating to our framework and have existing server-side solutions may see the client-side approach as flaky / inferior when modern browsers produce so many results that have to be discarded.
All the data I used for this report is at https://docs.google.com/spreadsheets/d/1gNiwIc5Y6prz9iWgU9Ilr-m3wcyf2UUILKL3FpnSsNU/edit#gid=208180716
- Additionally I am working on a high-level report which will reference the data as well as this bug, and will update here when done.
I mentioned that Firefox also fails this check. To clarify, only older versions of Firefox (24.0, 32.0, and 48.0) report negative intervals at a non-negligible rate; current versions do not appear to have this issue.
Let me know if there is any other information I could include that would make this more helpful or actionable.
Thanks!
,
Aug 29
Are there specific websites where this happens, or is it flaky and happening on any website? Also, does this bug have any confidential information other than the doc which already has restricted access? Otherwise I can remove the Restrict-View-Google label.
,
Aug 29
It's flaky + happening on any website. I don't see any confidential info other than the doc.
,
Aug 30
,
Nov 20
Hello, FWIW I also noticed unexpected negative values in our analytics, when mixing nav timing + JS timestamps.
At the top of the HTML, I have an inline script:
window._myStartTime = performance.now()
and later I take a measurement with code that can be simplified to:
Math.round( window._myStartTime - performance.getEntriesByType('navigation')[0]['requestStart'] )
For about 0.01% of traffic this yields negative values.
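For analytics, one option is to wrap that computation so that negative or missing results are set aside instead of being mixed into the metric. A minimal sketch, assuming that discarding such samples is acceptable; `deltaSinceRequestStart` is a hypothetical name, and the function takes its inputs as parameters so it can be exercised outside a browser.

```javascript
// Hypothetical helper: returns the rounded delta, or null when the navigation
// entry is missing or the result is below 0, so bad samples can be counted
// separately from valid ones.
function deltaSinceRequestStart(startTime, navEntry) {
  if (!navEntry || typeof navEntry.requestStart !== "number") return null;
  const delta = Math.round(startTime - navEntry.requestStart);
  return delta < 0 ? null : delta;
}

// In a browser:
// deltaSinceRequestStart(window._myStartTime,
//                        performance.getEntriesByType('navigation')[0]);
```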
When it comes to Chrome, I found two "buckets" of negative values:
1. Huge negative values on Windows 7 and Windows 8.1, on 64-bit systems.
(looks like some kind of overflow?)
2. Slightly negative values on iOS
(probably an upstream WebKit bug?)
Sample raw data:
Chrome 67.0.3396.99 Windows 8.1 -8,589,934,027
Chrome 70.0.3538.77 Windows 8.1 -4,294,967,060
Chrome 70.0.3538.102 Windows 8.1 -4,294,967,053
Chrome 70.0.3538.77 Windows 8.1 -4,294,966,165
Chrome 70.0.3538.102 Windows 8.1 -4,294,965,704
Chrome 69.0.3497.100 Windows 7 -4,294,965,393
Chrome 68.0.3440.106 Windows 7 -4,294,957,928
Chrome 70.0.3538.102 Windows 8.1 -4,294,955,077
Chrome 70.0.3538.102 Windows 8.1 -4,294,953,626
Chrome Mobile 63.0.3239.73 iOS 9 -12,356
Chrome Mobile 63.0.3239.73 iOS 9 -11,995
Chrome Mobile 68.0.3440.83 iOS 12 -1,382
Chrome Mobile 70.0.3538.75 iOS 11 -1,294
Chrome Mobile 69.0.3497.105 iOS 12 -1,21
Chrome Mobile 70.0.3538.75 iOS 10 -938
Chrome Mobile 58.0.3029.113 iOS 9 -281
Chrome Mobile 61.0.3163.73 iOS 9 -64
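One way to probe the overflow guess for the huge values is to split each one into the nearest multiple of 2^32 (4,294,967,296) and whatever is left over. This is only a sketch for exploring the data: `splitAroundTwoPow32` is a hypothetical name, and a small remainder is suggestive of 32-bit wraparound but does not establish it as the cause.

```javascript
const TWO_POW_32 = 2 ** 32; // 4,294,967,296

// Hypothetical helper: splits a value into the nearest multiple of 2^32
// ("wraps") and the remainder left after removing that multiple.
function splitAroundTwoPow32(value) {
  // "|| 0" normalizes a rounded -0 to 0 for small negative inputs.
  const wraps = Math.round(value / TWO_POW_32) || 0;
  return { wraps, remainder: value - wraps * TWO_POW_32 };
}

// A few values taken from the sample data above.
for (const v of [-8589934027, -4294967060, -4294953626, -12356]) {
  console.log(v, splitAroundTwoPow32(v));
}
```

For the samples above, -4,294,967,060 splits into -1 × 2^32 plus 236, and -8,589,934,027 into -2 × 2^32 plus 565, while the iOS values are nowhere near a multiple of 2^32 and come back unchanged.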
,
Nov 20
Regarding my previous comment: the small negative values seem to show up only on WebKit derivatives (Safari, Mobile Safari, Chrome iOS etc.), whereas huge negative values only happen on Chrome on Windows <= 8.1.
Comment 1 by ricea@chromium.org, Aug 29
Labels: Pri-2 Type-Bug