Standardize which page loads are tracked across all page load metrics observers |
|||
Issue description
The current PLM implementation has a concept of 'renderer tracked' which indicates whether the given page load is expected to report metrics from the render process. Determining if a page load is renderer tracked duplicates logic that's also enforced in the render process. The current implementation is:
bool IsRelevantNavigation(content::NavigationHandle* navigation_handle,
const GURL& browser_url,
const std::string& mime_type) {
DCHECK(navigation_handle->HasCommitted());
return navigation_handle->IsInMainFrame() &&
!navigation_handle->IsSamePage() &&
!navigation_handle->IsErrorPage() &&
navigation_handle->GetURL().SchemeIsHTTPOrHTTPS() &&
browser_url.SchemeIsHTTPOrHTTPS() &&
(mime_type == "text/html" || mime_type == "application/xhtml+xml");
}
If a page meets these criteria, it is tracked (invokes PageLoadMetricsObserver callbacks such as OnComplete) and also reports timing metrics.
If a page does not meet these criteria, it is still tracked (invokes PageLoadMetricsObserver callbacks such as OnComplete) but does't report timing metrics.
Many page load observers filter out non-tracked navigations by returning early in OnComplete if the provided PageLoadTiming is empty, but this is non-obvious so some observers forget to do this, which results in them tracking e.g. non-HTML pages, but not receiving any metrics for those pages. See https://bugs.chromium.org/p/chromium/issues/detail?id=627536 for one example.
We should standardize the policy for which page loads should be tracked, and either track the page and expect metrics to be reported from the render process, or not track the page load at all (do not call OnComplete or any other callbacks).
,
Jul 12 2016
I'm nervous about this. We need to keep creating PageLoadTrackers for reliable abort metrics. I think I'm happy with omitting OnComplete though.
,
Jul 12 2016
Interesting, thanks! Can you remind me why we need to create trackers for abort metrics to work properly?
,
Jul 12 2016
For instance, if a navigation to an image aborts another navigation, we won't be able to log accurate metrics if we don't create a PageLoadTracker after this CL lands: https://codereview.chromium.org/2132603002
,
Jul 12 2016
Ok, I think this may still end up working. In DidStartNavigation, the only properties we can use when deciding whether or not to track are whether the provisional URL is HTTP or HTTPS. We don't yet know whether the navigation is an error page or what its mime type is (we learn those at commit). So we end up creating a PageLoadTracker in DidStartNavigation using basically the same policy we do now, with the one addition that we will not track if the provisional URL is HTTP or HTTPS. I think this seems reasonable though perhaps we want to keep it around in case a non-HTTP/HTTPS tracker commits and aborts a tracked load. What do you think? Then in DidFinishNavigation, if we created a PageLoadTracker for the navigation in DidStartNavigation, we check to see if it's a non-HTTP/HTTPS page, an error page, or a non-HTML page. If so, we tell the PageLoadTracker to stop tracking metrics (don't call OnComplete). One question I have: for aborts to work properly, do we need to keep these PageLoadTrackers that we tell to stop tracking metrics around after their navigations commit? My sense is that we don't, but I want to double check.
,
Jul 13 2016
We're coming to a difficult question of "What pages do we want to log abort metrics for?" If we want to keep the same filtering it becomes hard to merge provisional and committed loads. This implies we can't throw away PageLoadTrackers after commit. My vision of the future here is that we no longer do any filtering in the renderer process (except for main frame), and only do filtering at one place, OnComplete dispatching (maybe). This approach, plus limiting filtering of abort metrics will mean we don't need to separate BeforeCommit/AfterCommit abort variants.
,
Jul 13 2016
I like the idea of not filtering in the render process, but in practice, if there are some page loads that none of our consumers are interested in, I don't think we should be reporting metrics for those pages in the browser process, and I think it's reasonable to ignore them in the render process as well. I do want to optimize for observers being unlikely to make tracking mistakes - the https engagement observer is one example of a tracker that's recording the wrong data due to the subtle misunderstanding around whether we call OnComplete for non-HTML pages. I'm also less sure it's the right thing to track all aborts after commit, even if it's an error page / non HTML page / etc. That means those abort metrics aren't comparable with our core timing metrics which do want to filter these types of pages out. Agree that the pre/post commit abort separation is undesirable though - agree we should move away from that because it's confusing to metric consumers. Perhaps it's reasonable to include all page loads before commit in our abort analysis, even if error page or non-HTML page, since neither we nor the user knows yet if it's going to be an error page, so it's still a 'valid' navigation at that point. Similarly, since the user experience before commit is the same for HTML and non HTML loads, maybe we want to include those in our metrics as well. Then post commit we start filtering those types of loads because they're not the kinds of pages that our team is interested in tracking. I'd also want to keep a counter of the number of commits we filter out due to the resource being non-HTML, error, etc, so we have an understanding of the magnitude of difference in volume as a result of our filtering. WDYT?
,
Jul 13 2016
Your suggestion sounds good. I think I was struggling to find a good way to merge the provisional and committed aborts, but I think it is fine to merge them despite the latter coming from a filtered dataset. Your point about it appearing to the user the same is valid.
,
Jul 13 2016
Thanks! Thinking about this a bit more, I tried to put together a summary of when PLMO callbacks are invoked, which I documented here: https://docs.google.com/document/d/1HJsJ5W2H_3qRdqPAUgAEo10AF8gXPTXZLUET4X4_sII/edit?pref=2&pli=1#heading=h.44uujub59vi3 One thing that I think is confusing currently is that OnComplete is invoked for every PageLoadTracker, which means it gets called in a variety of contexts: * successfully committed tracked load * failed provisional load, or aborted provisional load * error page navigation * non-HTML page navigation * same page navigation! I'd like to provide separate callbacks for the various end states, so implementers can be more aware of which completion state they reached. I think we want completion callbacks for: * successfully committed tracked load: call OnComplete * failed provisional load: call OnFailedProvisionalLoad (we do this currently), but do not call OnComplete * same page navigation: do not invoke any callbacks * aborted provisional load: call OnAbortedProvisionalLoad (new callback we need to add). Unclear if we should also call OnFailedProvisionalLoad in this case, or only call one or the other. I think I'm inclined to call one or the other. * error page navigation: do not invoke any callbacks (could add OnError in the future if someone needs it) * non-HTML page navigation: do not invoke any callbacks (could add a callback for these if someone needs it) I'm interested to get your feedback here. Thanks!
,
Jul 13 2016
* aborted provisional load: call OnAbortedProvisionalLoad (new callback we need to add). Unclear if we should also call OnFailedProvisionalLoad in this case, or only call one or the other. I think I'm inclined to call one or the other. I vote for using OnFailedProvisionalLoad. We should make sure that information is passed to differentiate an abort and a failure. The rest sound good. By making all callback only on committed, filtered loads we really make the API make sense. The only other we'll have is OnFailedProvisionalLoad I believe.
,
Jul 13 2016
Agree that calling OnFailedProvisionalLoad in both of these cases makes sense. OnFailedProvisionalLoad currently takes a NavigationHandle, but aborted provisional loads are often called back asynchronously, so we can't hang on to the NavigationHandle, which means the OnFailedProvisionalLoad args would have to change. We only currently have a single implementer of this method, the core observer, which uses navigation_handle->GetNetErrorCode() and navigation_handle->NavigationStart(). We could record nav start in OnStart instead, and we could pass a PageLoadExtraInfo and an error code as args into OnFailedProvisionalLoad. WDYT?
,
Jul 13 2016
Yeah that sounds fine to me.
,
Jul 15 2016
Great. I'm going to split this into 2 changes: 1. update the logic to decide which page loads are tracked, to filter out uninteresting page loads 2. update the OnComplete / OnFailedProvisionalLoad behavior, so we call one or the other on PageLoadTracker destruction
,
Jul 15 2016
,
Jul 15 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/26615f40a69f11cee759a0b546061ea84fff8ca8 commit 26615f40a69f11cee759a0b546061ea84fff8ca8 Author: bmcquade <bmcquade@chromium.org> Date: Fri Jul 15 17:00:23 2016 Standardize which page loads are tracked The current PLM implementation has a concept of 'renderer tracked' which indicates whether the given page load is expected to report metrics from the render process. Many page load observers filter out non-tracked navigations by returning early in OnComplete if the provided PageLoadTiming is empty, but this is non-obvious so some observers forget to do this, which results in them tracking e.g. non-HTML pages, but not receiving any metrics for those pages. See https://bugs.chromium.org/p/chromium/issues/detail?id=627536 for one example. Based on feedback from other observer implementers, there's a consistent request to avoid tracking the following: 1. error pages (4xx, 5xx) 2. non-HTTP/HTTPS pages 3. non-HTML pages Based on the above, this change gets rid of the 'renderer tracked' intermediate state, and updates DidStartNavigation and DidFinishNavigation to check to see if the given NavigationHandle meets any of the above criteria for not being tracked, and if so: * In DidStartNavigation, we do not create a PageLoadTracker * In DidFinishNavigation, we'll tell the existing PageLoadTracker (if any) that it should stop tracking metrics. If a PageLoadTracker has stopped tracking metrics, it will not invoke additional callbacks such as OnComplete on observers. Note that certain criteria, such as whether the page is HTML, or an error page, (or even a same page navigation) are not known until commit, and thus we begin tracking such page loads in DidStartNavigation but stop tracking at commit time. BUG= 627585 Review-Url: https://codereview.chromium.org/2139143002 Cr-Commit-Position: refs/heads/master@{#405774} [modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc [modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/chrome/browser/page_load_metrics/observers/https_engagement_metrics/https_engagement_page_load_metrics_observer_browsertest.cc [modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/chrome/browser/page_load_metrics/page_load_metrics_browsertest.cc [modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/browser/metrics_web_contents_observer.cc [modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/browser/metrics_web_contents_observer.h [modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/browser/metrics_web_contents_observer_unittest.cc [modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/browser/page_load_metrics_observer.h [modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/common/page_load_timing.h
,
Jul 20 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0259d67099838b4b0f2aabcbd3108e9952d56204 commit 0259d67099838b4b0f2aabcbd3108e9952d56204 Author: bmcquade <bmcquade@chromium.org> Date: Wed Jul 20 01:56:17 2016 Refactor PageLoadMetricsObserver completion callback policy This patch modifies PageLoadMetricsObserver to invoke OnComplete for tracked loads that committed, and OnFailedProvisionalLoad for tracked loads that did not commit. This makes it more straightforward for implementers to reason about which method to override, and encourages separation of code for the different states into methods which makes observer code more maintainable. This patch also refactors the various observers to implement these methods as needed. BUG= 627585 Review-Url: https://codereview.chromium.org/2152683004 Cr-Commit-Position: refs/heads/master@{#406454} [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer.cc [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer.h [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.cc [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.h [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/document_write_page_load_metrics_observer.cc [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.h [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/page_load_metrics_browsertest.cc [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/components/page_load_metrics/browser/metrics_web_contents_observer.cc [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/components/page_load_metrics/browser/metrics_web_contents_observer.h [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/components/page_load_metrics/browser/page_load_metrics_observer.cc [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/components/page_load_metrics/browser/page_load_metrics_observer.h [modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/tools/metrics/histograms/histograms.xml
,
Jul 29 2016
|
|||
►
Sign in to add a comment |
|||
Comment 1 by bmcquade@chromium.org
, Jul 12 2016