New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 627585 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Jul 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

Standardize which page loads are tracked across all page load metrics observers

Project Member Reported by bmcquade@chromium.org, Jul 12 2016

Issue description

The current PLM implementation has a concept of 'renderer tracked' which indicates whether the given page load is expected to report metrics from the render process. Determining if a page load is renderer tracked duplicates logic that's also enforced in the render process. The current implementation is:

bool IsRelevantNavigation(content::NavigationHandle* navigation_handle,
                          const GURL& browser_url,
                          const std::string& mime_type) {
  DCHECK(navigation_handle->HasCommitted());
  return navigation_handle->IsInMainFrame() &&
         !navigation_handle->IsSamePage() &&
         !navigation_handle->IsErrorPage() &&
         navigation_handle->GetURL().SchemeIsHTTPOrHTTPS() &&
         browser_url.SchemeIsHTTPOrHTTPS() &&
         (mime_type == "text/html" || mime_type == "application/xhtml+xml");
}

If a page meets these criteria, it is tracked (invokes PageLoadMetricsObserver callbacks such as OnComplete) and also reports timing metrics.

If a page does not meet these criteria, it is still tracked (invokes PageLoadMetricsObserver callbacks such as OnComplete) but does't report timing metrics.

Many page load observers filter out non-tracked navigations by returning early in OnComplete if the provided PageLoadTiming is empty, but this is non-obvious so some observers forget to do this, which results in them tracking e.g. non-HTML pages, but not receiving any metrics for those pages. See https://bugs.chromium.org/p/chromium/issues/detail?id=627536 for one example.

We should standardize the policy for which page loads should be tracked, and either track the page and expect metrics to be reported from the render process, or not track the page load at all (do not call OnComplete or any other callbacks).
 
Based on feedback from other observer implementers, it sounds like there's not a desire to track the following:
1. error pages (4xx, 5xx)
2. non-HTTP/HTTPS pages
3. non-HTML pages

There is interest in tracking prerenders once they become visible, but we'll continue to ignore them for now, and solve that outside of the scope of this change.

Based on the above, we'll update DidStartNavigation and DidFinishNavigation to check to see if the given NavigationHandle meets any of the above criteria for not being tracked, and if so, we'll either:
* In DidStartNavigation, not create a PageLoadTracker
* In DidFinishNavigation, we'll tell the PageLoadTracker that it should stop tracking metrics. If a PageLoadTracker has stopped tracking metrics, it will not invoke additional callbacks such as OnComplete on observers.
I'm nervous about this. We need to keep creating PageLoadTrackers for reliable abort metrics.

I think I'm happy with omitting OnComplete though.
Interesting, thanks! Can you remind me why we need to create trackers for abort metrics to work properly?
For instance, if a navigation to an image aborts another navigation, we won't be able to log accurate metrics if we don't create a PageLoadTracker after this CL lands:
https://codereview.chromium.org/2132603002
Ok, I think this may still end up working. In DidStartNavigation, the only properties we can use when deciding whether or not to track are whether the provisional URL is HTTP or HTTPS. We don't yet know whether the navigation is an error page or what its mime type is (we learn those at commit).

So we end up creating a PageLoadTracker in DidStartNavigation using basically the same policy we do now, with the one addition that we will not track if the provisional URL is HTTP or HTTPS. I think this seems reasonable though perhaps we want to keep it around in case a non-HTTP/HTTPS tracker commits and aborts a tracked load. What do you think?

Then in DidFinishNavigation, if we created a PageLoadTracker for the navigation in DidStartNavigation, we check to see if it's a non-HTTP/HTTPS page, an error page, or a non-HTML page. If so, we tell the PageLoadTracker to stop tracking metrics (don't call OnComplete).

One question I have: for aborts to work properly, do we need to keep these PageLoadTrackers that we tell to stop tracking metrics around after their navigations commit? My sense is that we don't, but I want to double check.
We're coming to a difficult question of "What pages do we want to log abort metrics for?" If we want to keep the same filtering it becomes hard to merge provisional and committed loads.

This implies we can't throw away PageLoadTrackers after commit.


My vision of the future here is that we no longer do any filtering in the renderer process (except for main frame), and only do filtering at one place, OnComplete dispatching (maybe).

This approach, plus limiting filtering of abort metrics will mean we don't need to separate BeforeCommit/AfterCommit abort variants.
I like the idea of not filtering in the render process, but in practice, if there are some page loads that none of our consumers are interested in, I don't think we should be reporting metrics for those pages in the browser process, and I think it's reasonable to ignore them in the render process as well. I do want to optimize for observers being unlikely to make tracking mistakes - the https engagement observer is one example of a tracker that's recording the wrong data due to the subtle misunderstanding around whether we call OnComplete for non-HTML pages.

I'm also less sure it's the right thing to track all aborts after commit, even if it's an error page / non HTML page / etc. That means those abort metrics aren't comparable with our core timing metrics which do want to filter these types of pages out.

Agree that the pre/post commit abort separation is undesirable though - agree we should move away from that because it's confusing to metric consumers. Perhaps it's reasonable to include all page loads before commit in our abort analysis, even if error page or non-HTML page, since neither we nor the user knows yet if it's going to be an error page, so it's still a 'valid' navigation at that point. Similarly, since the user experience before commit is the same for HTML and non HTML loads, maybe we want to include those in our metrics as well. Then post commit we start filtering those types of loads because they're not the kinds of pages that our team is interested in tracking. I'd also want to keep a counter of the number of commits we filter out due to the resource being non-HTML, error, etc, so we have an understanding of the magnitude of difference in volume as a result of our filtering. WDYT?
Your suggestion sounds good. I think I was struggling to find a good way to merge the provisional and committed aborts, but I think it is fine to merge them despite the latter coming from a filtered dataset. Your point about it appearing to the user the same is valid.
Thanks! Thinking about this a bit more, I tried to put together a summary of when PLMO callbacks are invoked, which I documented here:

https://docs.google.com/document/d/1HJsJ5W2H_3qRdqPAUgAEo10AF8gXPTXZLUET4X4_sII/edit?pref=2&pli=1#heading=h.44uujub59vi3

One thing that I think is confusing currently is that OnComplete is invoked for every PageLoadTracker, which means it gets called in a variety of contexts:
* successfully committed tracked load
* failed provisional load, or aborted provisional load
* error page navigation
* non-HTML page navigation
* same page navigation!

I'd like to provide separate callbacks for the various end states, so implementers can be more aware of which completion state they reached. I think we want completion callbacks for:
* successfully committed tracked load: call OnComplete
* failed provisional load: call OnFailedProvisionalLoad (we do this currently), but do not call OnComplete
* same page navigation: do not invoke any callbacks
* aborted provisional load: call OnAbortedProvisionalLoad (new callback we need to add). Unclear if we should also call OnFailedProvisionalLoad in this case, or only call one or the other. I think I'm inclined to call one or the other.
* error page navigation: do not invoke any callbacks (could add OnError in the future if someone needs it)
* non-HTML page navigation: do not invoke any callbacks (could add a callback for these if someone needs it)

I'm interested to get your feedback here. Thanks!
* aborted provisional load: call OnAbortedProvisionalLoad (new callback we need to add). Unclear if we should also call OnFailedProvisionalLoad in this case, or only call one or the other. I think I'm inclined to call one or the other.
 I vote for using OnFailedProvisionalLoad. We should make sure that information is passed to differentiate an abort and a failure.

The rest sound good. By making all callback only on committed, filtered loads we really make the API make sense. The only other we'll have is OnFailedProvisionalLoad I believe.
Agree that calling OnFailedProvisionalLoad in both of these cases makes sense.

OnFailedProvisionalLoad currently takes a NavigationHandle, but aborted provisional loads are often called back asynchronously, so we can't hang on to the NavigationHandle, which means the OnFailedProvisionalLoad args would have to change.

We only currently have a single implementer of this method, the core observer, which uses navigation_handle->GetNetErrorCode() and navigation_handle->NavigationStart().

We could record nav start in OnStart instead, and we could pass a PageLoadExtraInfo and an error code as args into OnFailedProvisionalLoad.

WDYT?
Yeah that sounds fine to me.
Great. I'm going to split this into 2 changes:
1. update the logic to decide which page loads are tracked, to filter out uninteresting page loads
2. update the OnComplete / OnFailedProvisionalLoad behavior, so we call one or the other on PageLoadTracker destruction
Blocking: -627536
Project Member

Comment 15 by bugdroid1@chromium.org, Jul 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/26615f40a69f11cee759a0b546061ea84fff8ca8

commit 26615f40a69f11cee759a0b546061ea84fff8ca8
Author: bmcquade <bmcquade@chromium.org>
Date: Fri Jul 15 17:00:23 2016

Standardize which page loads are tracked

The current PLM implementation has a concept of 'renderer tracked' which indicates
whether the given page load is expected to report metrics from the render process.

Many page load observers filter out non-tracked navigations by returning early in
OnComplete if the provided PageLoadTiming is empty, but this is non-obvious so some
observers forget to do this, which results in them tracking e.g. non-HTML pages, but
not receiving any metrics for those pages. See
https://bugs.chromium.org/p/chromium/issues/detail?id=627536 for one example.

Based on feedback from other observer implementers, there's a consistent request to
avoid tracking the following:
1. error pages (4xx, 5xx)
2. non-HTTP/HTTPS pages
3. non-HTML pages

Based on the above, this change gets rid of the 'renderer tracked' intermediate
state, and updates DidStartNavigation and DidFinishNavigation to check to see if
the given NavigationHandle meets any of the above criteria for not being tracked,
and if so:
* In DidStartNavigation, we do not create a PageLoadTracker
* In DidFinishNavigation, we'll tell the existing PageLoadTracker (if any) that it
should stop tracking metrics. If a PageLoadTracker has stopped tracking metrics, it
will not invoke additional callbacks such as OnComplete on observers.

Note that certain criteria, such as whether the page is HTML, or an error page,
(or even a same page navigation) are not known until commit, and thus we begin
tracking such page loads in DidStartNavigation but stop tracking at commit time.

BUG= 627585 

Review-Url: https://codereview.chromium.org/2139143002
Cr-Commit-Position: refs/heads/master@{#405774}

[modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc
[modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/chrome/browser/page_load_metrics/observers/https_engagement_metrics/https_engagement_page_load_metrics_observer_browsertest.cc
[modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/chrome/browser/page_load_metrics/page_load_metrics_browsertest.cc
[modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/browser/metrics_web_contents_observer.cc
[modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/browser/metrics_web_contents_observer.h
[modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/browser/metrics_web_contents_observer_unittest.cc
[modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/browser/page_load_metrics_observer.h
[modify] https://crrev.com/26615f40a69f11cee759a0b546061ea84fff8ca8/components/page_load_metrics/common/page_load_timing.h

Project Member

Comment 16 by bugdroid1@chromium.org, Jul 20 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0259d67099838b4b0f2aabcbd3108e9952d56204

commit 0259d67099838b4b0f2aabcbd3108e9952d56204
Author: bmcquade <bmcquade@chromium.org>
Date: Wed Jul 20 01:56:17 2016

Refactor PageLoadMetricsObserver completion callback policy

This patch modifies PageLoadMetricsObserver to invoke OnComplete for tracked
loads that committed, and OnFailedProvisionalLoad for tracked loads that did
not commit.

This makes it more straightforward for implementers to reason about which method
to override, and encourages separation of code for the different states into
methods which makes observer code more maintainable.

This patch also refactors the various observers to implement these methods as
needed.

BUG= 627585 

Review-Url: https://codereview.chromium.org/2152683004
Cr-Commit-Position: refs/heads/master@{#406454}

[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer.cc
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer.h
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.cc
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.h
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/document_write_page_load_metrics_observer.cc
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.h
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/chrome/browser/page_load_metrics/page_load_metrics_browsertest.cc
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/components/page_load_metrics/browser/metrics_web_contents_observer.cc
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/components/page_load_metrics/browser/metrics_web_contents_observer.h
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/components/page_load_metrics/browser/page_load_metrics_observer.cc
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/components/page_load_metrics/browser/page_load_metrics_observer.h
[modify] https://crrev.com/0259d67099838b4b0f2aabcbd3108e9952d56204/tools/metrics/histograms/histograms.xml

Status: Fixed (was: Started)

Sign in to add a comment