WebContentsObserverSanityChecker assertions can be violated with a loading OOPIF, two DidStartLoading events in a row |
|||
Issue description
I'm not sure if this is really a bug -- it might just an overzealous assertion in WebContentsObserverSanityChecker -- but I just want to dump some state since it's Friday afternoon.
WebContentsObserverSanityChecker asserts that DidStartLoading should never be called twice in a row without DidStopLoading being called in between, but that assumption can be violated when the page has an OOPIF that has not finished loading at the time that the top frame process crashes. In this case, the top frame gets marked as not loading when RenderProcessGone fires, but the subframe never gets marked as not loading, so the WebContents never fires DidStopLoading to its observers.
Here's a unit test I wrote that illustrates this (the fact that it's in ContentSubresourceFilterThrottleManagerTest is unimportant, that's just where I happened to run into this):
TEST_P(ContentSubresourceFilterThrottleManagerTest,
SubframeTest) {
NavigateAndCommitMainFrame(GURL(kTestURLWithActivation));
CreateSubframeWithTestNavigation(
GURL("https://www.example.first/"), main_rfh());
SimulateStartAndExpectResult(content::NavigationThrottle::PROCEED);
SimulateCommitAndExpectResult(content::NavigationThrottle::PROCEED);
process()->SimulateCrash();
NavigateAndCommitMainFrame(GURL("https://example.reset"));
}
This test hits the DCHECK in WebContentsObserverSanityCheck::DidStartLoading during the final navigation to example.reset.
To check that it can happen outside of tests, I hacked in the following:
1.) Suppress DidStopLoading messages from child frames to simulate a slow-to-load frame:
--- a/content/renderer/render_frame_impl.cc
+++ b/content/renderer/render_frame_impl.cc
@@ -5451,7 +5451,8 @@ void RenderFrameImpl::DidStopLoading() {
SendUpdateFaviconURL(icon_types_mask);
render_view_->FrameDidStopLoading(frame_);
- Send(new FrameHostMsg_DidStopLoading(routing_id_));
+ if (!frame_->Parent())
+ Send(new FrameHostMsg_DidStopLoading(routing_id_));
}
2.) Add logging or breakpoints in WebContentsImpl::LoadingStateChanged right before firing DidStartLoading and DidStopLoading on its observers.
3.) Navigate to any page and use DevTools to insert an OOPIF into the page.
4.) Use Task Manager to kill the top frame process.
5.) Navigate the tab to somewhere else. Observe two DidStartLoading events in a row without a DidStopLoading event between them.
,
Feb 8 2018
As background: Relaxing WCOSanityChecker is always an option. WCOSanityChecker is really about providing some kind of API contract guarantees to the production WCO implementations, so that can be implemented more simply. It's supposed to reflect the invariants that actual WebContentsObservers do rely on in practice. Each frame has a loading state, but my understanding is that the WCO::DidStartLoading is about the webcontents-wide notion of loading state (to first approximation, an or-ing together of the frame states). WCO::DidStartLoading ought to really only be sent, then, when the return value of WebContentsImpl::IsLoading() transitions from false to true. I think that's the underlying motivation of the check in WCOSanityChecker. My gut instinct is that the per-frame DidStopLoading message should not be suppressed; we might instead just update WebContentsImpl::LoadingStateChanged() to filter out the notification when a second frame starts loading but the tab is already in the loading state. If we change web_contents_impl, we should also do a quick audit of the production implementations of WebContentsObserver, to see if there are places that strongly assume that DidStartLoading/DidStopLoading don't arrive unbalanced (because these might have bugs), or those that expect them to arrive for each loading frame...
,
Feb 9 2018
FWIW, we're arthursonzogni is also working on improvements to DidStart/StopLoading, in particular to reduce the number of cases where we get it wrong (issue 809945).
,
Mar 6 2018
,
Dec 19
Is there a plan to fix this? I'm concerned this may be the root cause for random flaky tests. While sheriffing today, I saw a couple tests that both failed due to a failure in web_contents_observer_sanity_checker.cc. One is an instance of bug 915360 with ForceMaximizeOnFirstRunTest, and the other is with PasswordManagerBrowserTest here: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20CFI/11992
,
Dec 19
FWIW, my CL was reverted for the PasswordManagerBrowserTest flakiness (https://bugs.chromium.org/p/chromium/issues/detail?id=916413) and was probably the "culprit". This test does *NOT* use OOPIFs. I did change the timing of form submissions, though, and I suspect this exposed a race condition. I haven't been able to reproduce locally, though, so all I have to go off of is the stack trace here: https://chromium-swarm.appspot.com/task?id=41de9fdc93137310&refresh=10&show_raw=1
,
Jan 9
clamy, should this get duped into issue 466089? |
|||
►
Sign in to add a comment |
|||
Comment 1 by est...@chromium.org
, Feb 8 2018