New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Many navigation related browser tests are flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, Aug 3

Issue description

"ContentBrowserTest.HistoryBackInUnloadCancelsReload" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 3 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyPgsSBUZsYWtlIjNDb250ZW50QnJvd3NlclRlc3QuSGlzdG9yeUJhY2tJblVubG9hZENhbmNlbHNSZWxvYWQM.

Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
 
This is failing along with a bunch of other navigation related browser tests.
Cc: creis@chromium.org nasko@chromium.org
Owner: palmer@chromium.org
Status: Assigned (was: Untriaged)
Summary: Some navigation related content_browsertests and viz_content_browsertests are timing out (was: "ContentBrowserTest.HistoryBackInUnloadCancelsReload" is flaky)
Unclear what's common in these tests, but I keep seeing this warning in failed tests before the timeout:

1004:4628:0806/064524.741:INFO:CONSOLE(1)] "Throttling history state changes to prevent the browser from hanging.", source:  (1)

Chris, any chance this is relevant?

See:

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/54103
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win7%20Tests%20%28dbg%29%281%29/70720

Example failures:

NavigationControllerBrowserTest.ReplacedNavigationEntryData_PushState
NavigationControllerBrowserTest.OtherCommitDuringPendingEntryWithReplacement
ManifestBrowserTest.PushStateNavigation
NavigationHandleImplBrowserTest.VerifyRendererInitiated
NavigationHandleImplBrowserTest.VerifySameDocument
NavigationControllerBrowserTest.OriginChangeAfterDocumentWrite
RenderFrameHostManagerTest.EnsureUniversalAccessFromFileSchemeSucceeds
NavigationControllerBrowserTest.NavigationTypeClassification_NewPage
NavigationControllerBrowserTest.ForwardRedirectWithNoCommittedEntry
NavigationControllerBrowserTest.ReplacedNavigationEntryData_BackAfterReplaceStateWithRedirect
NavigationControllerBrowserTest.FrameNavigationEntry_BackWithRedirect
NavigationControllerBrowserTest.EnsureSamePageNavigationUpdatesFrameNavigationEntry
NavigationControllerBrowserTest.ConsecutiveReloadMetrics
NavigationControllerBrowserTest.ReplacedNavigationEntryData_ReplaceState
NavigationControllerBrowserTest.NavigationTypeClassification_On1SameDocumentTo2While2Pending
NavigationControllerBrowserTest.SiteInstanceChangeOnHistoryNavigation
NavigationControllerBrowserTest.NavigationTypeClassification_On1SameDocumentToXWhile1Pending
RenderFrameHostManagerTest.UnloadPushStateOnCrossProcessNavigation
NavigationControllerBrowserTest.FrameNavigationEntry_BackNestedAutoSubframe
SitePerProcessBrowserTest.SameDocumentNavigationDoesNotCommitPendingFramePolicy
NavigationControllerBrowserTest.SameDocumentNavigationDoesntDeleteNavigationHandle
WebContentsImplBrowserTest.LoadingStateChangedForSameDocumentNavigation
BrowserSideNavigationBrowserTest.IframeAndPushStateSimultaneously
NavigationControllerBrowserTest.NavigationTypeClassification_NewAndAutoSubframe
NavigationControllerBrowserTest.NavigationTypeClassification_ExistingPage
NavigationControllerBrowserTest.BackToAboutBlankIframe
NavigationControllerBrowserTest.SubframeForwardRedirect
SitePerProcessBrowserTest.ReplaceStateDoesNotCancelCrossSiteNavigation

NavigationControllerBrowserTest.NavigationTypeClassification_NewPage
_/BlobUrlBrowserTest.ReplaceStateToAddAuthorityToBlob/1
NavigationHandleImplBrowserTest.VerifyRendererInitiated
NavigationControllerBrowserTest.ForwardRedirectWithNoCommittedEntry
BrowserSideNavigationBrowserTest.IframeAndPushStateSimultaneously
NavigationControllerBrowserTest.EnsureSamePageNavigationUpdatesFrameNavigationEntry
NavigationControllerBrowserTest.ConsecutiveReloadMetrics
NavigationControllerBrowserTest.NavigationTypeClassification_ExistingPage
NavigationHandleImplBrowserTest.VerifySameDocument
NavigationControllerBrowserTest.SubframeBackFromReplaceState
ContentBrowserTest.HistoryBackInUnloadCancelsReload
SitePerProcessBrowserTest.SameDocumentNavigationDoesNotCommitPendingFramePolicy
Cc: roc...@chromium.org jam@chromium.org
Labels: OS-Android OS-Chrome OS-Fuchsia OS-Linux OS-Mac OS-Windows
Well, we recently landed https://chromium-review.googlesource.com/c/chromium/src/+/1161400 to make the throttling *more* generous, which would seem to reduce flakiness, not increase it. But, who knows. Do you experience the flakiness before that CL, or only after it?

We could try increasing `kMaxTokens` (https://chromium-review.googlesource.com/c/chromium/src/+/1161400/3/third_party/blink/renderer/core/frame/history.cc#160) as johnme suggested (https://bugs.chromium.org/p/chromium/issues/detail?id=786211). But that increases the risk of DoS somewhat.

Honestly, this throttling code is a disgusting hack and always has been. (I'm allowed to say that since I wrote it.) I'm not against removing it entirely, if we have a better solution. rockot is going to do some kind of IPC scheduling magic to prioritize the UI thread, as I understand it, which I think is supposed to be a more general solution to the problem of DoS via IPC flooding.
FWIW, I can't reproduce the flakiness locally with

./out/Default/content_browsertests --gtest_filter="NavigationControllerBrowserTest.*"

Is there a better invocation?
Cc: jonr...@chromium.org
Massive data races in these tests could be related, see  issue 847326 .
Detected 4 new flakes for test/step "ContentBrowserTest.HistoryBackInUnloadCancelsReload". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyPgsSBUZsYWtlIjNDb250ZW50QnJvd3NlclRlc3QuSGlzdG9yeUJhY2tJblVubG9hZENhbmNlbHNSZWxvYWQM. This message was posted automatically by the chromium-try-flakes app.
Cc: alex...@chromium.org iclell...@chromium.org
 Issue 872612  has been merged into this issue.
Cc: nick@chromium.org palmer@chromium.org cduvall@chromium.org
 Issue 872492  has been merged into this issue.
Status: Started (was: Assigned)
Attempting a further softening of the throttling policy in https://chromium-review.googlesource.com/c/chromium/src/+/1171825. I don't have mega confidence that it really solves the problem, of course.

I tend to think the data races are the real root cause, but this isn't my area of code familiarity so I could be wrong.
Summary: Many navigation related content_browsertests and viz_content_browsertests are flaky (was: Some navigation related content_browsertests and viz_content_browsertests are timing out)
Retitling, since not all tests are timeouts
 Issue 873363  has been merged into this issue.
@3: There's a whole ton of different kinds of tests that all seem to start failing early on August 3 (UTC), shortly after https://chromium-review.googlesource.com/c/chromium/src/+/1161400 landed.

While that change was intended to make throttling more generous, it seems plausible it had some sort of side effect tests are detecting.  Maybe there's a bug in the change of some sort.  Would you be willing to try just reverting it (and maybe un-disabling the whole tree of duped-in disabled tests) to see if the flakiness disappears?
Summary: Many navigation related browser tests are flaky (was: Many navigation related content_browsertests and viz_content_browsertests are flaky)
 Issue 873364  has been merged into this issue.
Cc: mea...@chromium.org f...@chromium.org
 Issue 872497  has been merged into this issue.
 Issue 872491  has been merged into this issue.
Labels: -Sheriff-Chromium
Assigned and active -> removing from sheriff queue
Project Member

Comment 19 by bugdroid1@chromium.org, Aug 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8e4127dd506c501cd3f51d91dc22a3dfa8f5f1ca

commit 8e4127dd506c501cd3f51d91dc22a3dfa8f5f1ca
Author: Chris Palmer <palmer@chromium.org>
Date: Sat Aug 11 01:26:55 2018

Change the history state throttle interval from 60 to 120 Hz.

This is a bit more lenient and may enable some callers, including a flaky test,
to get (more of) what they need. The test's root problem seems to be data races
(see  issue 847326 ), however.

Bug:  870861 
Change-Id: I249e9f06fa9a9a9f3b74e53cf2ef3f34b0c2371c
Reviewed-on: https://chromium-review.googlesource.com/1171825
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Commit-Queue: Chris Palmer <palmer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#582415}
[modify] https://crrev.com/8e4127dd506c501cd3f51d91dc22a3dfa8f5f1ca/third_party/blink/renderer/core/frame/history.cc

 Issue 873425  has been merged into this issue.
Cc: mastiz@chromium.org rhalavati@chromium.org tschumann@chromium.org
 Issue 853151  has been merged into this issue.
Also, looking at the Win 7 bot, it doesn't look like r582415 was effective.

I'm going to try rolling back both the history.cc changes involved here and see if it helps.
Project Member

Comment 23 by bugdroid1@chromium.org, Aug 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5823b44aa85c3e803530cc77e9bd7429d523c575

commit 5823b44aa85c3e803530cc77e9bd7429d523c575
Author: Peter Kasting <pkasting@chromium.org>
Date: Sat Aug 11 05:51:46 2018

Revert "Change the history state throttle interval from 60 to 120 Hz."

This reverts commit 8e4127dd506c501cd3f51d91dc22a3dfa8f5f1ca.

Reason for revert: Part 1 of trying to revert the history throttling changes, to see if https://bugs.chromium.org/p/chromium/issues/detail?id=870861 improves.

Original change's description:
> Change the history state throttle interval from 60 to 120 Hz.
> 
> This is a bit more lenient and may enable some callers, including a flaky test,
> to get (more of) what they need. The test's root problem seems to be data races
> (see  issue 847326 ), however.
> 
> Bug:  870861 
> Change-Id: I249e9f06fa9a9a9f3b74e53cf2ef3f34b0c2371c
> Reviewed-on: https://chromium-review.googlesource.com/1171825
> Reviewed-by: Daniel Cheng <dcheng@chromium.org>
> Commit-Queue: Chris Palmer <palmer@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#582415}

TBR=palmer@chromium.org,dcheng@chromium.org,kinuko@chromium.org

Change-Id: I9fb7b5b0f8ebff8b6d6c5a772745cfd2c10ef51a
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  870861 
Reviewed-on: https://chromium-review.googlesource.com/1171708
Reviewed-by: Peter Kasting <pkasting@chromium.org>
Commit-Queue: Peter Kasting <pkasting@chromium.org>
Cr-Commit-Position: refs/heads/master@{#582432}
[modify] https://crrev.com/5823b44aa85c3e803530cc77e9bd7429d523c575/third_party/blink/renderer/core/frame/history.cc

Tentative results so far are that reverting these changes has greened things up.  Going to try un-disabling a lot of the disabled tests that were plausibly caused by this and see if they look greener.
 Issue 873438  has been merged into this issue.
Labels: Sheriff-Chromium
Detected 4 new flakes for test/step "NavigationControllerBrowserTest.BackToAboutBlankIframe". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyQQsSBUZsYWtlIjZOYXZpZ2F0aW9uQ29udHJvbGxlckJyb3dzZXJUZXN0LkJhY2tUb0Fib3V0QmxhbmtJZnJhbWUM. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Detected 4 new flakes for test/step "NavigationControllerBrowserTest.FrameNavigationEntry_BackNestedAutoSubframe". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyVgsSBUZsYWtlIktOYXZpZ2F0aW9uQ29udHJvbGxlckJyb3dzZXJUZXN0LkZyYW1lTmF2aWdhdGlvbkVudHJ5X0JhY2tOZXN0ZWRBdXRvU3ViZnJhbWUM. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
 Issue 873480  has been merged into this issue.
 Issue 873478  has been merged into this issue.
 Issue 873477  has been merged into this issue.
 Issue 873476  has been merged into this issue.
 Issue 873475  has been merged into this issue.
 Issue 873474  has been merged into this issue.
Cc: tdres...@chromium.org sullivan@chromium.org charliea@chromium.org nednguyen@chromium.org
 Issue 872024  has been merged into this issue.
Cc: mgiuca@chromium.org quacht@google.com dpa...@chromium.org dschuyler@chromium.org aboxhall@chromium.org hcarmona@chromium.org
 Issue 871692  has been merged into this issue.
 Issue 873479  has been merged into this issue.
Cc: hubbe@google.com
 Issue 872332  has been merged into this issue.
 Issue 871018  has been merged into this issue.
Status: Fixed (was: Started)
I looked through all dozen or so sets of runs after re-enabling the tests above.  None of them seem to be flaking.

I'm calling this fixed.  It looks like https://chromium-review.googlesource.com/c/chromium/src/+/1161400 was the root cause, and the attempt to loosen to 120 Hz did not help (some bots seemed to get even flakier as a result?).  I don't know what about the new algorithm triggered bad results, but maybe palmer can glean something from the issues here before landing further changes.

Sign in to add a comment