IsolatedOriginTest.ProcessLimit intermittently fails to reuse a process (Linux Tests Builder) |
|||||||
Issue descriptionBuilder: https://build.chromium.org/p/chromium.linux/builders/Linux%20Tests First build failed: https://build.chromium.org/p/chromium.linux/builders/Linux%20Tests/builds/62278 Sample Failure link: https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.linux%2FLinux_Tests%2F62278%2F%2B%2Frecipes%2Fsteps%2Fsite_per_process_content_browsertests%2F0%2Flogs%2FIsolatedOriginTest.ProcessLimit%2F0
,
Sep 15 2017
So far I am not able to repro locally, but the failing test assertion seems to indicate that a process is not getting reused: // Navigate iframe back to isolated origin. See that it reuses the // |new_shell| process. NavigateIframeToURL(web_contents(), "test_iframe", isolated_foo_url); EXPECT_NE(foo_process, child->current_frame_host()->GetProcess()); EXPECT_EQ(isolated_foo_process, child->current_frame_host()->GetProcess()); <- THIS FAILS
,
Sep 15 2017
+kenrb@ who had better luck than me with reproing recent OOPIF-related regressions in layout tests - maybe the luck will also apply to this failure I've talked with alexmos@ and I'll put together a CL that disables the test on Linux - this should get this failure off of sheriff's dashboard and at the same time keep the coverage via Windows/Mac/ChromeOS.
,
Sep 15 2017
I have locally verified that reverting r502251 fixes the layout test failures. Apparently this is an issue with HTTP layout tests with both PlzNavigate and --site-per-process turned on.
,
Sep 15 2017
Oh, to clarify, I was looking specifically at the failures on the Site Isolation bot, which I presume are related: https://build.chromium.org/p/chromium.fyi/builders/Site%20Isolation%20Linux/builds/20859
,
Sep 15 2017
This flakiness also appears to be specific to just site_per_process_content_browsertests. I also see the test being flaky on linux_site_isolation. So let's disable it for --site-per-process mode and investigate? https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=site_per_process_content_browsertests&tests=IsolatedOriginTest.ProcessLimit vs https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=content_browsertests&tests=IsolatedOriginTest.ProcessLimit
,
Sep 15 2017
Thanks for catching this Alex (and helping narrow down the scenarios in which the test will be disabled). I note that the flakiness dashboard only shows failures on Linux bots, but this is most likely an artifact - CQ simply doesn't run tests with --site-per-process outside of the linux bots. CL for disabling the test when --site-per-process is on: https://chromium-review.googlesource.com/c/chromium/src/+/669037
,
Sep 15 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5f9683f8168038b8b6894169cc0bda3e59461e97 commit 5f9683f8168038b8b6894169cc0bda3e59461e97 Author: Lukasz Anforowicz <lukasza@chromium.org> Date: Fri Sep 15 19:55:45 2017 Disable IsolatedOriginTest.ProcessLimit test if --site-per-process is on Bug: 765711 Change-Id: Ia9068647f95847b2884c3488967cd3da5c96c58f Reviewed-on: https://chromium-review.googlesource.com/669037 Reviewed-by: Alex Moshchuk <alexmos@chromium.org> Commit-Queue: Łukasz Anforowicz <lukasza@chromium.org> Cr-Commit-Position: refs/heads/master@{#502335} [modify] https://crrev.com/5f9683f8168038b8b6894169cc0bda3e59461e97/content/browser/isolated_origin_browsertest.cc
,
Sep 15 2017
Removing Sheriff-Chromium as the test is disabled.
,
Sep 15 2017
,
Oct 5 2017
I also couldn't repro this locally, but here's one theory on why this could be happening. There's a site-per-process-specific behavior quirk that's described a little higher up in the test, which causes the isolated.foo.com navigation in the second tab to create a new process when run with --site-per-process: // TODO(alexmos): with --site-per-process, this won't currently reuse the // subframe process, because the new SiteInstance will initialize its // process while it still has no site (during CreateBrowser()), and since // dedicated processes can't currently be reused for a SiteInstance with no // site, this creates a new process. The subsequent navigation to // |isolated_foo_url| stays in that new process without consulting whether it // can now reuse a different process. This should be fixed; see // https://crbug.com/513036 . Without --site-per-process, this works because // the site-less SiteInstance is allowed to reuse the first tab's foo.com // process (which isn't dedicated), and then it swaps to the isolated.foo.com // process during navigation. if (!AreAllSitesIsolatedForTesting()) EXPECT_EQ(child->current_frame_host()->GetProcess(), isolated_foo_process); This basically means that at this point, with --site-per-process, we might have two isolated origin processes. When we navigate the subframe in the first tab away from isolated.foo.com [1], the first of those processes should be destroyed, but maybe it's not destroyed in time (if the pending delete RFH is still around) before we navigate that subframe back to isolated.foo.com in [2]. Then, we will have two isolated.foo.com processes around, and might pick the unexpected one for process reuse. If that's the case, this is a race in the test and not a new bug in the process reuse logic. [1] https://cs.chromium.org/chromium/src/content/browser/isolated_origin_browsertest.cc?l=429&rcl=40ab7f76c55635190680c68a93e727b3684e1d37 [2] https://cs.chromium.org/chromium/src/content/browser/isolated_origin_browsertest.cc?l=437&rcl=40ab7f76c55635190680c68a93e727b3684e1d37
,
Oct 5 2017
Oh, I now have a local repro (I think I might've forgotten to reenable the test for --site-per-process before running, duh) and confirmed that the subframe's old isolated.foo.com process is getting reused, instead of the new_shell isolated.foo.com process. So this is indeed a race in the test. Fix on the way.
,
Oct 5 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/fadadd12dec0f327ca792cc996c248eb0222c676 commit fadadd12dec0f327ca792cc996c248eb0222c676 Author: Alex Moshchuk <alexmos@chromium.org> Date: Thu Oct 05 16:00:55 2017 Fix race with --site-per-process in IsolatedOriginTest.ProcessLimit See https://crbug.com/765711#c12 for the description of the race. tldr is that with --site-per-process, we might have two isolated.foo.com processes eligible for reuse if one of them doesn't die in time, and we pick the one that the test doesn't expect for navigating the subframe back to isolated.foo.com. The fix is to wait for the subframe's old isolated.foo.com RFH to be destroyed before navigating it back to isolated.foo.com. This ensures there's only one process that can be reused. Bug: 765711 Change-Id: I23740e1f4e1aa7882f977300a5000630d49d9efe Reviewed-on: https://chromium-review.googlesource.com/701289 Reviewed-by: Łukasz Anforowicz <lukasza@chromium.org> Commit-Queue: Alex Moshchuk <alexmos@chromium.org> Cr-Commit-Position: refs/heads/master@{#506743} [modify] https://crrev.com/fadadd12dec0f327ca792cc996c248eb0222c676/content/browser/isolated_origin_browsertest.cc
,
Oct 5 2017
The test has been all green since being fixed and re-enabled for --site-per-process in r506743: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=site_per_process_content_browsertests&tests=IsolatedOriginTest.ProcessLimit |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by kelv...@chromium.org
, Sep 15 2017