New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 803367 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Chrome , Mac
Pri: 2
Type: Bug

Blocking:
issue 787576



Sign in to add a comment

Site Isolation: Unexpected process sharing for main frames visited after OOPIFs

Project Member Reported by creis@chromium.org, Jan 18 2018

Issue description

Chrome Version: 65.0.3320.0
OS: Linux

What steps will reproduce the problem?
(1) Start Chrome with --site-per-process
(2) Visit https://www.chromium.org in one tab.
(3) In a second tab, visit https://csreis.github.io/tests/cross-site-iframe.html
(4) Click "Go cross-site (complex page)" to load https://ci.chromium.org into the subframe.
(5) Navigate that tab to https://www.chromium.org.

What is the expected result?

The second chromium.org tab should be in a different process than the first, since the tabs are in different BrowsingInstances.

What happens instead?

Both chromium.org tabs are in the same process.

This happens because the subframe in step 4 aggressively looks for an existing same-site process to use, which causes future visits to that site in the same BrowsingInstance to also use that process (even in the main frame).

This bug is similar to  issue 803366  but is not fixed in Chrome 64.  It's the more general case (not involving the remote NTP or signing into Google).  It's unclear if it will be as common, but it's still a concern.

We should consider whether there are options for how to avoid this undesirable sharing.
 
Some initial thoughts on fixing this.

If the main frame in step (5) needs to go into a new process while staying in the existing BrowsingInstance for the second tab, it presumably needs a new https://chromium.org SiteInstance, which has to be different from the subframe's https://chromium.org SiteInstance.  That will break the longstanding assumption that for a given site, there's only one SiteInstance in a BrowsingInstance: there could now be two chromium.org SiteInstances for an arbitrarily long time if step (5) is slow to commit, and we can't blow away the subframe earlier because step (5) might not commit at all.  We've actually been thinking about breaking this assumption anyway for other reasons, such as SafeBrowsing isolation from issue 776870 or supporting changes to site isolation policies at runtime.  But regardless, this seems like a very non-trivial change.

Alternatively, we could start swapping BrowsingInstances on all browser-initiated main frame navigations.  That would be a cleaner fix, and something we've discussed doing in the past, but it's also a large risky change to the process model and might have other fallout that we don't know about.  I can give it a try and see what breaks, so we're better informed about what this option entails.

I can't think of a simpler workaround that we'd be able to get into M65 before branch cut.  Any suggestions/thoughts?

Comment 2 by creis@chromium.org, Jan 18 2018

Cc: ajwong@chromium.org
Yes, I was just thinking about the plan to switch BrowsingInstances on browser-initiated main frame navigations as well (possibly starting with the case that there aren't other tabs in the process).  That seems worth exploring for M66.  I agree that there probably aren't any easy fixes for this issue for M65.

Two other things we might consider as well:

First, ajwong@ pointed out that we could try to do some better load balancing by looking for memory or CPU clues that a process shouldn't be reused.  This would help avoid putting OOPIFs and subsequent main frames in the same tab into heavy processes (e.g., Gmail).  ajwong@ had a few ideas for metrics we might consider, which hopefully he can post here.

Second, there's a chance we might be able to avoid this process sharing when it's due to the SiteInstance being kept alive only by a NavigationEntry (even without BrowsingInstance swaps).  In general, we can't change the process of a SiteInstance when it has RenderFrameHosts, because that would invalidate their routing IDs and grants in ChildProcessSecurityPolicy.  When the SiteInstance has no active frames, though, it might be possible to forget the RenderProcessHost it was associated with and select a new one when you come back.  I haven't investigated this to make sure it's possible, but it's worth looking at.  (Note that this would help in the case that you go from A(B) to C to B, but it probably wouldn't help in the case that you go from A(B) to B.)

Comment 3 by ajwong@chromium.org, Jan 18 2018

Yes... if think of the backing process as effectively the Browser process equivalent of a CPU core or a server behind a load-balance, then the process assignment process is sorta equiv load balancing with permanent affinity.

In that world, you could try to model each process with a "health" or "load" metric as opposed to current blind method which I'm guessing is effectively round-robin...

This doesn't directly address the specific unexpected process sharing problem, but it might solve the class of load-distribution problems that can be introduced.

Some simple-to-write metrics that could be used to estimate process "health"

(1) PrivateMemoryFootprint sampled every X mins:

https://cs.chromium.org/chromium/src/services/resource_coordinator/public/interfaces/memory_instrumentation/memory_instrumentation.mojom?q=PrivateMemoryFootprint&sq=package:chromium&dr=C&l=112

This is the full memory footprint usage of the process.


(2) Number Of IPCs/sec over last X secs
If you add a short stats counter into the main IPC channel for the process, you can proxy measure how "chatty" the process is which would be a proxy measurement for CPU load.


(3) "healthcheck IPC" latency over last X seconds
Instrument a "healthcheck" IPC (I think one of these already exists?). Browser sends ping to be serviced by each of the key message loops in the renderer. Renderer just responds with an ack. Browser side measures time between send/response and keeps running sample-set + average.  Let you measure responsiveness. Yet another proxy for CPU.


Tempting metrics that I may have pitfalls:

(a) Process CPU usage over last X seconds
The semantics here are clear, it may not be cheaply measurable on each platform.

(b) Resident memory/Working Set per process
This was the basis of the various Memory.Total2 metrics. It unfortunately has the behavior of possibly shrinking as system load, or if the system decides to penalize an out-of-control renderer. While PrivateMemoryFootprint may over-estimate total, it's a more stable metric.

(c) Page-fault rate per process.
With memory compression, this becomes harder to understand. Also, it might not be easily measurable on all platforms.




Yes, having hints for when a process shouldn't be reused would be very handy. Lukasz, Nasko, and I were just chatting about something similar - another (potentially shorter-term) idea is to tweak the subframe consolidation logic so it won't try to reuse processes for OOPIFs so aggresively.  E.g., if there's just one tab with google.com, a second tab with a google.com subframe can afford to create a new process for it, but it should start reusing existing processes as you get closer to process limit, or if there's more CPU/memory pressure.  That would help avoid the NTP-like case of potentially converging on process-per-site behavior for frequently-used and frequently-embedded sites.

I think we're all agreeing that swapping BIs on browser-initiated main frame navigations is the right fix and desirable in general, so I'll start exploring it as well.
Blocking: 787576
FWIW, I think that forcing a BrowsingInstance swap on a browser-initiated, cross-site navigation would go a long way toward avoiding breaking tests impacted by SiteInstance reuse needed in BrowsingInstance::GetSiteInstanceForURL for fixing isolated(b(c),d(c)) scenario in issue 787576.
Status: Started (was: Available)
I'm working on a fix for this by forcing a BrowsingInstance swap on (at least some) browser-initiated, cross-site navigations.
Project Member

Comment 7 by bugdroid1@chromium.org, Feb 8 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a308c9bcd4b54858c8404a1a681d81f1450d92f1

commit a308c9bcd4b54858c8404a1a681d81f1450d92f1
Author: Alex Moshchuk <alexmos@chromium.org>
Date: Thu Feb 08 20:58:14 2018

Allow file URLs with same path but different hash to be treated as same-site.

Bug:  803367 
Change-Id: I7fa3dc54b51c9f8f90c9c9ff772ca5606e227444
Reviewed-on: https://chromium-review.googlesource.com/900309
Reviewed-by: Charlie Reis <creis@chromium.org>
Commit-Queue: Alex Moshchuk <alexmos@chromium.org>
Cr-Commit-Position: refs/heads/master@{#535507}
[modify] https://crrev.com/a308c9bcd4b54858c8404a1a681d81f1450d92f1/content/browser/site_instance_impl.cc
[modify] https://crrev.com/a308c9bcd4b54858c8404a1a681d81f1450d92f1/content/browser/site_instance_impl_unittest.cc

Project Member

Comment 8 by bugdroid1@chromium.org, Mar 3 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7e26ecaa672c0786362996119f0528f3f07bdb58

commit 7e26ecaa672c0786362996119f0528f3f07bdb58
Author: Alex Moshchuk <alexmos@chromium.org>
Date: Sat Mar 03 01:34:29 2018

Swap BrowsingInstances on certain browser-initiated navigations.

After this CL, certain cross-site, main frame, browser-initiated
navigations will use a new BrowsingInstance.  This includes
navigations initiated from the address bar (typing in the URL or
searching) and bookmarks.

Note that this works even if the navigated frame has an opener, since
these user-initiated navigations indicate that it's ok to break
the opener relationship.

This change helps avoid unneeded process sharing.  For example, after
being on a A-embed-B page, any future browser-initiated navigation to
B would've stayed in the same BrowsingInstance, SiteInstance, and
process as the B subframe on the old page, and if the subframe reused
another tab's process due OOPIF process reuse policy, this resulted in
both tabs unnecessarily ending up in that process.

Bug:  803367 
Change-Id: I3a2461a67cb626329c20a887f993d4c9a1494236
Reviewed-on: https://chromium-review.googlesource.com/881836
Commit-Queue: Alex Moshchuk <alexmos@chromium.org>
Reviewed-by: Charlie Reis <creis@chromium.org>
Reviewed-by: Devlin <rdevlin.cronin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#540709}
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/chrome/browser/extensions/app_process_apitest.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/browser/frame_host/render_frame_host_manager.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/browser/frame_host/render_frame_host_manager.h
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/browser/frame_host/render_frame_host_manager_browsertest.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/browser/frame_host/render_frame_host_manager_unittest.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/browser/isolated_origin_browsertest.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/browser/site_per_process_browsertest.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/browser/top_document_isolation_browsertest.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/browser/web_contents/web_contents_impl_unittest.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/public/test/content_browser_test_utils.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/public/test/content_browser_test_utils.h
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/public/test/test_frame_navigation_observer.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/public/test/test_frame_navigation_observer.h
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/shell/browser/layout_test/blink_test_controller.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/shell/browser/shell.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/shell/browser/shell.h
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/test/content_browser_test_utils_internal.cc
[modify] https://crrev.com/7e26ecaa672c0786362996119f0528f3f07bdb58/content/test/content_browser_test_utils_internal.h

Status: Fixed (was: Started)
I verified that this is fixed in Mac canary 67.0.3362.0, which includes the fix in c#8.  In the repro, the second tab now ends up in a separate process when it's navigated to chromium.org.

Sign in to add a comment