New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 858875 link

Starred by 5 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug

Blocking:
issue 850746



Sign in to add a comment

AMP page and site-isolation do not play well together

Project Member Reported by klo...@chromium.org, Jun 28 2018

Issue description

Load https://www.google.com/amp/s/mobile.nytimes.com/2018/06/28/sports/world-cup/england-vs-belgium.amp.html in Chrome Dev (69.0.3464.0) on Android Pixel 2. Scroll all the way down and up.

Even #enable-site-per-process says "disabled", I do get over 10 processes for this site.

Scrolling gets really sluggish.

Turn screen off and on, I got sad tab with amp frame still visible.

The device is under great memory pressure.

    699,931K: Foreground
        238,728K: com.chrome.dev:privileged_process0 (pid 16855)
        127,272K: com.google.android.apps.nexuslauncher (pid 7044 / activities)
        122,963K: com.android.settings (pid 15952 / activities)
        113,741K: com.chrome.dev (pid 16795 / activities)
         72,436K: com.google.android.gms.persistent (pid 18190)
         24,791K: com.chrome.dev:sandboxed_process33 (pid 19686)
    615,383K: Visible
        108,579K: com.google.android.gms (pid 3172)
         82,062K: com.chrome.dev:sandboxed_process4 (pid 25486)
         60,410K: com.chrome.dev:sandboxed_process20 (pid 25548)
         53,034K: com.google.android.tts (pid 13657)
         35,609K: com.chrome.dev:sandboxed_process13 (pid 25225)
         30,520K: com.chrome.dev:sandboxed_process22 (pid 25600)
         27,993K: com.chrome.dev:sandboxed_process12 (pid 25239)
         24,266K: com.chrome.dev:sandboxed_process2 (pid 26061)
         23,811K: com.google.android.wearable.app:background (pid 3138)
         23,790K: com.google.android.googlequicksearchbox:interactor (pid 26533)
         20,041K: com.chrome.dev:sandboxed_process23 (pid 25683)
         19,288K: com.chrome.dev:sandboxed_process14 (pid 25277)
         17,679K: com.google.android.as (pid 20714)
         17,107K: com.chrome.dev:sandboxed_process25 (pid 25700)
         16,449K: com.chrome.dev:sandboxed_process15 (pid 25939)
         12,986K: com.chrome.dev:sandboxed_process0 (pid 16828)

 

Comment 1 by klo...@chromium.org, Jun 29 2018

Description: Show this description

Comment 2 by creis@chromium.org, Jun 29 2018

> Even #enable-site-per-process says "disabled", I do get over 10 processes for this site.

Can you clarify whether this behavior is with Site Isolation enabled, disabled, or both?  The summary suggests it's enabled, but this line suggests it's disabled.

Note that you can check what mode you're running using chrome://process-internals (on M68 and later).

Comment 3 by klo...@chromium.org, Jun 29 2018

As a reference, Chrome stable doesn't have site isolation. And memory stats like this,

  571,335K: Foreground
        163,738K: com.android.chrome:sandboxed_process0 (pid 6310)
        158,274K: com.android.chrome:privileged_process0 (pid 6340)
        112,733K: com.android.settings (pid 15952 / activities)
         78,584K: com.android.chrome (pid 6279 / activities)
         58,006K: com.google.android.gms.persistent (pid 18190)

Comment 4 by klo...@chromium.org, Jun 29 2018

Re #2, it is on as you can see how many processes created for a single page.

Also confirmed in chrome://process-internals, it says "site per process".

But in chrome://flags, under "Strict site isolation", it says "Disabled". BTW, I didn't change it. I just observe it.

Comment 5 by klo...@chromium.org, Jun 29 2018

Just noticed that there is "Site isolation trial opt-out". And opt out really disabled site isolation. So this may be a chrome flag update bug.

Comment 6 by boliu@chromium.org, Jun 29 2018

Yeah the flags are a bit confusing. chrome://process-internals is the source of truth. #enable-site-per-process is the manual option. But googlers are under a finch experiment (not yet on stable) which can also enable site isolation, which is what I assume you are seeing.

Side note, woh that page is running terribly. I don't think the sluggishness is entirely due to memory pressure. Taking a trace, it looks like the majority of processes are actually producing frames, which is surprising, considering most of the iframes are offscreen most of the time. The sluggishness is just from pure IPC overhead of talking to 10+ processes overwhelming UI and IO threads. Probably worth looking into what's going on

Comment 7 by boliu@chromium.org, Jun 29 2018

Cc: fsam...@chromium.org
+fsamuel on performance on that page with site isolation. Is the IPC overhead expected? Can that be improved

A few other things for me to check (or if anyone already knows):
* what's the compositor memory policy of OOPIFs?
* should we consider throttling/pausing begin frames when iframes are outside the viewport?
Cc: piman@chromium.org danakj@chromium.org enne@chromium.org
This is expected. Currently, we'll keep pumping frames even if the OOPIF is outside the viewport. We can improve this situation, though. Enne has been working with the video in surfaces folks to deal with a similar situation. The solution there as far as I can tell is to stop requesting BeginFrames. The downside there is that'll cause a stutter if we scroll a paused iframe into view.

Alternatively, we can allocate a new SurfaceId when it comes into view, which will block the scroll until some deadline or the OOPIF produces some frame. 

Both approaches have their downsides. The latter can impact scroll performance but the deadline is adjustable. The former should not impact top document scroll but will cause occasional OOPIF stutter.

Comment 9 by boliu@chromium.org, Jun 29 2018

Here's a trace (ignore the gigantic gaps when it's taking a memory infra dump) on pixel 1. One thing to note though, the entire article is an OOPIF as well. I don't know if that matters to any of the questions below.

There is a lot of FrameHostMsg_RouteMessageEvent on the UI thread, ie routing post messages between OOPIFs. Guess can't really move it off to another thread, but at ~0.2ms a piece on the UI thread, that seems way too expensive. Probably worth investigating more.

I don't see as many FrameHostMsg_SynchronizeVisualProperties as remember there was yesterday (I think there were more animating ads yesterday). But I'm kind of curious why would scroll require these updates at all, except from the frame that's scrolling?

Total PSS is 500MB, which on a device with 4GB of memory shouldn't be a huge problem (while chrome is foreground, and with only that tab..)


For trade offs in #8, OOPIF stutter seems to me like the lesser of two evils. Elsewhere, I'm already considering letting android aggressively kill offscreen iframes for memory, so stutter isn't the worse thing :p

Comment 10 by boliu@chromium.org, Jun 29 2018

I thought I attached the trace..
trace_858875.json.gz
5.8 MB Download

Comment 11 by piman@chromium.org, Jun 29 2018

Cc: kenrb@chromium.org
We could consider a skewed viewport skirt for OOPIFs so that if we're scrolling and an OOPIF (or video frame or w/e) is expected to come into view "soon" we could start preemptively pumping BeginFrames to avoid stutter/checkerboarding.

We already have something similar to constrain the memory of OOPIFs by taking into account the embedder's viewport.
+1 to a skew viewport! We currently do that for raster on the main thread IIRC. I think this code needs to live on the compositor thread for both raster and BeginFrame. WDYT?

Also, my recollection is RouteMessageEvent is used for postMessaging viewability. I believe Intersection Observer v2 is meant to address some of this? 

Comment 13 by kenrb@chromium.org, Jun 29 2018

#8: We shouldn't be pumping frames for OOPIFs that are out of the viewport, because they are render throttled when their viewport intersection rect is empty.

#11: That has been discussed before. I had thought there was a already bug somewhere for it but that might be wrong since I can't find any right now. A skewed viewport during scroll would enable the OOPIF to render early, and also create a runway for large OOPIFs to keep ahead of the current viewport extent.

As Fady says, that likely depends on the intersection calculation being moved to the compositor thread because within Blink it is difficult to know when a scroll is in progress.
Blocking: 850746
Who will be the owner for this one?
I think the Viz team should own this but I need to speak some more to gklassen@
Cc: skyos...@chromium.org
fwiw, moving display compositor to GPU does help quite a bit with smoothness here, since that moves the a lot of the compositing work off of UI thread so it's not as affected by UI or browser IO thread being busy. It's "Viz Display Compositor (OOP-D)" in chrome://flags


Also I hacked together a CL to merge UpdateViewportIntersection, SetIsInert, SetInheritedEffectiveTouchAction, and UpdateRenderThrottlingStatus into a single IPC: https://chromium-review.googlesource.com/c/chromium/src/+/1128394

On Android, per-IPC overhead pretty high, and in the trace, those 4 always happen together. So merging them reduces cpu and wall clock time of FrameHostMsg_SynchronizeVisualProperties by quite a bit; probably frees up IO thread a bit too, though not sure how to measure that. fsamuel/kenrb: do you think it's worthwhile doing this?

Site note: is there any documentation on how these 'visual properties' and 'screen rects' concepts work? I want to learn about them for another oopif bug :)
FYI, we're looking into enabling priority-based scheduling on I/O (and later UI) threads ( bug 861708 ) to reduce high latency like this caused by long queueing times. Of course batching and generally limiting the number of IPCs also helps a lot.
@boliu: "VisualProperties" is currently in a state of flux (lots of refactoring for OOPIF performance reasons). Is there anything specific you want to know? For what it's worth, some of this working may already be planned:

From our Viz planning (go/viz-wat-planning)/demo(go/viz-wat-demo)

- Reduce platform-specific code: Move more things to RenderWidgetHostImpl and out of RenderWidgetHostView. This includes removing the notion of LocalSurfaceId from aura and the window service.
- Unify FrameVisualProperties and VisualProperties
- Direct client-to-client channel across renderers for surface synchronization.

The "other oopif bug" is https://bugs.chromium.org/p/chromium/issues/detail?id=855037#c3. Also I find myself only having a very vague idea of the problem all these messages are solving when I was looking at the trace. Anyway, probably should move this discussion to that bug.

Should I clean up and land my CL as a small incremental step, or just abandon it?
Cc: alex...@chromium.org
Just a quick update here.  We need AMP pages to behave well to be able to launch site isolation on Android, but I was curious how much this is still issue if we only isolate a subset of sites.  Unfortunately, the original link no longer leads to an AMP page, so I tried to check another random AMP page from another bug, https://www.google.com/amp/www.foxnews.com/lifestyle/2018/09/05/new-zealand-town-proposes-ban-on-cats-with-pest-plan-this-really-isnt-place-for-cats.amp.html, for issues.  I see google.com hosting all of the content in a ampproject.org iframe, which further embeds and another iframe from ampproject.net and a couple of ad iframes at googlesyndication.com (which themselves have a couple more cross-site iframes).  I hope that's a fairly common setup for AMP.  

With full site isolation, we use six processes and the scrolling is weird/sluggish.  Trying the same out with the current canary trial of isolating a subset of important sites, we use two processes: only the top google.com frame needs isolation, and all the iframes are grouped into just one process.  All of the content still lives in an ampproject.org OOPIF, but just anecdotally, the scrolling feels a lot more responsive.  So, hopefully, this issue won't be as high of a priority for selective site isolation.
A quick update on #19 too: we found that OOP-D[1] resolves most of the input jank that is currently happening because of high IO thread congestion. Since that work is already happening, we shelved the plan for an IO thread scheduler for now. If you want to measure the impact here, try --enable-features=VizDisplayCompositor.

[1] https://docs.google.com/presentation/d/1PfaIDZ5oJTEuAEJR8aj-B9QC-r1Pht_jQXwjifM1jQI/edit#slide=id.p
skyostil@: thanks for the update.  What's the status/timeline for shipping OOP-D for Android?
Cc: vmi...@chromium.org
+vmiura@, I'll defer to you on #24 :)
We just got approval to launch OOP-D in M70. 
Labels: Proj-SiteIsolationAndroid-BlockingLaunch
Just to follow up here - is OOP-D now enabled by default for Android?

Adding a label to indicate that this is important for shipping site isolation on Android.  There may not be actual work remaining here though - we should first confirm if OOP-D combined with the less strict form of site isolation (#c22) resolves this.

Sign in to add a comment