OOPIF: Hangouts doesn't reload after subframe crash |
|||||
Issue descriptionChrome Version: 56.0.2924.59, 57.0.2979.0 OS: Windows, Mac, Linux, Chrome What steps will reproduce the problem? (1) Start Chrome in --isolate-extensions mode (or in a SiteIsolationExtensions/Enabled field trial). (2) Install the Google Hangouts extension (https://chrome.google.com/webstore/detail/google-hangouts/nckgahadagoaajjgafhacjanaoiihapd) (3) Open the Hangouts window and sign in. (4) Using Chrome's Task Manager, kill the subframe process for the Hangouts google.com OOPIF. You should see a sad frame in the Hangouts window. (5) Close and reopen Hangouts, or reload the window. What is the expected result? The Hangouts iframes should reload and work properly. What happens instead? The Hangouts window shows the app icon (splash screen) and doesn't load the chat roster iframe, and a new tab opens to about:blank. This seems to continue with repeated attempts to reopen the Hangouts window, whether you close the about:blank tab or not. It goes away if you disable and re-enable the extension. Alternatively, Hangouts will reload itself after 30 seconds or so, and it will start working again if you've closed the about:blank tab by then. Could be similar to issue 568357 ?
,
Jan 18 2017
We've considered a few ways to avoid creating the new about:blank tab, but none of them really help Hangouts recover gracefully, and they all add some ugly complexity: 1) Reloading all frames in a recreated process is a bad choice, since it will likely lead to a crash loop in many cases (e.g., OOM). 2) Creating an empty RenderFrame for each frame in a recreated process isn't great, since it would likely interfere with the sad frame logic and might be confusing to scripts (since the frame starts at about:blank). 3) Creating a RenderFrameProxy might be one way to have a placeholder for named frames that haven't been created yet, but this gets very confusing in the browser process, where we already have a non-live RenderFrameHost and may now also need a RenderFrameProxyHost. This will likely interfere with swapping logic. 4) Keep a list of known frame names in a process that haven't had frames recreated yet, and somehow return null from window.open rather than opening a new tab. That's an unexpected return value from window.open, though. None of those really help with getting Hangouts to reload its background page-- they would just avoid creating an extra about:blank tab. It seems like a better option would be getting the background page to recover if it loses an OOPIF. Options: 5) Automatically reload an extension's background page if one of its OOPIFs crash. This wouldn't help the general window.open issue, but it would avoid it in the Hangouts case, and the background page is indeed a different case than most tabs, where users can see the page and reload it manually. However, the auto-reload might cause a crash loop in some cases, so we'd have to consider throttling it. 6) Show the "This extension has crashed" bubble if the extension's background page has an OOPIF crash. This lets the user manually recover, just as they would today if the whole extension crashes. This is probably the safest thing for Chrome to do, without requiring extension developers to know about the new partial failure mode.
,
Jan 18 2017
Devlin, what do you think about option 6 in comment #2? Should we show the "extension has crashed" bubble if an extension's background page loses an OOPIF to a crash, and then reload the extension if so? That causes a bit of fate sharing between the extension and any subframe processes it depends on, but it does give the user a nice way to recover from this case that the extension is stuck because of a broken subframe in a background page.
,
Jan 20 2017
It's a little weird to say "Extension xyz has crashed" when that extension is still running... and then to reload it, we first unload it? I'd almost feel better just killing the extension if one of the frame dies to make the messaging and implementation more direct. But that might be a little harsh... Out of curiosity, do we really want to optimize for this case? I can't think of another time when we try to proactively help the user seamlessly recover from a crashed renderer; do we want extensions to be different? This doesn't really strike me as that much different than a tab experiencing a subframe crashing (since there's no guarantee that a subframe in a tab is visible to the user), and so I'm hesitant to treat them radically differently.
,
Jan 20 2017
I'm pretty sure we don't want to kill the extension process if one of its subframes crashes, since that could get really bad with --top-document-isolation. (Any subframe crash in any tab could then take down a bunch of your extensions.) The reason I think it's worth being proactive here is that you end up in kind of a stuck state when this happens. Normally closing and re-opening something like the Hangouts panel is sufficient to bring it back, but that doesn't work if the background page is missing its OOPIF. (Hangouts does seem to have some logic that reloads the background page after a long timeout, but it doesn't seem to work very well if this about:blank tab has opened.) It's not obvious to the user that they have to disable and re-enable the extension to get it working again. Maybe we can expect extension authors to detect this case and reload on their own, but it seems unlikely that they'll think about it. They haven't had to worry about subframes going away before. What do you think about the auto-reload idea (e.g., if we only do it once)? Is there precedent for that, or is it too risky to get various extension pages out of sync with each other?
,
Jan 20 2017
We don't really ever just reload a single extension page, and it's not something that's easy. Persistent background pages never expect to go away suddenly, and could lose state, and even event pages and the like could be in the middle of messaging, in which case we wouldn't normally shut them down. I don't think autoreloading necessarily makes the situation any better since it causes its own can of worms, so I'd be inclined to say the right solution here is: - We do our best to never crash frames. :) - Extension authors that are very paranoid about it can do some of their own detection and handle it or reload themselves gracefully.
,
Jan 20 2017
I personally like the suggestion by nick@ to use the existing notification for a crashed extension and let the user reload it if they chose to. It is the equivalent of when a subframe crashes in a tab - the user will have the choice to hit the refresh button. If the user choses to reload, we can just do the equivalent of disable/enable of the extension and it should behave in a way we already support and extensions should already be handling correctly.
,
Jan 20 2017
Comment 6: Yes, we can try to wait and see if it's an issue. This was mainly an attempt to be proactive and find a more graceful way to recover.
Maybe we can revisit if it proves to be a problem for users. At that point, we could consider something like a variation of the crashed extension bubble saying something along the lines of "A subframe of extension XYZ has crashed. Click this balloon to reload the {extension, subframe}."
Should we set this aside for now?
,
Jan 21 2017
@7, my concern with showing the bubble is that it's not really correct. The extension didn't crash - it's still running. It's active and in the enabled set of the registry, and won't have the same UI treatment as other terminated extensions in other places (like the chrome://extensions page). Additionally, since the extension is still running, we don't really know how vital that frame was, and if it's in the background page, the user has no inclination. If the extension isn't expecting to be terminated, it could cause more harm than good. With a tab, the user can make a judgment call (full tab crash? Reload. Random ad iframe? Probably not). I'm fine setting this aside for now - I'm reluctant to add assistive code that might end up being detrimental, and (hopefully) crashed frames are rare.
,
Jan 23 2017
Ok. For the record, any of the following workarounds will help for the time being: 1) Disable/re-enable the extension. (This gets it back into a clean state.) 2) Close the about:blank tab and wait 30 seconds for Hangouts to refresh itself. (Usually works, not always.) 3) Restart Chrome. (Overkill, but it works.) |
|||||
►
Sign in to add a comment |
|||||
Comment 1 by creis@chromium.org
, Jan 18 2017Ok, this one is a bit tricky. I'll explain what we know so far about the about:blank behavior, and then follow up with thoughts about how to deal with the bug. The Hangouts extension has a background page with a web OOPIF, and it also has a visible panel page with a web OOPIF. The two web OOPIFs script each other, and the latter finds the former using window.open("", "gtn-roster-iframe-id"). When the subframe process crashes, both OOPIFs go away. If you close and reopen Hangouts (which users are likely to do if they see a sad frame), we reload the visible panel page and its web OOPIF. That tries to find the background page's OOPIF by name ("gtn-roster-iframe-id") with window.open, but no such RenderFrame or RenderFrameProxy exists in the newly recreated subframe process. When window.open can't find an existing window, it opens a new tab to about:blank, which explains the behavior described above. This is possible with --site-per-process mode as well, independent of extensions. Repro steps: 0) Start Chrome in --site-per-process. 1) Visit http://csreis.github.io/tests/window-open.html 2) Add a named cross-site iframe in DevTools: f = document.createElement("iframe"); f.name = "foo"; f.src = "https://csreis.github.io"; document.body.appendChild(f); 3) Click "Open simple window" 4) In the new tab, add a named cross-site iframe in DevTools, using the same code as above but naming it "bar" instead of "foo." 5) Open a DevTools instance for the OOPIF in the new tab and run "w = window.open('', 'foo');". This returns an existing window reference to the OOPIF in the first tab and doesn't create a new tab. 6) Kill the OOPIFs' process in Chrome's Task Manager. 7) In the new tab, reload the OOPIF: "f.src = 'https://csreis.github.io';" 8) Repeat step 5, while the OOPIF in the first tab is still a sad frame. We expect the window.open to return an existing window reference, but instead it opens a new tab to about:blank (since that RenderFrame hasn't been recreated and thus no name is found). At the same time, the about:blank tab behavior is possible without any OOPIFs at all, since window.open is fundamentally a bad API that opens a new window if it can't find the existing one. Just closing the desired tab before the window.open call will cause it. You can also see it in Hangouts without OOPIFs (though this one is unlikely in practice): (1) Start Chrome in default mode. (2) Install the Google Hangouts extension (https://chrome.google.com/webstore/detail/google-hangouts/nckgahadagoaajjgafhacjanaoiihapd) (3) Open the Hangouts window and sign in. (4) Go to chrome://inspect and click "Inspect" under "Google Hangouts." (5) Find the "gtn-roster-iframe-id" in DevTools and navigate it to about:blank. (6) Close and reopen Hangouts, or reload the window. As before, you'll get an about:blank tab instead of a functional Hangouts panel. Ultimately, I think it's hard to do much about the about:blank tab behavior from window.open, and we should focus on how to help Hangouts gracefully recover from this situation. We basically need to get the background page to refresh itself, and making the user disable and re-enable the extension is an awkward workaround. More thoughts on this in the next comment.