New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 852204 link

Starred by 6 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: 2018-06-18
OS: Linux , Android , Windows , Chrome , Mac
Pri: 1
Type: Bug

Blocking:
issue 851882



Sign in to add a comment

Delay subframe process shutdown when the subframe has an unload handler

Project Member Reported by alex...@chromium.org, Jun 13 2018

Issue description

Filing this to keep track of a short-term fix for termination pings from OOPIF unload handlers.  

Currently, if a page with an out-of-process iframe navigates cross-process, we immediately destroy the old subframe RenderFrameHost at commit time (see FrameTreeNode::ResetForNewProcess()).  The RFH destruction triggers a message to the renderer to delete the RenderFrame, which also causes it to run its unload handler.  However, if the subframe was the last active frame in its process, the process will proceed to shut down without waiting for the unload handler to run at all, and in such a case, the subframe unload handler won't really be able to do anything, including termination pings.

The most comprehensive fix for this is to move the list of child frames from FrameTreeNode to RenderFrameHost, which means that a pending delete RenderFrameHost will keep its children FTNs and RFHs alive.  This will be tracked in issue 609963.

In the short term though, we can fix much of the problem by simply delaying subframe process shutdown in this case (say, by one second to match the existing unload timeout).  This should give termination pings one second to execute, which would make them as reliable as in the same-process subframe case.  Although this would still shut down the subframe RFH immediately, that won't matter for things like navigator.sendBeacon(), which aren't routed via RFH.

More discussion in this doc: https://docs.google.com/document/d/1XiNu0iAnYpxyObsdwStxTNYq83BoPW0lwjG5xbb1l88/edit?usp=sharing
 
Status: Started (was: Assigned)
I have a CL in progress at https://chromium-review.googlesource.com/c/chromium/src/+/1094917.

Comment 2 by creis@chromium.org, Jun 14 2018

Cc: creis@chromium.org
Labels: M-67 M-68 M-69 Target-68
Thanks Alex!  This is something we'll want to target for at least M68 (if not also M67 in a hypothetical respin).  Updating labels.  Hopefully we can get it landed to day to verify tomorrow and bake over the weekend before merge to M68?
Yes, I think it should be ok for M68 and maybe even M67 with sufficient bake time.  I do plan to land the CL today.
Project Member

Comment 4 by bugdroid1@chromium.org, Jun 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1bfc151cd3805a81e6488dd7b4cd65fd08f3584e

commit 1bfc151cd3805a81e6488dd7b4cd65fd08f3584e
Author: Alex Moshchuk <alexmos@chromium.org>
Date: Thu Jun 14 20:49:51 2018

Delay subframe process shutdown to allow subframe unload handlers to execute.

Currently, if a page with an OOPIF navigates cross-process, we
immediately destroy the OOPIF's RenderFrameHost when its parent
commits (see FrameTreeNode::ResetForNewProcess()).  The RFH
destruction triggers a message to the renderer to delete the
RenderFrame, which also causes it to run its unload handler.  However,
if the subframe was the last active frame in its process, the process
will proceed to shut down without waiting for the unload handler to
run at all, and in such a case, the subframe unload handler won't
really be able to do anything, including termination pings.

This CL fixes this by delaying subframe process shutdown in such
situations by one second (which matches the existing unload timeout
that we use for pending delete RenderFrameHosts).

Bug:  852204 , 609963
Change-Id: I142ef4e6f59df465c6ed213df02daf492e989c2d
Reviewed-on: https://chromium-review.googlesource.com/1094917
Reviewed-by: Steven Holte <holte@chromium.org>
Reviewed-by: Łukasz Anforowicz <lukasza@chromium.org>
Reviewed-by: Charlie Reis <creis@chromium.org>
Commit-Queue: Alex Moshchuk <alexmos@chromium.org>
Cr-Commit-Position: refs/heads/master@{#567395}
[modify] https://crrev.com/1bfc151cd3805a81e6488dd7b4cd65fd08f3584e/content/browser/frame_host/render_frame_host_impl.cc
[modify] https://crrev.com/1bfc151cd3805a81e6488dd7b4cd65fd08f3584e/content/browser/frame_host/render_frame_host_impl.h
[modify] https://crrev.com/1bfc151cd3805a81e6488dd7b4cd65fd08f3584e/content/browser/frame_host/render_frame_host_manager_browsertest.cc
[modify] https://crrev.com/1bfc151cd3805a81e6488dd7b4cd65fd08f3584e/content/browser/renderer_host/render_process_host_impl.cc
[modify] https://crrev.com/1bfc151cd3805a81e6488dd7b4cd65fd08f3584e/content/browser/renderer_host/render_process_host_impl.h
[modify] https://crrev.com/1bfc151cd3805a81e6488dd7b4cd65fd08f3584e/content/public/browser/render_process_host.h
[modify] https://crrev.com/1bfc151cd3805a81e6488dd7b4cd65fd08f3584e/tools/metrics/histograms/histograms.xml

Comment 5 by creis@chromium.org, Jun 15 2018

Blocking: 851882
Verified on Mac Canary 69.0.3461.2, which includes r567395, using the following steps:

1. With strict site isolation, go to http://alexmos17.github.io/tests/unload.html

2. Click on "sendBeacon" in the subframe.

3. Run "python -m SimpleHTTPServer 8000" from a terminal

4. Go back to Chrome and navigate the tab away to http://csreis.github.io/

5. Observe that the sendBeacon request shows up in the python HTTP server output.  In an older canary, the request didn't show up.

I'll let it bake for a bit longer and then request a merge to M68.
NextAction: 2018-06-18
The NextAction date has arrived: 2018-06-18
Labels: Merge-Request-68
Requesting merge to M68.  This has been verified on canary (see #6) and has baked there for three days.  I also checked crash reports for 69.0.3461.2+, and nothing obvious related to this change jumped out.  This change fixes an important site isolation issue (broken termination pings from subframe unload handlers) by keeping subframe renderer processes alive for an extra second before shutdown, in cases where unload handlers need to run.
Project Member

Comment 10 by sheriffbot@chromium.org, Jun 18 2018

Labels: -Merge-Request-68 Hotlist-Merge-Review Merge-Review-68
This bug requires manual review: M68 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), kariahda@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Review-68 Merge-Approved-68
Thanks for the fix! Approving merge for M68. Branch:3440
Cc: abdulsyed@chromium.org
Pls merge you change to M68 branch 3440 ASAP so we can pick it up for this week Beta release. Merge has to happen latest by 1:00 PM PT tomorrow, Tuesday (06/19), so we can pick it up for Wednesday Beta release.




Project Member

Comment 13 by bugdroid1@chromium.org, Jun 18 2018

Labels: -merge-approved-68 merge-merged-3440
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/16e6577a8fa81b76b044373c5116f73fc89d09f6

commit 16e6577a8fa81b76b044373c5116f73fc89d09f6
Author: Alex Moshchuk <alexmos@chromium.org>
Date: Mon Jun 18 21:31:26 2018

Delay subframe process shutdown to allow subframe unload handlers to execute. (Merge to M68)

Currently, if a page with an OOPIF navigates cross-process, we
immediately destroy the OOPIF's RenderFrameHost when its parent
commits (see FrameTreeNode::ResetForNewProcess()).  The RFH
destruction triggers a message to the renderer to delete the
RenderFrame, which also causes it to run its unload handler.  However,
if the subframe was the last active frame in its process, the process
will proceed to shut down without waiting for the unload handler to
run at all, and in such a case, the subframe unload handler won't
really be able to do anything, including termination pings.

This CL fixes this by delaying subframe process shutdown in such
situations by one second (which matches the existing unload timeout
that we use for pending delete RenderFrameHosts).

TBR=alexmos@chromium.org

(cherry picked from commit 1bfc151cd3805a81e6488dd7b4cd65fd08f3584e)

Bug:  852204 , 609963
Change-Id: I142ef4e6f59df465c6ed213df02daf492e989c2d
Reviewed-on: https://chromium-review.googlesource.com/1094917
Reviewed-by: Steven Holte <holte@chromium.org>
Reviewed-by: Łukasz Anforowicz <lukasza@chromium.org>
Reviewed-by: Charlie Reis <creis@chromium.org>
Commit-Queue: Alex Moshchuk <alexmos@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#567395}
Reviewed-on: https://chromium-review.googlesource.com/1105179
Reviewed-by: Alex Moshchuk <alexmos@chromium.org>
Cr-Commit-Position: refs/branch-heads/3440@{#422}
Cr-Branched-From: 010ddcfda246975d194964ccf20038ebbdec6084-refs/heads/master@{#561733}
[modify] https://crrev.com/16e6577a8fa81b76b044373c5116f73fc89d09f6/content/browser/frame_host/render_frame_host_impl.cc
[modify] https://crrev.com/16e6577a8fa81b76b044373c5116f73fc89d09f6/content/browser/frame_host/render_frame_host_impl.h
[modify] https://crrev.com/16e6577a8fa81b76b044373c5116f73fc89d09f6/content/browser/frame_host/render_frame_host_manager_browsertest.cc
[modify] https://crrev.com/16e6577a8fa81b76b044373c5116f73fc89d09f6/content/browser/renderer_host/render_process_host_impl.cc
[modify] https://crrev.com/16e6577a8fa81b76b044373c5116f73fc89d09f6/content/browser/renderer_host/render_process_host_impl.h
[modify] https://crrev.com/16e6577a8fa81b76b044373c5116f73fc89d09f6/content/public/browser/render_process_host.h
[modify] https://crrev.com/16e6577a8fa81b76b044373c5116f73fc89d09f6/tools/metrics/histograms/histograms.xml

Status: Fixed (was: Started)
Labels: Needs-Feedback
Tested the issue on chrome version# 69.0.3451.0(build without fix) with steps mentioned in comment#6:
1) Launched chrome reported version, enabled Strict site isolation and navigated to http://alexmos17.github.io/tests/unload.html
2) Clicked on "sendBeacon" in the subframe, ran "python -m SimpleHTTPServer 8000" from terminal
3) Went back to Chrome and navigate to URL: http://csreis.github.io/, observed "csreis.github.io" in both omnibar and chrome page

@Reporter: Please find the attached screencast for your reference and let us know if we missed anything in verifying the fix and please help us in confirming the fix.

Thanks!
852204.mp4
4.3 MB View Download
#15: the last step in the repro (navigating to csreis.github.io) is not quite correct.  Instead of opening it in a new tab, you need to navigate the tab with http://alexmos17.github.io/tests/unload.html (where the unload handlers were installed) away to http://csreis.github.io, for example by pasting it into the omnibox.
Labels: -Needs-Feedback TE-Verified-68.0.3440.33
alexmos@, thank you for the update. Verified the above fix on Win, Mac & Linux platforms for today's M68 Beta RC# 68.0.3440.33 and it's working as intended.

Comment 18 by creis@chromium.org, Jun 21 2018

Cc: gov...@chromium.org ty...@tylerbutler.com
Adding govind@ and commentor from  issue 853021 , as we start to discuss whether to merge this to M67.

Comment 19 by creis@chromium.org, Jun 21 2018

Labels: Target-67
Alex is currently verifying this fix in a local M67 branch to make sure it's safe to merge.  If that goes well, we intend to request merge to M67 given discussion with govind@ and on https://bugs.chromium.org/p/chromium/issues/detail?id=810843#c41.

The unload fix should help resolve the problems reported in  issue 853021  and  issue 854973 .  We're hoping to get verification for the Microsoft Office Online case in  issue 853021  to confirm that the fix will address the most pressing issue there, and will be sufficient for M67.
Labels: Merge-Request-67
I've verified the fix in a local M67 branch.  The merge was straightforward other than a trivial conflict in tests, the unload tests I added are passing, and I also ran Chrome and verified repro steps in #6.  So, requesting merge to M67.  

Note that this fix does not fix the newly-reported unload problems with embedded Google Docs in issue 855280.  We'll investigate potential fixes and/or workarounds for that issue separately.
Labels: -Merge-Request-67 Merge-Approved-67
Approving merge to M67 branch 3396 based on comment #19 and #20. Pls merge. Thank you.

Also issue 855280 is not a blocker for next week M67 respin, correct?
Thanks, will merge shortly.  #21: for issue 855280, we're assuming it's not a blocker and hoping that we can find a workaround from the Docs side, but we need to understand more about what's going on there.  We'll keep you posted.
Ok, thank you alexmos@.
Project Member

Comment 24 by bugdroid1@chromium.org, Jun 22 2018

Labels: -merge-approved-67 merge-merged-3396
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/92b72c7ca498e61a6e7afacb300b0246a0d5b9cf

commit 92b72c7ca498e61a6e7afacb300b0246a0d5b9cf
Author: Alex Moshchuk <alexmos@chromium.org>
Date: Fri Jun 22 00:30:25 2018

Delay subframe process shutdown to allow subframe unload handlers to execute. (Merge to M67)

Currently, if a page with an OOPIF navigates cross-process, we
immediately destroy the OOPIF's RenderFrameHost when its parent
commits (see FrameTreeNode::ResetForNewProcess()).  The RFH
destruction triggers a message to the renderer to delete the
RenderFrame, which also causes it to run its unload handler.  However,
if the subframe was the last active frame in its process, the process
will proceed to shut down without waiting for the unload handler to
run at all, and in such a case, the subframe unload handler won't
really be able to do anything, including termination pings.

This CL fixes this by delaying subframe process shutdown in such
situations by one second (which matches the existing unload timeout
that we use for pending delete RenderFrameHosts).

TBR=alexmos@chromium.org

(cherry picked from commit 1bfc151cd3805a81e6488dd7b4cd65fd08f3584e)

Bug:  852204 , 609963
Change-Id: I142ef4e6f59df465c6ed213df02daf492e989c2d
Reviewed-on: https://chromium-review.googlesource.com/1094917
Reviewed-by: Steven Holte <holte@chromium.org>
Reviewed-by: Łukasz Anforowicz <lukasza@chromium.org>
Reviewed-by: Charlie Reis <creis@chromium.org>
Commit-Queue: Alex Moshchuk <alexmos@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#567395}
Reviewed-on: https://chromium-review.googlesource.com/1111331
Reviewed-by: Alex Moshchuk <alexmos@chromium.org>
Cr-Commit-Position: refs/branch-heads/3396@{#789}
Cr-Branched-From: 9ef2aa869bc7bc0c089e255d698cca6e47d6b038-refs/heads/master@{#550428}
[modify] https://crrev.com/92b72c7ca498e61a6e7afacb300b0246a0d5b9cf/content/browser/frame_host/render_frame_host_impl.cc
[modify] https://crrev.com/92b72c7ca498e61a6e7afacb300b0246a0d5b9cf/content/browser/frame_host/render_frame_host_impl.h
[modify] https://crrev.com/92b72c7ca498e61a6e7afacb300b0246a0d5b9cf/content/browser/frame_host/render_frame_host_manager_browsertest.cc
[modify] https://crrev.com/92b72c7ca498e61a6e7afacb300b0246a0d5b9cf/content/browser/renderer_host/render_process_host_impl.cc
[modify] https://crrev.com/92b72c7ca498e61a6e7afacb300b0246a0d5b9cf/content/browser/renderer_host/render_process_host_impl.h
[modify] https://crrev.com/92b72c7ca498e61a6e7afacb300b0246a0d5b9cf/content/public/browser/render_process_host.h
[modify] https://crrev.com/92b72c7ca498e61a6e7afacb300b0246a0d5b9cf/tools/metrics/histograms/histograms.xml

Comment 19: All the Microsoft Office Online testing using the 68 beta with this fix has looked good, and seems to address the biggest problems from our side. We still care about the OnBeforeUnload bug, but this one is the most pressing.

I can't promise we won't find something else, but things look good at this point.

Comment 26 by luizp@google.com, Jun 22 2018

Labels: Hotlist-Partner-GSuite
I've verified the fix in M67 stable RC 67.0.3396.99 that has the merge from #24 (on Mac).  I used steps in #6, and both img pings and sendBeacon are working from unload in an OOPIF on my test page.
Labels: Needs-Feedback
Tested the issue on chrome version# 69.0.3396.99 with steps mentioned in comment#6 & 16:
1) Launched chrome reported version, enabled Strict site isolation and navigated to http://alexmos17.github.io/tests/unload.html
2) Clicked on "sendBeacon" in the subframe, ran "python -m SimpleHTTPServer 8000" from terminal
3) Went back to Chrome and entered URL: http://csreis.github.io/ in the same omnibar where http://alexmos17.github.io/tests/unload.html is navigated
4) In terminal observed:
127.0.0.1 - - [25/Jun/2018 18:01:45] code 501, message Unsupported method ('POST')
127.0.0.1 - - [25/Jun/2018 18:01:45] "POST //foo HTTP/1.1" 501 -

Note: Seen same behaviour on Mac 10.12.6, Ubuntu 14.04 and Windows 10

@Reporter: Please find the attached screencast for your reference and help us in confirming the fix.

Thanks!
852204.mp4
3.7 MB View Download

Comment 29 by creis@chromium.org, Jun 25 2018

Update: Chrome 67.0.3396.99 is now available on the stable channel with the fix.  You can trigger an update manually by visiting chrome://chrome.
#28: that is the correct behavior.  The POST request to /foo shows up in the HTTP server's log, whereas before the fix, nothing would show up.
Cc: alex...@chromium.org
 Issue 854973  has been merged into this issue.

Sign in to add a comment