RendererCompositorFrameSink should not outlive DelegatedFrameHost? |
|||||||||||||||||
Issue descriptionSaman noticed some interesting behavior while moving where surface ID allocation happens. DNSError_GoBack2AndForward and other similar navigation tests fail because the RendererCompositorFrameSink survives the navigation and navigation back whereas the RenderWidgetHostViewAura / DelegatedFrameHost get destroyed. From Saman: The crash happens on GoForwardAndWaitForNavigations(2); in the test. RenderFrameHostImpl::OnSwappedOut is called after FrameTree::CreateRenderViewHost, and so in FrameTree::CreateRenderViewHost the RenderViewHost is still alive and instead of creating a new view we return the existing one. Thus, we end up reusing the RendererCompositorFrameSink. Currently the RendererCompositorFrameSink is lost only on gpu context lost. Perhaps LTH should lose the CompositorFrameSink on swapout too? My worry is we don't end up returning resources to renderers that get reused during navigation, and so we end up leaking gpu resources over time.
,
Mar 15 2017
In particular, I don't think DelegatedFrameHost will return resources on destruction.
,
Mar 15 2017
Yes, that does sound like a bug. lfg@ would probably know whether we can discard the renderer-side RWHV-related state, since I think he worked on discarding the RWHV in the browser process in these cases. (He's OOO at the moment, but I'll list him as an owner to help triage it when he gets back. Anyone should feel free to grab it in the meantime, though.)
,
Mar 15 2017
RenderWidget and RenderWidgetHostView go hand-in-hand, I don't think we can reuse one if we don't reuse the other.
,
Mar 17 2017
Here is what seems to be happening in the unit test. I'm not entirely sure but this is the best I can come up with. Imagine we navigate to another page and immediately navigate back. On the first navigation, the DelegatedFrameHost is destroyed and the RenderFrame is asked to swap out, but in RenderFrameImpl::OnSwapOut the RenderWidget is not destroyed. Instead, we call RenderWidget::WasSwappedOut which simply tells the RenderProcess that it’s OK to shut down. The RenderProcess sends a ChildProcessHostMsg_ShutdownRequest, and in the meantime we have navigated back to the first page, so the browser still wants to use the renderer and rejects the shutdown request (See RenderProcessHostImpl::OnShutdownRequest). So we end up using the same RenderWidget object, which still has its old RendererCompositorFrameSink.
,
Mar 28 2017
There are much simpler scenarios that lead to RenderWidget reuse, see https://bugs.chromium.org/p/chromium/issues/detail?id=705548. The bug is quite easy to reproduce in real-world usage.
,
Mar 28 2017
lfg@: I think this CL that you landed could have caused the problem. https://codereview.chromium.org/2496233003/ I think before that only RenderWidgetHostImpl could destroy its view, but now RenderFrameHostManager can also destroy it. The problem seems to go beyond just leaking resources. We could also show an empty page, as described in the bug that ahest@ linked. RenderWidgetHostView has a complicated lifetime and I'm not 100% sure about this, so please correct me if I'm wrong.
,
Mar 28 2017
,
Mar 28 2017
I'm marking this as P1 because it also manifests as hangs on navigation
,
Mar 28 2017
,
Mar 28 2017
I looked into this issue, and it comes as a result of reusing the RenderWidget without reinitializing it properly. The proper fix is to delete the RenderWidget when a frame is swapped out and re-create it when a frame is swapped back in. Unfortunately, we can't do that yet as RenderView and RenderWidget are the same object (see issue 583347). I was under the impression that calling RenderWidgetCompositor::setVisible(false) (done here: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/web/WebViewFrameWidget.cpp?l=28) was enough to clean up the state of the compositor, but it doesn't seem to be the case. So I think we have two options: 1. Send ViewMsg_ReclaimCompositorResources when destroying the DelegatedFrameHost (which is essentially what ahest@'s CL does). 2. Clean up the state from the renderer when destroying the WebViewFrameWidget in WebViewFrameWidget::close(). I'm inclined to go with option 1 until we can have a more permanent solution (splitting RenderView/RenderWidget). Any thoughts from compositor experts?
,
Mar 28 2017
There seems to be a long tail of graphics bugs being caused here. I can only imagine what others will come up from reusing a compositor with code that assumes things are in their default states. I think you should destroy things in the renderer instead of just making them not visible.
,
Mar 28 2017
Also, this is causing renderer hangs in stable (see https://codereview.chromium.org/2779713002/). I've marked it RBB to remind us to merge to beta if we can, but consider stable too.
,
Mar 28 2017
Right, we should keep RW, RWHI and RWHV 1:1. There are fairly strong assumptions in the code that they are.
,
Mar 28 2017
@lfg: what would be the fallout of reverting that 3-month old CL?
,
Mar 28 2017
I commented on https://codereview.chromium.org/2496233003/ as well, but it's difficult to create a new RenderView because it holds a lot of not-compositing-related stuff as well: notably, it's how the blink::Page itself is kept alive. The long-term solution is to split the compositing functionality wholly out of RenderView; in the short-term, is there any sort of hack we can land to reset RV to a clean-enough slate?
,
Mar 28 2017
> In the short-term, is there any sort of hack we can land to reset RV to a clean-enough slate? c&p from the CL: I'm not sure. There's more than just the compositor, there are a bunch of things such as sizing, input events, IME, accessibility, focus, mouse capture, cursor, etc. where the RWHV is what keeps the state, not the RWHI. Can we recreate all that state? Maybe? Sounds like a lot of work! Should we? I'm not sure...
,
Mar 28 2017
FWIW, I tried reverting https://codereview.chromium.org/2496233003 (there's conflicts, but I tried to untangle), and it fixes the repro case from bug 705548 . I don't know what else it breaks, though.
,
Mar 28 2017
To answer the other question of "How did it work before? RV and RW have been the same thing forever, why is it suddenly an issue?" It became an issue when we implemented OOPIF. In this mode, the main frame can become a RemoteFrame (at which point we'd shut down the compositing bits for the main frame) and then become a LocalFrame again in the future (at which point we'd need to start those bits back up).
,
Mar 28 2017
,
Mar 28 2017
So let's go back to the root cause of bug 705548 : it's a race between sending the compositor frame ACK and swapping out the RenderView (and therefore deleting the RWHV). Now, obviously keeping the RWHV around forever will fix that bug -- since the compositor bits that send the ACK never go away, the race won't happen. It still doesn't change the fact that we are reusing the RW. When a RVH/RWHI (and RV/RW) goes into a swapped out state, it won't receive updates to sizing, input events, IME, accessibility, etc, etc, so we already need to recreate all this state when swapping the RenderView back in. It should be similar to what happens when a RenderProcess is destroyed and recreated -- the RWHV must also be recreated with the same RWHI (though a different RW). Like I mentioned before, the proper fix is to actually split RenderView/RenderWidget so that the latter can be destroyed and recreated when a RenderView becomes swapped out. There's still a lot of work to be done to get there, if you are interested in helping. Keeping the old RWHV around after swapping out can cause other bugs, for example, on Mac, the RWHV is associated with a native widget, which will stick around and can cause issues ( bug 526326 for example, where the RWHV for the proxy occludes the RWHV that's drawing the frame). That's why I mentioned I think we need to have an intermediate solution for this, which I still think it should either be by always sending the ACK (which just fixes the root cause of the issue) or having a way in the renderer to reset the compositor to a clean state.
,
Mar 29 2017
Per chat with lfg@ this at least affects Chrome OS, Windows, Mac and Linux. Re: the need to respin 57: "I believe this is a pretty rare hang, the steps involved are hard to reproduce in a real-world scenario, but it's possible."
,
Mar 29 2017
,
Mar 29 2017
Given we already have 58 beta out, should we remove the beta blocker label (perhaps ReleaseBlock-Stable and M-58 only)? It seems reasonable to put in a fix, but this should not really block a release if it is not a regression.
,
Mar 30 2017
This is a regression in M-57 right? But stable is out so we can't block that really. Idk, adjust the labels as u think is right.
,
Mar 30 2017
It's a regression in 57, but talking to release managers for it, we decided to skip fixing the bug there under the assumption that it would affect few users and the effect is medium severity. We can revisit if it ends up affecting more users than estimated and we can respin. We should fix on trunk and 58 though.
,
Apr 3 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/00cf9403c34aea924ffecd736f93e06421e10e15 commit 00cf9403c34aea924ffecd736f93e06421e10e15 Author: piman <piman@chromium.org> Date: Mon Apr 03 17:09:20 2017 Revert "Destroy the old RenderWidgetHostView when swapping out a main frame." It makes the assumption that a RenderWidget can talk to different RenderWidgetHostView over its lifetime, but this assumption cannot be satisfied by the code, and so this causes immediate problems (renderer hangs, tests failures and flakiness) as well as fundamental consistency problems making further changes to the code intractable. Note: this is a manual revert because of conflicts Original description: > Destroy the old RenderWidgetHostView when swapping out a main frame. > > This fixes an issue where the old RenderView is reused by a new remote > subframe. We shouldn't be leaking resources, because now we are already > hiding the unused RenderView in the renderer, when the local main frame > gets detached and the WebViewFrameWidget is closed. > > BUG= 638375 > CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation > > Review-Url: https://codereview.chromium.org/2496233003 > Cr-Commit-Position: refs/heads/master@{#438516} BUG= 701988 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Review-Url: https://codereview.chromium.org/2786443003 Cr-Commit-Position: refs/heads/master@{#461459} [modify] https://crrev.com/00cf9403c34aea924ffecd736f93e06421e10e15/content/browser/frame_host/render_frame_host_impl.cc [modify] https://crrev.com/00cf9403c34aea924ffecd736f93e06421e10e15/content/browser/frame_host/render_frame_host_impl.h [modify] https://crrev.com/00cf9403c34aea924ffecd736f93e06421e10e15/content/browser/frame_host/render_frame_host_manager.cc [modify] https://crrev.com/00cf9403c34aea924ffecd736f93e06421e10e15/content/browser/site_per_process_browsertest.cc [modify] https://crrev.com/00cf9403c34aea924ffecd736f93e06421e10e15/content/browser/web_contents/web_contents_impl.cc
,
Apr 3 2017
Thanks for the revert, please ensure to verify in canary and request a merge to M58. Beta RC cut @ 4.00 PM PST -Tuesday(04/04)
,
Apr 4 2017
RC cut is today @4.00 PM PST.Please prioritize.
,
Apr 4 2017
,
Apr 4 2017
This bug requires manual review: Reverts referenced in bugdroid comments after merge request. Please contact the milestone owner if you have questions. Owners: amineer@(Android), cmasso@(iOS), bhthompson@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Apr 4 2017
Approved for 58 branch 3029.
,
Apr 4 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/204f85e5d72d7da5c76d6a38045304c4fb594fd0 commit 204f85e5d72d7da5c76d6a38045304c4fb594fd0 Author: Antoine Labour <piman@chromium.org> Date: Tue Apr 04 19:21:31 2017 Revert "Destroy the old RenderWidgetHostView when swapping out a main frame." It makes the assumption that a RenderWidget can talk to different RenderWidgetHostView over its lifetime, but this assumption cannot be satisfied by the code, and so this causes immediate problems (renderer hangs, tests failures and flakiness) as well as fundamental consistency problems making further changes to the code intractable. Note: this is a manual revert because of conflicts Original description: > Destroy the old RenderWidgetHostView when swapping out a main frame. > > This fixes an issue where the old RenderView is reused by a new remote > subframe. We shouldn't be leaking resources, because now we are already > hiding the unused RenderView in the renderer, when the local main frame > gets detached and the WebViewFrameWidget is closed. > > BUG= 638375 > CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation > > Review-Url: https://codereview.chromium.org/2496233003 > Cr-Commit-Position: refs/heads/master@{#438516} BUG= 701988 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation Review-Url: https://codereview.chromium.org/2786443003 Cr-Commit-Position: refs/heads/master@{#461459} (cherry picked from commit 00cf9403c34aea924ffecd736f93e06421e10e15) Review-Url: https://codereview.chromium.org/2798673002 . Cr-Commit-Position: refs/branch-heads/3029@{#575} Cr-Branched-From: 939b32ee5ba05c396eef3fd992822fcca9a2e262-refs/heads/master@{#454471} [modify] https://crrev.com/204f85e5d72d7da5c76d6a38045304c4fb594fd0/content/browser/frame_host/render_frame_host_impl.cc [modify] https://crrev.com/204f85e5d72d7da5c76d6a38045304c4fb594fd0/content/browser/frame_host/render_frame_host_impl.h [modify] https://crrev.com/204f85e5d72d7da5c76d6a38045304c4fb594fd0/content/browser/frame_host/render_frame_host_manager.cc [modify] https://crrev.com/204f85e5d72d7da5c76d6a38045304c4fb594fd0/content/browser/site_per_process_browsertest.cc [modify] https://crrev.com/204f85e5d72d7da5c76d6a38045304c4fb594fd0/content/browser/web_contents/web_contents_impl.cc
,
Apr 4 2017
Thanks piman@, let's close this issue as fixed.
,
Apr 18 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4ec47030b3d06cb198562b4647094191ba83b06e commit 4ec47030b3d06cb198562b4647094191ba83b06e Author: ahest <ahest@yandex-team.ru> Date: Tue Apr 18 19:19:53 2017 Add a test checking that compositor works in a reused renderer. The bug was happening if the renderer submitted a frame right before navigation commit. The swap ACK got lost, and if this RenderWidget was reused later, it was drawing nothing because it was waiting for the ACK. BUG= 701988 , 705548 CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel Review-Url: https://codereview.chromium.org/2779713002 Cr-Commit-Position: refs/heads/master@{#465322} [modify] https://crrev.com/4ec47030b3d06cb198562b4647094191ba83b06e/content/browser/renderer_host/render_widget_host_view_browsertest.cc [add] https://crrev.com/4ec47030b3d06cb198562b4647094191ba83b06e/content/test/data/page_with_animation.html
,
Jun 13 2017
,
Feb 26 2018
|
|||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||
Comment 1 by fsam...@chromium.org
, Mar 15 2017