Missing surfaces after gpucrash in viz software compositing |
|||||||
Issue description1. Open new tab page 2. Open another tab. Go to chrome://gpucrash 3. See this in console, until tab switching back to new tab page: [198031:198041:0329/135444.787682:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [198031:198041:0329/135444.787727:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 6), LocalSurfaceId(3, 1, 916C...)) [198031:198041:0329/135444.787923:ERROR:surface_aggregator.cc(552)] SurfaceId(FrameSinkId(4, 6), LocalSurfaceId(3, 1, 916C...)) [DelegatedFrameHost] is missing during aggregation [198031:198041:0329/135444.939119:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [198031:198041:0329/135445.440846:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [198031:198041:0329/135445.941839:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [198031:198041:0329/135446.442739:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [198031:198041:0329/135446.943780:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [198031:198041:0329/135447.380760:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [198031:198041:0329/135447.429107:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [198031:198041:0329/135447.429448:ERROR:surface_aggregator.cc(552)] SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [DelegatedFrameHost] is missing during aggregation 4. See pink flash when switching to the new tab page before it fills in.
,
Jun 11 2018
I took a quick look at this. After the GPU process crashes, the browser submits a CompositorFrame and tries to embed the renderer surface. The renderer surface doesn't exist yet and adding a reference fails. The display scheduler stops scheduling here because there is no damage to the browser surface. The renderer then submits a CompositorFrame and creates the surface. For some reason, probably because of the missing reference, the renderer surface being created doesn't trigger the display scheduler to schedule a DrawAndSwap() and nothing shows up until something else causes a DrawAndSwap().
,
Jun 12 2018
Unlike GPU compositing, in software compositing mode DelegatedFrameHost::OnLostResources is not called and the fallback is not cleared, hence the huge number of missing surface errors. As for why the DrawAndSwap isn't called, it's probably the same bug as crbug.com/843290 .
,
Jul 19
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/19309b092ded5f8c1188f549f5490052bbc29484 commit 19309b092ded5f8c1188f549f5490052bbc29484 Author: kylechar <kylechar@chromium.org> Date: Thu Jul 19 16:49:22 2018 Fix OOP-D handling of GPU crash. This CL fixes how the browser renderer host code handles a GPU crash on Windows with OOP-D. We were clearing both the fallback and primary surface ID in DelegatedFrameHost on GPU crash. The SurfaceLayer primary SurfaceId wasn't set until it went through the embed flow again, which didn't happen consistently after a GPU crash unless the user was interacting with the browser UI. The result is an background color frame until the user does something to trigger the embed flow. The browser still wants to embed the same renderer SurfaceId, so only clear the fallback SurfaceId and wait for the renderer to submit a new CompositorFrame. With OOP-D we always want to call OnLostResources() for GPU process crashes. For GPU compositing this is handled by losing the main thread context provider. With software compositing there is no main thread context provider. Add a call to OnLostResources() to handle the software compositing case. In a future CL we should split OnLostResources() into OnLostSharedContext() and OnLostGpuProcess() to differentiate between the two reasons resources can be lost. ContextObserver::OnLostResources() is only called if there is a shared main thread context provider that gets lost. Without OOP-D, that context provider only exists for Chrome OS and even then only sometimes. We don't use shared main thread context on other platforms, so OnLostResources() hasn't been reliably called in a while. This could be fixed as a follow up, or we could just wait for OOP-D to launch everywhere. It's not crucial to clear the fallback SurfaceId without OOP-D as the Surface still exists, there just might be GL resource errors. Bug: 827242 Change-Id: I2f203fa827225b1607d8b5436375818625b01cad Reviewed-on: https://chromium-review.googlesource.com/1143138 Commit-Queue: kylechar <kylechar@chromium.org> Reviewed-by: Fady Samuel <fsamuel@chromium.org> Cr-Commit-Position: refs/heads/master@{#576527} [modify] https://crrev.com/19309b092ded5f8c1188f549f5490052bbc29484/content/browser/compositor/viz_process_transport_factory.cc [modify] https://crrev.com/19309b092ded5f8c1188f549f5490052bbc29484/content/browser/renderer_host/delegated_frame_host.cc
,
Jul 26
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b99b10433127b949822c475c9621c3940eb21ca6 commit b99b10433127b949822c475c9621c3940eb21ca6 Author: kylechar <kylechar@chromium.org> Date: Thu Jul 26 21:09:32 2018 Split up ContextFactoryObserver::OnLostResources(). OnLostResources() was being used in two different ways. First, it was used when the shared main thread context provider was lost. Second, it was used to tell when the GPU process was lost (with OOP-D only). This CL splits this into two different functions, as the actions required in both cases are different. OnLostVizProcess() is only needed for OOP-D where we want to let DelegatedFrameHost know the renderer surface has been destroyed. The only functional change here is that sometimes DelegatedFrameHost::OnLostResources() would get called without OOP-D on Chrome OS. This is because Chrome OS is the only platform that uses the shared main thread context without OOP-D. Even on Chrome OS shared main thread context didn't always exist because it's created on demand for exo/fastink. Not doing anything in OnLostSharedContext() shouldn't be a problem, if we had to evict the renderer surface on GPU crashes it would cause problems on all other platforms that use DelegatedFrameHost where OnLostResources() was never called. Bug: 827242 Change-Id: Ideebba3c712a2e1ebaf3c852d9aa7721cb1d40d3 Reviewed-on: https://chromium-review.googlesource.com/1150456 Reviewed-by: Daniele Castagna <dcastagna@chromium.org> Reviewed-by: Fady Samuel <fsamuel@chromium.org> Reviewed-by: danakj <danakj@chromium.org> Commit-Queue: kylechar <kylechar@chromium.org> Cr-Commit-Position: refs/heads/master@{#578439} [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/components/exo/buffer.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/components/exo/buffer_unittest.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/components/exo/surface_tree_host.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/components/exo/surface_tree_host.h [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/gpu_process_transport_factory.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/image_transport_factory_browsertest.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/owned_mailbox.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/owned_mailbox.h [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/test/test_image_transport_factory.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/viz_process_transport_factory.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/renderer_host/delegated_frame_host.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/renderer_host/delegated_frame_host.h [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/ui/compositor/compositor.h [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/ui/compositor/test/context_factories_for_test.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/ui/compositor/test/in_process_context_factory.cc [modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/ui/compositor/test/in_process_context_factory.h
,
Jul 27
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/757662261f21dc1f44f9258355bcec3271bc5a17 commit 757662261f21dc1f44f9258355bcec3271bc5a17 Author: Saman Sami <samans@chromium.org> Date: Fri Jul 27 14:08:13 2018 Fix missing OOPIFs after GPU crash Always submit a CompositorFrame on OnFirstSurfaceActivation even if the fallback doesn't change. Bug: 827242 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: I560b5251a188eae2a83db384367e2201ffcce83b Reviewed-on: https://chromium-review.googlesource.com/1152209 Commit-Queue: Saman Sami <samans@chromium.org> Reviewed-by: kylechar <kylechar@chromium.org> Reviewed-by: Fady Samuel <fsamuel@chromium.org> Cr-Commit-Position: refs/heads/master@{#578624} [modify] https://crrev.com/757662261f21dc1f44f9258355bcec3271bc5a17/cc/layers/surface_layer.cc [modify] https://crrev.com/757662261f21dc1f44f9258355bcec3271bc5a17/content/renderer/child_frame_compositing_helper.cc
,
Jul 30
We'd like to merge #5 and #6 back into M69 for OOP-D launch. Thanks!
,
Jul 30
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a1e9bc8bbbc83c785a3bfa4b5db1396475cd26ea commit a1e9bc8bbbc83c785a3bfa4b5db1396475cd26ea Author: Saman Sami <samans@chromium.org> Date: Mon Jul 30 19:13:14 2018 Fix nullptr access in SurfaceLayer::SetFallbackSurfaceId Bug: 868828, 827242 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: I13199db914192ab81bbe6fd34b086b58fa275fa4 Reviewed-on: https://chromium-review.googlesource.com/1155005 Reviewed-by: Fady Samuel <fsamuel@chromium.org> Commit-Queue: Saman Sami <samans@chromium.org> Cr-Commit-Position: refs/heads/master@{#579114} [modify] https://crrev.com/a1e9bc8bbbc83c785a3bfa4b5db1396475cd26ea/cc/layers/surface_layer.cc
,
Jul 30
#8 also needs to be merged back.
,
Jul 31
Your change meets the bar and is auto-approved for M69. Please go ahead and merge the CL to branch 3497 manually. Please contact milestone owner if you have questions. Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 31
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e commit bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e Author: kylechar <kylechar@chromium.org> Date: Tue Jul 31 15:56:37 2018 Split up ContextFactoryObserver::OnLostResources(). OnLostResources() was being used in two different ways. First, it was used when the shared main thread context provider was lost. Second, it was used to tell when the GPU process was lost (with OOP-D only). This CL splits this into two different functions, as the actions required in both cases are different. OnLostVizProcess() is only needed for OOP-D where we want to let DelegatedFrameHost know the renderer surface has been destroyed. The only functional change here is that sometimes DelegatedFrameHost::OnLostResources() would get called without OOP-D on Chrome OS. This is because Chrome OS is the only platform that uses the shared main thread context without OOP-D. Even on Chrome OS shared main thread context didn't always exist because it's created on demand for exo/fastink. Not doing anything in OnLostSharedContext() shouldn't be a problem, if we had to evict the renderer surface on GPU crashes it would cause problems on all other platforms that use DelegatedFrameHost where OnLostResources() was never called. Bug: 827242 Change-Id: Ideebba3c712a2e1ebaf3c852d9aa7721cb1d40d3 Reviewed-on: https://chromium-review.googlesource.com/1150456 Reviewed-by: Daniele Castagna <dcastagna@chromium.org> Reviewed-by: Fady Samuel <fsamuel@chromium.org> Reviewed-by: danakj <danakj@chromium.org> Commit-Queue: kylechar <kylechar@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#578439}(cherry picked from commit b99b10433127b949822c475c9621c3940eb21ca6) Reviewed-on: https://chromium-review.googlesource.com/1156874 Reviewed-by: Saman Sami <samans@chromium.org> Cr-Commit-Position: refs/branch-heads/3497@{#273} Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753} [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/components/exo/buffer.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/components/exo/buffer_unittest.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/components/exo/surface_tree_host.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/components/exo/surface_tree_host.h [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/gpu_process_transport_factory.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/image_transport_factory_browsertest.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/owned_mailbox.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/owned_mailbox.h [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/test/test_image_transport_factory.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/viz_process_transport_factory.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/renderer_host/delegated_frame_host.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/renderer_host/delegated_frame_host.h [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/ui/compositor/compositor.h [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/ui/compositor/test/context_factories_for_test.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/ui/compositor/test/in_process_context_factory.cc [modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/ui/compositor/test/in_process_context_factory.h
,
Jul 31
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ef3028fd9dc2efdedfa1fa863f3eefe64d23d670 commit ef3028fd9dc2efdedfa1fa863f3eefe64d23d670 Author: Saman Sami <samans@chromium.org> Date: Tue Jul 31 16:02:39 2018 Fix missing OOPIFs after GPU crash Always submit a CompositorFrame on OnFirstSurfaceActivation even if the fallback doesn't change. TBR=samans@chromium.org (cherry picked from commit 757662261f21dc1f44f9258355bcec3271bc5a17) Bug: 827242 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: I560b5251a188eae2a83db384367e2201ffcce83b Reviewed-on: https://chromium-review.googlesource.com/1152209 Commit-Queue: Saman Sami <samans@chromium.org> Reviewed-by: kylechar <kylechar@chromium.org> Reviewed-by: Fady Samuel <fsamuel@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#578624} Reviewed-on: https://chromium-review.googlesource.com/1156846 Reviewed-by: Saman Sami <samans@chromium.org> Cr-Commit-Position: refs/branch-heads/3497@{#275} Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753} [modify] https://crrev.com/ef3028fd9dc2efdedfa1fa863f3eefe64d23d670/cc/layers/surface_layer.cc [modify] https://crrev.com/ef3028fd9dc2efdedfa1fa863f3eefe64d23d670/content/renderer/child_frame_compositing_helper.cc
,
Jul 31
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/da8abf8b8fcf515488f0eb94ab7d68ca6a7e594a commit da8abf8b8fcf515488f0eb94ab7d68ca6a7e594a Author: Saman Sami <samans@chromium.org> Date: Tue Jul 31 16:03:59 2018 Fix nullptr access in SurfaceLayer::SetFallbackSurfaceId Bug: 868828, 827242 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: I13199db914192ab81bbe6fd34b086b58fa275fa4 Reviewed-on: https://chromium-review.googlesource.com/1155005 Reviewed-by: Fady Samuel <fsamuel@chromium.org> Commit-Queue: Saman Sami <samans@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#579114}(cherry picked from commit a1e9bc8bbbc83c785a3bfa4b5db1396475cd26ea) Reviewed-on: https://chromium-review.googlesource.com/1156924 Reviewed-by: Saman Sami <samans@chromium.org> Cr-Commit-Position: refs/branch-heads/3497@{#276} Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753} [modify] https://crrev.com/da8abf8b8fcf515488f0eb94ab7d68ca6a7e594a/cc/layers/surface_layer.cc
,
Jul 31
,
Jul 31
|
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by danakj@chromium.org
, Mar 29 2018