New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 827242 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 31
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug

Blocking:
issue 730660



Sign in to add a comment

Missing surfaces after gpucrash in viz software compositing

Project Member Reported by danakj@chromium.org, Mar 29 2018

Issue description

1. Open new tab page
2. Open another tab. Go to chrome://gpucrash
3. See this in console, until tab switching back to new tab page:

[198031:198041:0329/135444.787682:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...))
[198031:198041:0329/135444.787727:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 6), LocalSurfaceId(3, 1, 916C...))
[198031:198041:0329/135444.787923:ERROR:surface_aggregator.cc(552)] SurfaceId(FrameSinkId(4, 6), LocalSurfaceId(3, 1, 916C...)) [DelegatedFrameHost] is missing during aggregation
[198031:198041:0329/135444.939119:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...))
[198031:198041:0329/135445.440846:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...))
[198031:198041:0329/135445.941839:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...))
[198031:198041:0329/135446.442739:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...))
[198031:198041:0329/135446.943780:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...))
[198031:198041:0329/135447.380760:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...))
[198031:198041:0329/135447.429107:ERROR:surface_manager.cc(326)] No surface in map for SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...))
[198031:198041:0329/135447.429448:ERROR:surface_aggregator.cc(552)] SurfaceId(FrameSinkId(4, 2), LocalSurfaceId(3, 1, B728...)) [DelegatedFrameHost] is missing during aggregation

4. See pink flash when switching to the new tab page before it fills in.
 

Comment 1 by danakj@chromium.org, Mar 29 2018

Blocking: 730660
Cc: samans@chromium.org
Labels: -Pri-3 Pri-1
I took a quick look at this. After the GPU process crashes, the browser submits a CompositorFrame and tries to embed the renderer surface. The renderer surface doesn't exist yet and adding a reference fails. The display scheduler stops scheduling here because there is no damage to the browser surface. The renderer then submits a CompositorFrame and creates the surface. For some reason, probably because of the missing reference, the renderer surface being created doesn't trigger the display scheduler to schedule a DrawAndSwap() and nothing shows up until something else causes a DrawAndSwap().

Comment 3 by samans@chromium.org, Jun 12 2018

Unlike GPU compositing, in software compositing mode DelegatedFrameHost::OnLostResources is not called and the fallback is not cleared, hence the huge number of missing surface errors. As for why the DrawAndSwap isn't called, it's probably the same bug as  crbug.com/843290 .
Project Member

Comment 4 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/19309b092ded5f8c1188f549f5490052bbc29484

commit 19309b092ded5f8c1188f549f5490052bbc29484
Author: kylechar <kylechar@chromium.org>
Date: Thu Jul 19 16:49:22 2018

Fix OOP-D handling of GPU crash.

This CL fixes how the browser renderer host code handles a GPU crash on
Windows with OOP-D. We were clearing both the fallback and primary
surface ID in DelegatedFrameHost on GPU crash. The SurfaceLayer primary
SurfaceId wasn't set until it went through the embed flow again, which
didn't happen consistently after a GPU crash unless the user was
interacting with the browser UI. The result is an background color frame
until the user does something to trigger the embed flow. The browser
still wants to embed the same renderer SurfaceId, so only clear the
fallback SurfaceId and wait for the renderer to submit a new
CompositorFrame.

With OOP-D we always want to call OnLostResources() for GPU process
crashes. For GPU compositing this is handled by losing the main thread
context provider. With software compositing there is no main thread
context provider. Add a call to OnLostResources() to handle the software
compositing case. In a future CL we should split OnLostResources() into
OnLostSharedContext() and OnLostGpuProcess() to differentiate between
the two reasons resources can be lost.

ContextObserver::OnLostResources() is only called if there is a shared
main thread context provider that gets lost. Without OOP-D, that context
provider only exists for Chrome OS and even then only sometimes. We
don't use shared main thread context on other platforms, so
OnLostResources() hasn't been reliably called in a while. This could be
fixed as a follow up, or we could just wait for OOP-D to launch
everywhere. It's not crucial to clear the fallback SurfaceId without
OOP-D as the Surface still exists, there just might be GL resource
errors.

Bug:  827242 
Change-Id: I2f203fa827225b1607d8b5436375818625b01cad
Reviewed-on: https://chromium-review.googlesource.com/1143138
Commit-Queue: kylechar <kylechar@chromium.org>
Reviewed-by: Fady Samuel <fsamuel@chromium.org>
Cr-Commit-Position: refs/heads/master@{#576527}
[modify] https://crrev.com/19309b092ded5f8c1188f549f5490052bbc29484/content/browser/compositor/viz_process_transport_factory.cc
[modify] https://crrev.com/19309b092ded5f8c1188f549f5490052bbc29484/content/browser/renderer_host/delegated_frame_host.cc

Project Member

Comment 5 by bugdroid1@chromium.org, Jul 26

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b99b10433127b949822c475c9621c3940eb21ca6

commit b99b10433127b949822c475c9621c3940eb21ca6
Author: kylechar <kylechar@chromium.org>
Date: Thu Jul 26 21:09:32 2018

Split up ContextFactoryObserver::OnLostResources().

OnLostResources() was being used in two different ways. First, it was
used when the shared main thread context provider was lost. Second, it
was used to tell when the GPU process was lost (with OOP-D only). This
CL splits this into two different functions, as the actions required in
both cases are different.

OnLostVizProcess() is only needed for OOP-D where we want to let
DelegatedFrameHost know the renderer surface has been destroyed.

The only functional change here is that sometimes
DelegatedFrameHost::OnLostResources() would get called without OOP-D on
Chrome OS. This is because Chrome OS is the only platform that uses the
shared main thread context without OOP-D. Even on Chrome OS shared main
thread context didn't always exist because it's created on demand for
exo/fastink. Not doing anything in OnLostSharedContext() shouldn't be a
problem, if we had to evict the renderer surface on GPU crashes it would
cause problems on all other platforms that use DelegatedFrameHost where
OnLostResources() was never called.

Bug:  827242 
Change-Id: Ideebba3c712a2e1ebaf3c852d9aa7721cb1d40d3
Reviewed-on: https://chromium-review.googlesource.com/1150456
Reviewed-by: Daniele Castagna <dcastagna@chromium.org>
Reviewed-by: Fady Samuel <fsamuel@chromium.org>
Reviewed-by: danakj <danakj@chromium.org>
Commit-Queue: kylechar <kylechar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#578439}
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/components/exo/buffer.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/components/exo/buffer_unittest.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/components/exo/surface_tree_host.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/components/exo/surface_tree_host.h
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/gpu_process_transport_factory.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/image_transport_factory_browsertest.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/owned_mailbox.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/owned_mailbox.h
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/test/test_image_transport_factory.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/compositor/viz_process_transport_factory.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/renderer_host/delegated_frame_host.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/content/browser/renderer_host/delegated_frame_host.h
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/ui/compositor/compositor.h
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/ui/compositor/test/context_factories_for_test.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/ui/compositor/test/in_process_context_factory.cc
[modify] https://crrev.com/b99b10433127b949822c475c9621c3940eb21ca6/ui/compositor/test/in_process_context_factory.h

Project Member

Comment 6 by bugdroid1@chromium.org, Jul 27

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/757662261f21dc1f44f9258355bcec3271bc5a17

commit 757662261f21dc1f44f9258355bcec3271bc5a17
Author: Saman Sami <samans@chromium.org>
Date: Fri Jul 27 14:08:13 2018

Fix missing OOPIFs after GPU crash

Always submit a CompositorFrame on OnFirstSurfaceActivation even if
the fallback doesn't change.

Bug:  827242 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I560b5251a188eae2a83db384367e2201ffcce83b
Reviewed-on: https://chromium-review.googlesource.com/1152209
Commit-Queue: Saman Sami <samans@chromium.org>
Reviewed-by: kylechar <kylechar@chromium.org>
Reviewed-by: Fady Samuel <fsamuel@chromium.org>
Cr-Commit-Position: refs/heads/master@{#578624}
[modify] https://crrev.com/757662261f21dc1f44f9258355bcec3271bc5a17/cc/layers/surface_layer.cc
[modify] https://crrev.com/757662261f21dc1f44f9258355bcec3271bc5a17/content/renderer/child_frame_compositing_helper.cc

Components: Internals>Compositing
Labels: Merge-Request-69
We'd like to merge #5 and #6 back into M69 for OOP-D launch. Thanks!
Project Member

Comment 8 by bugdroid1@chromium.org, Jul 30

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a1e9bc8bbbc83c785a3bfa4b5db1396475cd26ea

commit a1e9bc8bbbc83c785a3bfa4b5db1396475cd26ea
Author: Saman Sami <samans@chromium.org>
Date: Mon Jul 30 19:13:14 2018

Fix nullptr access in SurfaceLayer::SetFallbackSurfaceId

Bug: 868828, 827242 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I13199db914192ab81bbe6fd34b086b58fa275fa4
Reviewed-on: https://chromium-review.googlesource.com/1155005
Reviewed-by: Fady Samuel <fsamuel@chromium.org>
Commit-Queue: Saman Sami <samans@chromium.org>
Cr-Commit-Position: refs/heads/master@{#579114}
[modify] https://crrev.com/a1e9bc8bbbc83c785a3bfa4b5db1396475cd26ea/cc/layers/surface_layer.cc

#8 also needs to be merged back.
Project Member

Comment 10 by sheriffbot@chromium.org, Jul 31

Labels: -Merge-Request-69 Hotlist-Merge-Approved Merge-Approved-69
Your change meets the bar and is auto-approved for M69. Please go ahead and merge the CL to branch 3497 manually. Please contact milestone owner if you have questions.
Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 11 by bugdroid1@chromium.org, Jul 31

Labels: -merge-approved-69 merge-merged-3497
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e

commit bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e
Author: kylechar <kylechar@chromium.org>
Date: Tue Jul 31 15:56:37 2018

Split up ContextFactoryObserver::OnLostResources().

OnLostResources() was being used in two different ways. First, it was
used when the shared main thread context provider was lost. Second, it
was used to tell when the GPU process was lost (with OOP-D only). This
CL splits this into two different functions, as the actions required in
both cases are different.

OnLostVizProcess() is only needed for OOP-D where we want to let
DelegatedFrameHost know the renderer surface has been destroyed.

The only functional change here is that sometimes
DelegatedFrameHost::OnLostResources() would get called without OOP-D on
Chrome OS. This is because Chrome OS is the only platform that uses the
shared main thread context without OOP-D. Even on Chrome OS shared main
thread context didn't always exist because it's created on demand for
exo/fastink. Not doing anything in OnLostSharedContext() shouldn't be a
problem, if we had to evict the renderer surface on GPU crashes it would
cause problems on all other platforms that use DelegatedFrameHost where
OnLostResources() was never called.

Bug:  827242 
Change-Id: Ideebba3c712a2e1ebaf3c852d9aa7721cb1d40d3
Reviewed-on: https://chromium-review.googlesource.com/1150456
Reviewed-by: Daniele Castagna <dcastagna@chromium.org>
Reviewed-by: Fady Samuel <fsamuel@chromium.org>
Reviewed-by: danakj <danakj@chromium.org>
Commit-Queue: kylechar <kylechar@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#578439}(cherry picked from commit b99b10433127b949822c475c9621c3940eb21ca6)
Reviewed-on: https://chromium-review.googlesource.com/1156874
Reviewed-by: Saman Sami <samans@chromium.org>
Cr-Commit-Position: refs/branch-heads/3497@{#273}
Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753}
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/components/exo/buffer.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/components/exo/buffer_unittest.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/components/exo/surface_tree_host.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/components/exo/surface_tree_host.h
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/gpu_process_transport_factory.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/image_transport_factory_browsertest.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/owned_mailbox.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/owned_mailbox.h
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/test/test_image_transport_factory.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/compositor/viz_process_transport_factory.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/renderer_host/delegated_frame_host.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/content/browser/renderer_host/delegated_frame_host.h
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/ui/compositor/compositor.h
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/ui/compositor/test/context_factories_for_test.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/ui/compositor/test/in_process_context_factory.cc
[modify] https://crrev.com/bdcd14b7a1cf0d05c57ad7ea3a625d9e129de14e/ui/compositor/test/in_process_context_factory.h

Project Member

Comment 12 by bugdroid1@chromium.org, Jul 31

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ef3028fd9dc2efdedfa1fa863f3eefe64d23d670

commit ef3028fd9dc2efdedfa1fa863f3eefe64d23d670
Author: Saman Sami <samans@chromium.org>
Date: Tue Jul 31 16:02:39 2018

Fix missing OOPIFs after GPU crash

Always submit a CompositorFrame on OnFirstSurfaceActivation even if
the fallback doesn't change.

TBR=samans@chromium.org

(cherry picked from commit 757662261f21dc1f44f9258355bcec3271bc5a17)

Bug:  827242 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I560b5251a188eae2a83db384367e2201ffcce83b
Reviewed-on: https://chromium-review.googlesource.com/1152209
Commit-Queue: Saman Sami <samans@chromium.org>
Reviewed-by: kylechar <kylechar@chromium.org>
Reviewed-by: Fady Samuel <fsamuel@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#578624}
Reviewed-on: https://chromium-review.googlesource.com/1156846
Reviewed-by: Saman Sami <samans@chromium.org>
Cr-Commit-Position: refs/branch-heads/3497@{#275}
Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753}
[modify] https://crrev.com/ef3028fd9dc2efdedfa1fa863f3eefe64d23d670/cc/layers/surface_layer.cc
[modify] https://crrev.com/ef3028fd9dc2efdedfa1fa863f3eefe64d23d670/content/renderer/child_frame_compositing_helper.cc

Project Member

Comment 13 by bugdroid1@chromium.org, Jul 31

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/da8abf8b8fcf515488f0eb94ab7d68ca6a7e594a

commit da8abf8b8fcf515488f0eb94ab7d68ca6a7e594a
Author: Saman Sami <samans@chromium.org>
Date: Tue Jul 31 16:03:59 2018

Fix nullptr access in SurfaceLayer::SetFallbackSurfaceId

Bug: 868828, 827242 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I13199db914192ab81bbe6fd34b086b58fa275fa4
Reviewed-on: https://chromium-review.googlesource.com/1155005
Reviewed-by: Fady Samuel <fsamuel@chromium.org>
Commit-Queue: Saman Sami <samans@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#579114}(cherry picked from commit a1e9bc8bbbc83c785a3bfa4b5db1396475cd26ea)
Reviewed-on: https://chromium-review.googlesource.com/1156924
Reviewed-by: Saman Sami <samans@chromium.org>
Cr-Commit-Position: refs/branch-heads/3497@{#276}
Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753}
[modify] https://crrev.com/da8abf8b8fcf515488f0eb94ab7d68ca6a7e594a/cc/layers/surface_layer.cc

Status: Fixed (was: Available)
Owner: samans@chromium.org

Sign in to add a comment