"gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash" is flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, Aug 28

Issue description

"gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 3 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyigELEgVGbGFrZSJ_Z3B1X3Rlc3RzLmNvbnRleHRfbG9zdF9pbnRlZ3JhdGlvbl90ZXN0LkNvbnRleHRMb3N0SW50ZWdyYXRpb25UZXN0LkdwdUNyYXNoX0dQVVByb2Nlc3NDcmFzaGVzRXhhY3RseU9uY2VQZXJWaXNpdFRvQWJvdXRHcHVDcmFzaAw.

Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs

This flaky test/step was previously tracked in issue 861956.
 
Cc: bsalomon@chromium.org
Components: Internals>Services>Viz
Labels: OS-Mac
Owner: kbr@chromium.org
Status: Assigned (was: Untriaged)
There is a related fix in r576176 from issue 861956, linked above.

This has also been reported in Issue 863627 (merged into issue 861956) and Issue 823097.

The latest reports are mostly on Mac.

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/128054 is at r586576.

It looks like there is an intended crash coming from GpuChannelMsg_CrashForTesting, and a second crash coming from a DCHECK in
  ui::Compositor::DidFailToInitializeLayerTreeFrameSink()
under
  content::VizProcessTransportFactory::CreateLayerTreeFrameSink(base::WeakPtr<ui::Compositor>)

 Thread 0 (crashed)
   0  Chromium Framework!__ZN4base5debug13BreakDebuggerEv + 0x11
      rax = 0x00007f8692907a38   rdx = 0x00007f8692907a38
      rcx = 0x0000000000000015   rbx = 0x0000000000000015
      rsi = 0x000000000000025d   rdi = 0x0000000120fbe8b4
      rbp = 0x00007ffee6d52c10   rsp = 0x00007ffee6d52c10
       r8 = 0x00007f8692907a4d    r9 = 0x0000000000001bf1
      r10 = 0x00007f8692907a4d   r11 = 0x0000000118dd7880
      r12 = 0x00007f8690c00d30   r13 = 0x00007f8692907a4d
      r14 = 0x00007ffee6d53148   r15 = 0x00007ffee6d53140
      rip = 0x0000000118d95a51
      Found by: given as instruction pointer in context
   1  Chromium Framework!__ZN7logging10LogMessageD2Ev + 0x8dd
      rbp = 0x00007ffee6d53130   rsp = 0x00007ffee6d52c20
      rip = 0x0000000118c9afcd
      Found by: previous frame's frame pointer
   2  Chromium Framework!__ZN2ui10Compositor37DidFailToInitializeLayerTreeFrameSinkEv + 0x4c
      rbp = 0x00007ffee6d53270   rsp = 0x00007ffee6d53140
      rip = 0x000000011b388f8c
      Found by: previous frame's frame pointer
   3  Chromium Framework!__ZN2cc13LayerTreeHost37DidFailToInitializeLayerTreeFrameSinkEv + 0x9f
      rbp = 0x00007ffee6d533c0   rsp = 0x00007ffee6d53280
      rip = 0x000000011a76108f
      Found by: previous frame's frame pointer
   4  Chromium Framework!__ZN2cc17SingleThreadProxy21SetLayerTreeFrameSinkEPNS_18LayerTreeFrameSinkE + 0x240
      rbp = 0x00007ffee6d53530   rsp = 0x00007ffee6d533d0
      rip = 0x000000011a7c7740
      Found by: previous frame's frame pointer
   5  Chromium Framework!__ZN2cc13LayerTreeHost21SetLayerTreeFrameSinkENSt3__110unique_ptrINS_18LayerTreeFrameSinkENS1_14default_deleteIS3_EEEE + 0xc3
      rbp = 0x00007ffee6d536b0   rsp = 0x00007ffee6d53540
      rip = 0x000000011a760c13
      Found by: previous frame's frame pointer
   6  Chromium Framework!__ZN2ui10Compositor21SetLayerTreeFrameSinkENSt3__110unique_ptrIN2cc18LayerTreeFrameSinkENS1_14default_deleteIS4_EEEE + 0x33
      rbp = 0x00007ffee6d536e0   rsp = 0x00007ffee6d536c0
      rip = 0x000000011b386f33
      Found by: previous frame's frame pointer
   7  Chromium Framework!__ZN2ui25HostContextFactoryPrivate19ConfigureCompositorEN4base7WeakPtrINS_10CompositorEEE13scoped_refptrIN3viz15ContextProviderEES5_INS6_21RasterContextProviderEE + 0x5eb
      rbp = 0x00007ffee6d537f0   rsp = 0x00007ffee6d536f0
      rip = 0x0000000116bb08eb
      Found by: previous frame's frame pointer
   8  Chromium Framework!__ZN7content26VizProcessTransportFactory23OnEstablishedGpuChannelEN4base7WeakPtrIN2ui10CompositorEEE13scoped_refptrIN3gpu14GpuChannelHostEE + 0x7c
      rbp = 0x00007ffee6d53880   rsp = 0x00007ffee6d53800
      rip = 0x0000000116a555cc
      Found by: previous frame's frame pointer
   9  Chromium Framework!__ZN4base8internal13FunctorTraitsIMN7content26VizProcessTransportFactoryEFvNS_7WeakPtrIN2ui10CompositorEEE13scoped_refptrIN3gpu14GpuChannelHostEEEvE6InvokeISD_NS4_IS3_EEJS7_SB_EEEvT_OT0_DpOT1_ + 0xce
      rbp = 0x00007ffee6d53a00   rsp = 0x00007ffee6d53890
      rip = 0x0000000116a56c1e
      Found by: previous frame's frame pointer
  10  Chromium Framework!__ZN7content28BrowserGpuChannelHostFactory19EstablishGpuChannelEN4base12OnceCallbackIFv13scoped_refptrIN3gpu14GpuChannelHostEEEEE + 0x2bd
      rbp = 0x00007ffee6d53b80   rsp = 0x00007ffee6d53a10
      rip = 0x00000001165a276d
      Found by: previous frame's frame pointer
  11  Chromium Framework!__ZN7content26VizProcessTransportFactory24CreateLayerTreeFrameSinkEN4base7WeakPtrIN2ui10CompositorEEE + 0x184
      rbp = 0x00007ffee6d53d10   rsp = 0x00007ffee6d53b90
      rip = 0x0000000116a55534
      Found by: previous frame's frame pointer
  12  Chromium Framework!__ZN2ui10Compositor28RequestNewLayerTreeFrameSinkEv + 0xb0
      rbp = 0x00007ffee6d53e70   rsp = 0x00007ffee6d53d20
      rip = 0x000000011b388f20
      Found by: previous frame's frame pointer
  13  Chromium Framework!__ZN2cc17SingleThreadProxy28RequestNewLayerTreeFrameSinkEv + 0xcd
      rbp = 0x00007ffee6d53fc0   rsp = 0x00007ffee6d53e80
      rip = 0x000000011a7c745d
      Found by: previous frame's frame pointer

  Traceback (most recent call last):
    File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/testing/serially_executed_browser_test_case.py", line 214, in <lambda>
      return lambda self: based_method(self, *args)
    File "/b/s/w/ir/content/test/gpu/gpu_tests/gpu_integration_test.py", line 138, in _RunGpuTest
      self.RunActualGpuTest(url, *args)
    File "/b/s/w/ir/content/test/gpu/gpu_tests/context_lost_integration_test.py", line 102, in RunActualGpuTest
      getattr(self, test_name)(test_path)
    File "/b/s/w/ir/content/test/gpu/gpu_tests/context_lost_integration_test.py", line 228, in _GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash
      self._KillGPUProcess(2, True)
    File "/b/s/w/ir/content/test/gpu/gpu_tests/context_lost_integration_test.py", line 160, in _KillGPUProcess
      self._CheckCrashCount(tab, expected_kills)
    File "/b/s/w/ir/content/test/gpu/gpu_tests/context_lost_integration_test.py", line 183, in _CheckCrashCount
      system_info = tab.browser.GetSystemInfo()
    File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/browser/browser.py", line 276, in GetSystemInfo
      return self._browser_backend.GetSystemInfo()
    File "/b/s/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 52, in traced_function
      return func(*args, **kwargs)
    File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome/chrome_browser_backend.py", line 240, in GetSystemInfo
      raise exceptions.BrowserConnectionGoneException(self.browser, e)
  BrowserConnectionGoneException: [Errno 54] Connection reset by peer
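
To compare the crashing frames across the dumps pasted in this thread, it can help to reduce the stackwalk output to just the frame header lines. The following is a minimal sketch, assuming the frame layout shown above (`index  module!symbol + 0xoffset`); `Frame` and `parse_frames` are hypothetical helper names, not part of any Chromium or Breakpad tooling, and the Windows-style stacks use a different layout that this does not handle.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    """One frame header line from the stack dump (hypothetical helper type)."""
    index: int
    module: str
    symbol: str
    offset: int


# Matches frame header lines such as
#   " 2  Chromium Framework!__ZN2ui10Compositor...Ev + 0x4c"
# Register lines ("rbp = ...") and "Found by:" lines do not match.
FRAME_RE = re.compile(
    r"^\s*(?P<index>\d+)\s+(?P<module>[^!]+)!(?P<symbol>\S+)"
    r"\s+\+\s+0x(?P<offset>[0-9a-fA-F]+)\s*$"
)


def parse_frames(dump_text: str) -> List[Frame]:
    """Return the frames found in dump_text, in the order they appear."""
    frames = []
    for line in dump_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(Frame(
                index=int(match["index"]),
                module=match["module"].strip(),
                symbol=match["symbol"],
                offset=int(match["offset"], 16),
            ))
    return frames


# A few lines copied from the dump above.
SAMPLE = """\
 Thread 0 (crashed)
   0  Chromium Framework!__ZN4base5debug13BreakDebuggerEv + 0x11
      rax = 0x00007f8692907a38   rdx = 0x00007f8692907a38
      Found by: given as instruction pointer in context
   1  Chromium Framework!__ZN7logging10LogMessageD2Ev + 0x8dd
      rbp = 0x00007ffee6d53130   rsp = 0x00007ffee6d52c20
   2  Chromium Framework!__ZN2ui10Compositor37DidFailToInitializeLayerTreeFrameSinkEv + 0x4c
"""

for frame in parse_frames(SAMPLE):
    print(frame.index, frame.module, frame.symbol, hex(frame.offset))
```

The symbols stay mangled here; piping them through a demangler such as `c++filt` is a separate step.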


Cc: markusheintz@chromium.org ccameron@chromium.org
Labels: -Pri-1 Pri-0
Looking at https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng?limit=200 this is rejecting about 1 in 5 tryjobs, which is awful.

This needs a resolution.

(I'm checking in at 9pm since a dry run of one of my own CLs was affected.)

Skimming issue 861956, these tests seem to have their own mechanism for suppressing flakiness. markusheintz, maybe you want to give it a go?
Cc: flackr@chromium.org kbr@chromium.org danakj@chromium.org
Issue 878504 has been merged into this issue.
vikassoni@ noticed this also in Issue 878504. Suppressing the flake in https://chromium-review.googlesource.com/1194334 .

From Vikas' update on the other bug:

Most recent log:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29/96596

Log from when the test first failed:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29/96517

Snapshot of the stack trace:

FATAL:compositor.cc(605)] Check failed: false. 
0   Chromium Framework                  0x000000011de777dc base::debug::StackTrace::StackTrace(unsigned long) + 28
1   Chromium Framework                  0x000000011dd7baef logging::LogMessage::~LogMessage() + 223
2   Chromium Framework                  0x000000012046c32c ui::Compositor::DidFailToInitializeLayerTreeFrameSink() + 76
3   Chromium Framework                  0x000000011f84325f cc::LayerTreeHost::DidFailToInitializeLayerTreeFrameSink() + 159
4   Chromium Framework                  0x000000011f8a9910 cc::SingleThreadProxy::SetLayerTreeFrameSink(cc::LayerTreeFrameSink*) + 576
5   Chromium Framework                  0x000000011f842de3 cc::LayerTreeHost::SetLayerTreeFrameSink(std::__1::unique_ptr<cc::LayerTreeFrameSink, std::__1::default_delete<cc::LayerTreeFrameSink> >) + 195
6   Chromium Framework                  0x000000012046a2d3 ui::Compositor::SetLayerTreeFrameSink(std::__1::unique_ptr<cc::LayerTreeFrameSink, std::__1::default_delete<cc::LayerTreeFrameSink> >) + 51
7   Chromium Framework                  0x000000011bc9277b ui::HostContextFactoryPrivate::ConfigureCompositor(base::WeakPtr<ui::Compositor>, scoped_refptr<viz::ContextProvider>, scoped_refptr<viz::RasterContextProvider>) + 1515


It's also flaking on the Mac AMD bots:

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Retina%20Release%20%28AMD%29/38755
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Retina%20Release%20%28AMD%29/38754

There has been some regression in the compositor causing context loss to not be handled gracefully.

Project Member

Comment 5 by bugdroid1@chromium.org, Aug 28

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4f1b8df318b0b58d7f2dd6d29f4a4f019191cc35

commit 4f1b8df318b0b58d7f2dd6d29f4a4f019191cc35
Author: Vikas Soni <vikassoni@chromium.org>
Date: Tue Aug 28 20:59:40 2018

Mark a context_lost test flaky on Mac.

GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash
occasionally crashes in browser process due to some checks failing in
compositor.cc

No-Try: True
Bug: 878258
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Ideb74153dbf7772489042e5cf8ed0ae3d5dc2641
Reviewed-on: https://chromium-review.googlesource.com/1194334
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#586847}
[modify] https://crrev.com/4f1b8df318b0b58d7f2dd6d29f4a4f019191cc35/content/test/gpu/gpu_tests/context_lost_expectations.py
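
The change above marks the test as flaky on Mac only rather than disabling it everywhere. As a rough illustration of what such an expectation means to a test runner, here is a self-contained model in which a flaky-marked test gets a bounded number of retries on the listed platforms and passes if any attempt passes. `FlakyExpectation`, `run_with_retries`, `make_test`, and the retry budget of 2 are all assumptions made for this sketch; this is not the actual API or content of `context_lost_expectations.py`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class FlakyExpectation:
    """Hypothetical record: a test that may be retried on given platforms."""
    test_name: str
    platforms: frozenset
    bug: int
    max_retries: int = 2  # assumed retry budget, not taken from the CL


def run_with_retries(test: Callable[[], bool], name: str, platform: str,
                     expectations: List[FlakyExpectation]) -> bool:
    """Run test once, plus retries if a matching flaky expectation exists."""
    retries = 0
    for expectation in expectations:
        if expectation.test_name == name and platform in expectation.platforms:
            retries = expectation.max_retries
            break
    for _ in range(1 + retries):
        if test():
            return True
    return False


def make_test(outcomes: Iterable[bool]) -> Callable[[], bool]:
    """Build a fake test that returns the given outcomes one call at a time."""
    iterator = iter(outcomes)
    return lambda: next(iterator)


NAME = "GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash"
EXPECTATIONS = [FlakyExpectation(NAME, frozenset({"mac"}), bug=878258)]

# Fails once, then passes: tolerated on Mac, still a failure elsewhere.
print(run_with_retries(make_test([False, True]), NAME, "mac", EXPECTATIONS))
print(run_with_retries(make_test([False, True]), NAME, "win", EXPECTATIONS))
```

Under this model the suppression hides the Mac flake from the bots without fixing the browser crash, and it does nothing for other platforms.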

Cc: fsam...@chromium.org
Components: Internals>Compositing
Labels: -Type-Bug -Pri-0 Pri-1 Type-Bug-Regression
Owner: flackr@chromium.org
https://chromium-review.googlesource.com/1194334 has been merged so these flakes should be suppressed. Downgrading to P1 now, but the regression must still be tracked down and the root cause fixed ASAP (and the flaky test un-suppressed).

flackr@, could you please triage and dispatch this bug as appropriate?

To clarify: a code change some time in the past couple of days made this test flaky. These are the first two failing builds on these bots:

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29/96517
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Retina%20Release%20%28AMD%29/38670

This test was 100% reliable before that.

Detected 24 new flakes for test/step "gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyigELEgVGbGFrZSJ_Z3B1X3Rlc3RzLmNvbnRleHRfbG9zdF9pbnRlZ3JhdGlvbl90ZXN0LkNvbnRleHRMb3N0SW50ZWdyYXRpb25UZXN0LkdwdUNyYXNoX0dQVVByb2Nlc3NDcmFzaGVzRXhhY3RseU9uY2VQZXJWaXNpdFRvQWJvdXRHcHVDcmFzaAw. This message was posted automatically by the chromium-try-flakes app.
Sheriff ping. Can we disable this test? It has been flaky for a while now, and the guideline says to disable flaky tests within 30 minutes.
Owner: ccameron@chromium.org
Reassigning to ccameron@ for Mac. Looks like this flake is caused by Mac OOP-D.
The test hasn't flaked since https://chromium-review.googlesource.com/1194334 landed at r586847. The last reported flake according to chromium-try-flakes was https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/128812 which was run at r586841, i.e. before the suppression landed.

Labels: -Sheriff-Chromium
Removing from the Sheriff queue.
Blocking: 882103
This sounds a lot like an issue danakj@ fixed in https://chromium-review.googlesource.com/1219992 . See the stack trace from the crash in the failing shard:


  	Operating system: Mac OS X
  	                  10.13.6 17G65
  	CPU: amd64
  	     family 6 model 70 stepping 1
  	     8 CPUs
  	
  	GPU: UNKNOWN
  	
  	Crash reason:  EXC_BREAKPOINT / EXC_I386_BPT
  	Crash address: 0x103c2e014
  	Process uptime: 9 seconds
  	
  	Thread 0 (crashed)
  	 0  libbase.dylib!__ZN4base5debug13BreakDebuggerEv + 0x14
  	    rax = 0x0000000103ccba1c   rdx = 0x00007ffc8d92da38
  	    rcx = 0x0000000000000015   rbx = 0x00000001038dbd30
  	    rsi = 0x000000000000025d   rdi = 0x0000000103ccba1c
  	    rbp = 0x00007ffeeeab5dc0   rsp = 0x00007ffeeeab5dc0
  	     r8 = 0x00007ffc8d92da4d    r9 = 0x00000000000034b5
  	    r10 = 0x00007ffc8b500000   r11 = 0x0000000103c2e000
  	    r12 = 0x0000000800002e88   r13 = 0x0000000000000001
  	    r14 = 0x00007ffc8b6233a0   r15 = 0x0000000000000000
  	    rip = 0x0000000103c2e014
  	    Found by: given as instruction pointer in context
  	 1  libchrome_dll.dylib!__ZN7logging12_GLOBAL__N_126SilentRuntimeAssertHandlerEPKciN4base16BasicStringPieceINSt3__112basic_stringIcNS5_11char_traitsIcEENS5_9allocatorIcEEEEEESC_ + 0x24
  	    rbp = 0x00007ffeeeab5e00   rsp = 0x00007ffeeeab5dd0
  	    rip = 0x000000010e665c04
  	    Found by: previous frame's frame pointer
  	 2  libchrome_dll.dylib!__ZN4base8internal13FunctorTraitsIPFvPKciNS_16BasicStringPieceINSt3__112basic_stringIcNS5_11char_traitsIcEENS5_9allocatorIcEEEEEESC_EvE6InvokeIRKSE_JS3_iSC_SC_EEEvOT_DpOT0_ + 0xa1
  	    rbp = 0x00007ffeeeab5e90   rsp = 0x00007ffeeeab5e10
  	    rip = 0x000000010e665f41
  	    Found by: previous frame's frame pointer
  	 3  libchrome_dll.dylib!__ZN4base8internal12InvokeHelperILb0EvE8MakeItSoIRKPFvPKciNS_16BasicStringPieceINSt3__112basic_stringIcNS7_11char_traitsIcEENS7_9allocatorIcEEEEEESE_EJS5_iSE_SE_EEEvOT_DpOT0_ + 0x5d
  	    rbp = 0x00007ffeeeab5ef0   rsp = 0x00007ffeeeab5ea0
  	    rip = 0x000000010e665e8d
  	    Found by: previous frame's frame pointer
  	 4  libchrome_dll.dylib!__ZN4base8internal7InvokerINS0_9BindStateIPFvPKciNS_16BasicStringPieceINSt3__112basic_stringIcNS6_11char_traitsIcEENS6_9allocatorIcEEEEEESD_EJEEESE_E7RunImplIRKSF_RKNS6_5tupleIJEEEJEEEvOT_OT0_NS6_16integer_sequenceImJXspT1_EEEEOS4_OiOSD_SX_ + 0x61
  	    rbp = 0x00007ffeeeab5f60   rsp = 0x00007ffeeeab5f00
  	    rip = 0x000000010e665e21
  	    Found by: previous frame's frame pointer
  	 5  libchrome_dll.dylib!__ZN4base8internal7InvokerINS0_9BindStateIPFvPKciNS_16BasicStringPieceINSt3__112basic_stringIcNS6_11char_traitsIcEENS6_9allocatorIcEEEEEESD_EJEEESE_E3RunEPNS0_13BindStateBaseES4_iOSD_SK_ + 0x84
  	    rbp = 0x00007ffeeeab5fe0   rsp = 0x00007ffeeeab5f70
  	    rip = 0x000000010e665d14
  	    Found by: previous frame's frame pointer
  	 6  libbase.dylib!__ZNKR4base17RepeatingCallbackIFvPKciNS_16BasicStringPieceINSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEEEESB_EE3RunES2_iSB_SB_ + 0x9a
  	    rbp = 0x00007ffeeeab6060   rsp = 0x00007ffeeeab5ff0
  	    rip = 0x000000010387c9ea
  	    Found by: previous frame's frame pointer
  	 7  libbase.dylib!__ZN7logging10LogMessageD2Ev + 0x152b
  	    rbp = 0x00007ffeeeab6fa0   rsp = 0x00007ffeeeab6070
  	    rip = 0x000000010387c44b
  	    Found by: previous frame's frame pointer
  	 8  libbase.dylib!__ZN7logging10LogMessageD1Ev + 0x15
  	    rbp = 0x00007ffeeeab6fc0   rsp = 0x00007ffeeeab6fb0
  	    rip = 0x0000000103878ce5
  	    Found by: previous frame's frame pointer
  	 9  libcompositor.dylib!__ZN2ui10Compositor37DidFailToInitializeLayerTreeFrameSinkEv + 0x73
  	    rbp = 0x00007ffeeeab7110   rsp = 0x00007ffeeeab6fd0
  	    rip = 0x0000000143465e03
  	    Found by: previous frame's frame pointer
  	10  libcc.dylib!__ZN2cc13LayerTreeHost37DidFailToInitializeLayerTreeFrameSinkEv + 0x305
  	    rbp = 0x00007ffeeeab7380   rsp = 0x00007ffeeeab7120
  	    rip = 0x0000000133f0e095
  	    Found by: previous frame's frame pointer
  	11  libcc.dylib!__ZN2cc17SingleThreadProxy21SetLayerTreeFrameSinkEPNS_18LayerTreeFrameSinkE + 0x324
  	    rbp = 0x00007ffeeeab76a0   rsp = 0x00007ffeeeab7390
  	    rip = 0x0000000134092664
  	    Found by: previous frame's frame pointer
  	12  libcc.dylib!__ZN2cc13LayerTreeHost21SetLayerTreeFrameSinkENSt3__110unique_ptrINS_18LayerTreeFrameSinkENS1_14default_deleteIS3_EEEE + 0x4ef
  	    rbp = 0x00007ffeeeab7ad0   rsp = 0x00007ffeeeab76b0
  	    rip = 0x0000000133f0d4ef
  	    Found by: previous frame's frame pointer
  	13  libcompositor.dylib!__ZN2ui10Compositor21SetLayerTreeFrameSinkENSt3__110unique_ptrIN2cc18LayerTreeFrameSinkENS1_14default_deleteIS4_EEEE + 0x222
  	    rbp = 0x00007ffeeeab7cc0   rsp = 0x00007ffeeeab7ae0
  	    rip = 0x000000014345fbe2
  	    Found by: previous frame's frame pointer
  	14  libcontent.dylib!__ZN2ui25HostContextFactoryPrivate19ConfigureCompositorEN4base7WeakPtrINS_10CompositorEEE13scoped_refptrIN3viz15ContextProviderEES5_INS6_21RasterContextProviderEE + 0x19ea
  	    rbp = 0x00007ffeeeab8a30   rsp = 0x00007ffeeeab7cd0
  	    rip = 0x0000000126b6f37a
  	    Found by: previous frame's frame pointer
  	15  libcontent.dylib!__ZN7content26VizProcessTransportFactory23OnEstablishedGpuChannelEN4base7WeakPtrIN2ui10CompositorEEE13scoped_refptrIN3gpu14GpuChannelHostEE + 0x2a8
  	    rbp = 0x00007ffeeeab8b30   rsp = 0x00007ffeeeab8a40
  	    rip = 0x00000001260ae398
  	    Found by: previous frame's frame pointer
  	16  libcontent.dylib!__ZN4base8internal13FunctorTraitsIMN7content26VizProcessTransportFactoryEFvNS_7WeakPtrIN2ui10CompositorEEE13scoped_refptrIN3gpu14GpuChannelHostEEEvE6InvokeISD_NS4_IS3_EEJS7_SB_EEEvT_OT0_DpOT1_ + 0xd8
  	    rbp = 0x00007ffeeeab8bd0   rsp = 0x00007ffeeeab8b40
  	    rip = 0x00000001260b25d8
  	    Found by: previous frame's frame pointer
  	17  libcontent.dylib!__ZN4base8internal12InvokeHelperILb1EvE8MakeItSoIMN7content26VizProcessTransportFactoryEFvNS_7WeakPtrIN2ui10CompositorEEE13scoped_refptrIN3gpu14GpuChannelHostEEENS6_IS5_EEJS9_SD_EEEvOT_OT0_DpOT1_ + 0x85
  	    rbp = 0x00007ffeeeab8c40   rsp = 0x00007ffeeeab8be0
  	    rip = 0x00000001260b24c5
  	    Found by: previous frame's frame pointer
  	18  libcontent.dylib!__ZN4base8internal7InvokerINS0_9BindStateIMN7content26VizProcessTransportFactoryEFvNS_7WeakPtrIN2ui10CompositorEEE13scoped_refptrIN3gpu14GpuChannelHostEEEJNS5_IS4_EES8_EEEFvSC_EE7RunImplISE_NSt3__15tupleIJSF_S8_EEEJLm0ELm1EEEEvOT_OT0_NSK_16integer_sequenceImJXspT1_EEEEOSC_ + 0x8d
  	    rbp = 0x00007ffeeeab8cc0   rsp = 0x00007ffeeeab8c50
  	    rip = 0x00000001260b242d
  	    Found by: previous frame's frame pointer
  	19  libcontent.dylib!__ZN4base8internal7InvokerINS0_9BindStateIMN7content26VizProcessTransportFactoryEFvNS_7WeakPtrIN2ui10CompositorEEE13scoped_refptrIN3gpu14GpuChannelHostEEEJNS5_IS4_EES8_EEEFvSC_EE7RunOnceEPNS0_13BindStateBaseEOSC_ + 0x49
  	    rbp = 0x00007ffeeeab8d10   rsp = 0x00007ffeeeab8cd0
  	    rip = 0x00000001260b2329
  	    Found by: previous frame's frame pointer
  	20  libcontent.dylib!__ZNO4base12OnceCallbackIFv13scoped_refptrIN3gpu14GpuChannelHostEEEE3RunES4_ + 0x6f
  	    rbp = 0x00007ffeeeab8d60   rsp = 0x00007ffeeeab8d20
  	    rip = 0x0000000124f44c4f
  	    Found by: previous frame's frame pointer
  	21  libcontent.dylib!__ZN7content28BrowserGpuChannelHostFactory19EstablishGpuChannelEN4base12OnceCallbackIFv13scoped_refptrIN3gpu14GpuChannelHostEEEEE + 0x2e3
  	    rbp = 0x00007ffeeeab9170   rsp = 0x00007ffeeeab8d70
  	    rip = 0x0000000124f46363
  	    Found by: previous frame's frame pointer
  	22  libcontent.dylib!__ZN7content26VizProcessTransportFactory24CreateLayerTreeFrameSinkEN4base7WeakPtrIN2ui10CompositorEEE + 0xf1
  	    rbp = 0x00007ffeeeab91f0   rsp = 0x00007ffeeeab9180
  	    rip = 0x00000001260ae091
  	    Found by: previous frame's frame pointer
  	23  libcompositor.dylib!__ZN2ui10Compositor28RequestNewLayerTreeFrameSinkEv + 0x10e
  	    rbp = 0x00007ffeeeab9370   rsp = 0x00007ffeeeab9200
  	    rip = 0x0000000143465d4e
  	    Found by: previous frame's frame pointer
  	24  libcc.dylib!__ZN2cc13LayerTreeHost28RequestNewLayerTreeFrameSinkEv + 0x1a
  	    rbp = 0x00007ffeeeab9390   rsp = 0x00007ffeeeab9380
  	    rip = 0x0000000133f0da8a
  	    Found by: previous frame's frame pointer
  	25  libcc.dylib!__ZN2cc17SingleThreadProxy28RequestNewLayerTreeFrameSinkEv + 0xf9
  	    rbp = 0x00007ffeeeab94f0   rsp = 0x00007ffeeeab93a0
  	    rip = 0x0000000134092179
  	    Found by: previous frame's frame pointer
  	26  libcc.dylib!__ZN4base8internal13FunctorTraitsIMN2cc17SingleThreadProxyEFvvEvE6InvokeIS5_RKNS_7WeakPtrIS3_EEJEEEvT_OT0_DpOT1_ + 0x7f
  	    rbp = 0x00007ffeeeab9540   rsp = 0x00007ffeeeab9500
  	    rip = 0x000000013409ad7f
  	    Found by: previous frame's frame pointer
  	27  libcc.dylib!__ZN4base8internal12InvokeHelperILb1EvE8MakeItSoIRKMN2cc17SingleThreadProxyEFvvERKNS_7WeakPtrIS5_EEJEEEvOT_OT0_DpOT1_ + 0x5a
  	    rbp = 0x00007ffeeeab9580   rsp = 0x00007ffeeeab9550
  	    rip = 0x000000013409ac9a
  	    Found by: previous frame's frame pointer
  	28  libcc.dylib!__ZN4base8internal7InvokerINS0_9BindStateIMN2cc17SingleThreadProxyEFvvEJNS_7WeakPtrIS4_EEEEEFvvEE7RunImplIRKS6_RKNSt3__15tupleIJS8_EEEJLm0EEEEvOT_OT0_NSF_16integer_sequenceImJXspT1_EEEE + 0x50
  	    rbp = 0x00007ffeeeab95d0   rsp = 0x00007ffeeeab9590
  	    rip = 0x000000013409ac30
  	    Found by: previous frame's frame pointer
  	29  libcc.dylib!__ZN4base8internal7InvokerINS0_9BindStateIMN2cc17SingleThreadProxyEFvvEJNS_7WeakPtrIS4_EEEEEFvvEE3RunEPNS0_13BindStateBaseE + 0x2c
  	    rbp = 0x00007ffeeeab9600   rsp = 0x00007ffeeeab95e0
  	    rip = 0x000000013409ab6c
  	    Found by: previous frame's frame pointer
  	30  libcc.dylib!__ZNKR4base17RepeatingCallbackIFvvEE3RunEv + 0x3d
  	    rbp = 0x00007ffeeeab9630   rsp = 0x00007ffeeeab9610
  	    rip = 0x0000000133c98d2d
  	    Found by: previous frame's frame pointer
  	31  libcc.dylib!__ZN4base8internal22CancelableCallbackImplINS_17RepeatingCallbackIFvvEEEE16ForwardRepeatingIJEEEvDpT_ + 0x15
  	    rbp = 0x00007ffeeeab9650   rsp = 0x00007ffeeeab9640
  	    rip = 0x0000000133c98bc5
  	    Found by: previous frame's frame pointer
  	32  libcc.dylib!__ZN4base8internal13FunctorTraitsIMNS0_22CancelableCallbackImplINS_17RepeatingCallbackIFvvEEEEEFvvEvE6InvokeIS8_RKNS_7WeakPtrIS6_EEJEEEvT_OT0_DpOT1_ + 0x7f
  	    rbp = 0x00007ffeeeab96a0   rsp = 0x00007ffeeeab9660
  	    rip = 0x0000000133c98f7f
  	    Found by: previous frame's frame pointer
  	33  libcc.dylib!__ZN4base8internal12InvokeHelperILb1EvE8MakeItSoIRKMNS0_22CancelableCallbackImplINS_17RepeatingCallbackIFvvEEEEEFvvERKNS_7WeakPtrIS8_EEJEEEvOT_OT0_DpOT1_ + 0x5a
  	    rbp = 0x00007ffeeeab96e0   rsp = 0x00007ffeeeab96b0
  	    rip = 0x0000000133c98e9a
  	    Found by: previous frame's frame pointer
  	34  libcc.dylib!__ZN4base8internal7InvokerINS0_9BindStateIMNS0_22CancelableCallbackImplINS_17RepeatingCallbackIFvvEEEEEFvvEJNS_7WeakPtrIS7_EEEEES5_E7RunImplIRKS9_RKNSt3__15tupleIJSB_EEEJLm0EEEEvOT_OT0_NSH_16integer_sequenceImJXspT1_EEEE + 0x50
  	    rbp = 0x00007ffeeeab9730   rsp = 0x00007ffeeeab96f0
  	    rip = 0x0000000133c98e30
  	    Found by: previous frame's frame pointer
  	35  libcc.dylib!__ZN4base8internal7InvokerINS0_9BindStateIMNS0_22CancelableCallbackImplINS_17RepeatingCallbackIFvvEEEEEFvvEJNS_7WeakPtrIS7_EEEEES5_E3RunEPNS0_13BindStateBaseE + 0x2c
  	    rbp = 0x00007ffeeeab9760   rsp = 0x00007ffeeeab9740
  	    rip = 0x0000000133c98d6c
  	    Found by: previous frame's frame pointer
  	36  libaccelerated_widget_mac.dylib!__ZNO4base12OnceCallbackIFvvEE3RunEv + 0x5c
  	    rbp = 0x00007ffeeeab97a0   rsp = 0x00007ffeeeab9770
  	    rip = 0x0000000147328dbc
  	    Found by: previous frame's frame pointer
  	37  libaccelerated_widget_mac.dylib!__ZN2ui12_GLOBAL__N_111WrappedTask3RunEv + 0x41
  	    rbp = 0x00007ffeeeab97d0   rsp = 0x00007ffeeeab97b0
  	    rip = 0x0000000147327601
  	    Found by: previous frame's frame pointer
  	38  libaccelerated_widget_mac.dylib!__ZN4base8internal13FunctorTraitsIMN2ui12_GLOBAL__N_111WrappedTaskEFvvEvE6InvokeIS6_PS4_JEEEvT_OT0_DpOT1_ + 0x7d
  	    rbp = 0x00007ffeeeab9820   rsp = 0x00007ffeeeab97e0
  	    rip = 0x000000014732869d
  	    Found by: previous frame's frame pointer
  	39  libaccelerated_widget_mac.dylib!__ZN4base8internal12InvokeHelperILb0EvE8MakeItSoIRKMN2ui12_GLOBAL__N_111WrappedTaskEFvvEJPS6_EEEvOT_DpOT0_ + 0x44
  	    rbp = 0x00007ffeeeab9860   rsp = 0x00007ffeeeab9830
  	    rip = 0x00000001473285e4
  	    Found by: previous frame's frame pointer
  	40  libaccelerated_widget_mac.dylib!__ZN4base8internal7InvokerINS0_9BindStateIMN2ui12_GLOBAL__N_111WrappedTaskEFvvEJNS0_12OwnedWrapperIS5_EEEEEFvvEE7RunImplIRKS7_RKNSt3__15tupleIJS9_EEEJLm0EEEEvOT_OT0_NSG_16integer_sequenceImJXspT1_EEEE + 0x63
  	    rbp = 0x00007ffeeeab98c0   rsp = 0x00007ffeeeab9870
  	    rip = 0x0000000147328573
  	    Found by: previous frame's frame pointer
  	41  libaccelerated_widget_mac.dylib!__ZN4base8internal7InvokerINS0_9BindStateIMN2ui12_GLOBAL__N_111WrappedTaskEFvvEJNS0_12OwnedWrapperIS5_EEEEEFvvEE3RunEPNS0_13BindStateBaseE + 0x2c
  	    rbp = 0x00007ffeeeab98f0   rsp = 0x00007ffeeeab98d0
  	    rip = 0x000000014732846c
  	    Found by: previous frame's frame pointer
  	42  libbase.dylib!__ZNO4base12OnceCallbackIFvvEE3RunEv + 0x5c
  	    rbp = 0x00007ffeeeab9930   rsp = 0x00007ffeeeab9900
  	    rip = 0x00000001037b3f5c
  	    Found by: previous frame's frame pointer
  	43  libbase.dylib!__ZN4base5debug13TaskAnnotator7RunTaskEPKcPNS_11PendingTaskE + 0x409
  	    rbp = 0x00007ffeeeab9b10   rsp = 0x00007ffeeeab9940
  	    rip = 0x0000000103810339
  	    Found by: previous frame's frame pointer

Blocking: -882103
Blockedon: 882103
Project Member

Comment 17 by Findit, Oct 11

Flaky-Test: gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash
Labels: Type-Bug Test-Flaky Test-Findit-Detected Sheriff-Chromium

gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash is flaky.

Findit has detected 3 new flake occurrences of this test. List
of all flake occurrences can be found at:
https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyswELEgVGbGFrZSKnAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMuY29udGV4dF9sb3N0X2ludGVncmF0aW9uX3Rlc3QuQ29udGV4dExvc3RJbnRlZ3JhdGlvblRlc3QuR3B1Q3Jhc2hfR1BVUHJvY2Vzc0NyYXNoZXNFeGFjdGx5T25jZVBlclZpc2l0VG9BYm91dEdwdUNyYXNoDA.

Since this test is still flaky, this issue has been moved back onto the Sheriff
Bug Queue if it's not already there.

If the result above is wrong, please file a bug using this link:
https://bugs.chromium.org/p/chromium/issues/entry?status=Unconfirmed&labels=Pri-1,Test-Findit-Wrong&components=Tools%3ETest%3EFindit%3EFlakiness&summary=%5BFindit%5D%20Flake%20Detection%20-%20Wrong%20result%20for%20gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash&comment=Link%20to%20flake%20occurrences%3A%20https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyswELEgVGbGFrZSKnAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMuY29udGV4dF9sb3N0X2ludGVncmF0aW9uX3Rlc3QuQ29udGV4dExvc3RJbnRlZ3JhdGlvblRlc3QuR3B1Q3Jhc2hfR1BVUHJvY2Vzc0NyYXNoZXNFeGFjdGx5T25jZVBlclZpc2l0VG9BYm91dEdwdUNyYXNoDA

Automatically posted by the findit-for-me app (https://goo.gl/Ot9f7N).
It's now failing on Win. Should it also be marked flaky there?
Project Member

Comment 19 by Findit, Oct 12


gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash is flaky.

Findit has detected 5 new flake occurrences of this test. List
of all flake occurrences can be found at:
https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyswELEgVGbGFrZSKnAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMuY29udGV4dF9sb3N0X2ludGVncmF0aW9uX3Rlc3QuQ29udGV4dExvc3RJbnRlZ3JhdGlvblRlc3QuR3B1Q3Jhc2hfR1BVUHJvY2Vzc0NyYXNoZXNFeGFjdGx5T25jZVBlclZpc2l0VG9BYm91dEdwdUNyYXNoDA.

Since this test is still flaky, this issue has been moved back onto the Sheriff
Bug Queue if it's not already there.

If the result above is wrong, please file a bug using this link:
https://bugs.chromium.org/p/chromium/issues/entry?status=Unconfirmed&labels=Pri-1,Test-Findit-Wrong&components=Tools%3ETest%3EFindit%3EFlakiness&summary=%5BFindit%5D%20Flake%20Detection%20-%20Wrong%20result%20for%20gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash&comment=Link%20to%20flake%20occurrences%3A%20https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyswELEgVGbGFrZSKnAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMuY29udGV4dF9sb3N0X2ludGVncmF0aW9uX3Rlc3QuQ29udGV4dExvc3RJbnRlZ3JhdGlvblRlc3QuR3B1Q3Jhc2hfR1BVUHJvY2Vzc0NyYXNoZXNFeGFjdGx5T25jZVBlclZpc2l0VG9BYm91dEdwdUNyYXNoDA

Automatically posted by the findit-for-me app (https://goo.gl/Ot9f7N).
Sheriff here. Should we mark the test as flaky on Win?
Cc: jdarpinian@chromium.org
Components: Internals>Skia
Labels: Hotlist-PixelWrangler OS-Windows
Owner: bsalo...@google.com
The failures on Windows have a different root cause than the ones on Mac for which this bug was originally filed.

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/106046
https://chromium-swarm.appspot.com/task?id=4081918c5fdf9c10&refresh=10&show_raw=1

  	Last event: 588.18e8: Break instruction exception - code 80000003 (first/second chance not available)
  	  debugger time: Fri Oct 12 07:51:05.914 2018 (UTC - 7:00)
  	ChildEBP RetAddr  Args to Child              
  	085fde14 6b053e0d 6dd7a789 00000230 05268a93 chrome_child!base::debug::BreakDebugger+0xc
  	085fde34 6ab42693 051764a0 6dd7a789 00000230 chrome_child!?Run@?$Invoker@U?$BindState@P6AXPBDHV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@base@@1@Z$$V@internal@base@@$$A6AXPBDHV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@3@1@Z@internal@base@@SAXPAVBindStateBase@23@PBDH$$QAV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@3@2@Z+0x1f
  	085fe358 6aef006e 051f4d50 051f4d50 00000003 chrome_child!logging::LogMessage::~LogMessage+0x483
  	085fe438 6969c4f8 00000000 08029348 080187ec chrome_child!gpu::gles2::GLES2Implementation::DeleteShader+0x10e
  	085fe450 6b3c5ce6 080187ec 00000000 051f4d50 chrome_child!gpu::gles2::GLES2Interface::`vcall'{1400}'+0xc2
  	085fe534 6b3bd0f5 00c101d0 00000016 00c10000 chrome_child!GrGLGpu::createClearColorProgram+0x736
  	085fe578 6b3bcc7a 0805db64 3f800000 3f800000 chrome_child!GrGLGpu::clearColorAsDraw+0x25
  	085fe5bc 6b3c73d9 0805db64 ffffffff 051eccd0 chrome_child!GrGLGpu::clear+0x11a
  	085fe5d4 6b3cd721 0805db64 ffffffff 0805db40 chrome_child!GrGLGpuRTCommandBuffer::onClear+0x19
  	085fe5ec 6b3cb3d6 085fe6f8 085fe648 085fe620 chrome_child!GrClearOp::onExecute+0x51
  	085fe62c 6b3cb278 085fe6f8 00000000 0805d8e8 chrome_child!GrOp::execute+0xa6
  	085fe698 6b3ad0c6 085fe6f8 07f1b028 07f1b008 chrome_child!GrRenderTargetOpList::onExecute+0x3b8
  	085fe6bc 6b3ac778 00000000 00000001 085fe6f8 chrome_child!GrDrawingManager::executeOpLists+0x3a6
  	085ff498 6b3ad2b9 0804a8e0 00000000 00000000 chrome_child!GrDrawingManager::flush+0x768
  	085ff4bc 6adf8368 0804a8e0 00000000 00000000 chrome_child!GrDrawingManager::prepareSurfaceForExternalIO+0x99
  	085ff4fc 6ae2981a 00000000 00000000 085ff5a8 chrome_child!GrRenderTargetContext::prepareForExternalIO+0xf8
  	085ff518 692aef3c 00000000 00000000 085ff538 chrome_child!SkGpuDevice::flushAndSignalSemaphores+0x2a
  	085ff528 6aec0b22 051e1744 085ff6e8 085ff67c chrome_child!SkSurface::flush+0xc
  	085ff538 6c71a9bf 6abad8af 051e1744 08062440 chrome_child!viz::ClientResourceProvider::ScopedSkSurface::~ScopedSkSurface+0x12
  	085ff67c 6c71a27e 085ff6e8 07fc62ed 00000de1 chrome_child!cc::GpuRasterBufferProvider::PlaybackOnWorkerThread+0x63f
  	085ff720 6c6fe349 07fcdcb0 07fe2454 07fe2464 chrome_child!cc::GpuRasterBufferProvider::RasterBufferImpl::Playback+0xde
  	085ff878 6c4a53d9 051d7000 051d7024 07fe2400 chrome_child!std::list<std::pair<unsigned __int64 const ,std::vector<cc::DrawImage,std::allocator<cc::DrawImage> > >,std::allocator<std::pair<unsigned __int64 const ,std::vector<cc::DrawImage,std::allocator<cc::DrawImage> > > > >::erase+0x269
 

Brian, could you please look into this regression? Ganesh has to handle context loss gracefully. Thanks.

James, could you add a temporary flaky expectation for this test on Windows, to be removed when the fix for this in Skia is rolled forward into Chromium? Thanks.

Labels: -Sheriff-Chromium
I don't really understand how this is a Skia issue. Ganesh requires that GrContext::abandonContext be called if the context is lost, in which case we stop making GL calls. It looks like that didn't happen here, unless I'm missing something.
Project Member

Comment 24 by bugdroid1@chromium.org, Oct 12

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0a8d18e3a8a26a37ce6bc0bda923838b1ac3e74b

commit 0a8d18e3a8a26a37ce6bc0bda923838b1ac3e74b
Author: James Darpinian <jdarpinian@chromium.org>
Date: Fri Oct 12 21:26:20 2018

Temporarily mark test flaky until Ganesh fix is made.

GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash

TBR: kbr@chromium.org
Bug: 878258
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I5ffe0fc31010f48838288a09b434235aedccb49b
Reviewed-on: https://chromium-review.googlesource.com/c/1278812
Reviewed-by: James Darpinian <jdarpinian@chromium.org>
Commit-Queue: James Darpinian <jdarpinian@chromium.org>
Cr-Commit-Position: refs/heads/master@{#599359}
[modify] https://crrev.com/0a8d18e3a8a26a37ce6bc0bda923838b1ac3e74b/content/test/gpu/gpu_tests/context_lost_expectations.py

Cc: enne@chromium.org
Skia is flushing because a ScopedSkSurface is going out of scope. For this not to have crashed before I think one of these conditions must have been true:

1) ~ScopedSkSurface was happening before the context was lost
2) ~ScopedSkSurface was happening after both context lost and GrContext::abandonContext was called.
or
3) There was no work on GrContext queued when ~ScopedSkSurface happened because something else was flushing GrContext before the context was lost.

For one of these conditions to have changed and made this flaky indicates a Chrome change.

Cc: bsalo...@google.com
Components: Internals>GPU>Rasterization
Owner: enne@chromium.org
Chrome's command buffer is supposed to guarantee that even if the context is lost, GL calls made in the renderer process won't crash, just become no-ops and perhaps generate a CONTEXT_LOST GL error.

Unfortunately, since context loss happens asynchronously, it can happen between any two GL calls, though this isn't supposed to break anything. The loss is supposed to be detected eventually, and things should recover later.

I'm having a hard time finding where GLES2Implementation::DeleteShader and its callees are logging a message which causes the renderer process to crash. Can anyone see where that is happening? Does it actually look like this work is being done after the GLES2Implementation has been torn down?

enne, may I assign this to you? It sounds like it's more related to GPU rasterization than a problem in Skia.


Project Member

Comment 27 by Findit, Oct 13

Labels: Sheriff-Chromium

gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash is flaky.

Findit has detected 3 new flake occurrences of this test. List
of all flake occurrences can be found at:
https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyswELEgVGbGFrZSKnAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMuY29udGV4dF9sb3N0X2ludGVncmF0aW9uX3Rlc3QuQ29udGV4dExvc3RJbnRlZ3JhdGlvblRlc3QuR3B1Q3Jhc2hfR1BVUHJvY2Vzc0NyYXNoZXNFeGFjdGx5T25jZVBlclZpc2l0VG9BYm91dEdwdUNyYXNoDA.

Since this test is still flaky, this issue has been moved back onto the Sheriff
Bug Queue if it's not already there.

If the result above is wrong, please file a bug using this link:
https://bugs.chromium.org/p/chromium/issues/entry?status=Unconfirmed&labels=Pri-1,Test-Findit-Wrong&components=Tools%3ETest%3EFindit%3EFlakiness&summary=%5BFindit%5D%20Flake%20Detection%20-%20Wrong%20result%20for%20gpu_tests.context_lost_integration_test.ContextLostIntegrationTest.GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash&comment=Link%20to%20flake%20occurrences%3A%20https://findit-for-me.appspot.com/flake/occurrences?key=ag9zfmZpbmRpdC1mb3ItbWVyswELEgVGbGFrZSKnAWNocm9taXVtQHRlbGVtZXRyeV9ncHVfaW50ZWdyYXRpb25fdGVzdEBncHVfdGVzdHMuY29udGV4dF9sb3N0X2ludGVncmF0aW9uX3Rlc3QuQ29udGV4dExvc3RJbnRlZ3JhdGlvblRlc3QuR3B1Q3Jhc2hfR1BVUHJvY2Vzc0NyYXNoZXNFeGFjdGx5T25jZVBlclZpc2l0VG9BYm91dEdwdUNyYXNoDA

Automatically posted by the findit-for-me app (https://goo.gl/Ot9f7N).
The newly found flakes predate James's suppression.

Labels: -Sheriff-Chromium
Sure, I'll try to take a look when I can.
Cc: piman@chromium.org
Naively looking at the code, the only obvious crashes here (DCHECK is probably the log) would be if the shader id were 0 during DeleteShader.  The command buffer never generates a zero id (because these are all client side ids), however https://cs.chromium.org/chromium/src/third_party/skia/src/gpu/gl/builders/GrGLShaderStringBuilder.cpp?type=cs&sq=package:chromium&g=0&l=161 suspiciously looks like if program compilation fails due to context lost, then GrGLCompileAndAttachShader will return a shader id of zero.

The gl spec says that DeleteShader(0) just causes a gl error.  It seems a bit to me like a DCHECK is overblown here, and the GLES2Implementation::DeleteShaderHelper function will already throw a gl error when it can't find a zero id.

I wasn't able to repro this locally on a win nvidia machine, so this is just from reading the code.
Agreed that this DCHECK doesn't belong. The ES spec even says "DeleteShader will silently ignore the value zero.", so it shouldn't even raise a GL error.
This flaky test is making the ANGLE CQ unstable. We should suppress it temporarily. Also probably the long term fix is to remove more Chrome-specific tests from the ANGLE CQ.
My best guess is that that DCHECK was an attempt to catch potentially-incorrect, Chrome-internal code.

Which path should we take? Update:
https://cs.chromium.org/chromium/src/third_party/skia/src/gpu/gl/builders/GrGLShaderStringBuilder.cpp?type=cs&sq=package:chromium&g=0&l=160

to test the shader and not try to delete 0, or update:
https://cs.chromium.org/chromium/src/gpu/command_buffer/client/gles2_implementation_impl_autogen.h?type=cs&q=GLES2Implementation::DeleteShader&sq=package:chromium&g=0&l=560

and remove the DCHECK?

Given the intent of the DCHECK I have a slight preference to updating Ganesh.

We can update Ganesh sure, but we should follow the spec.
https://chromium-review.googlesource.com/c/chromium/src/+/1285317 removes the DCHECK, and makes it consistent with other ids (e.g. textures, buffers).
If there is interest, we could replace it by a DLOG, but I don't think I see value. Client code can use DCHECK before calling glDeleteShader if they want to catch this case.
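
A minimal sketch of the behaviour discussed above, written as a toy Python model rather than the real C++ client. `FakeShaderClient`, `create_shader`, and `delete_shader` are hypothetical names for illustration and are not Chromium's GLES2Implementation API. It assumes only what the comments state: ids are allocated client-side and are never zero, a failed compile during context loss can hand back id 0, and deleting 0 is silently ignored instead of hitting a DCHECK.

```python
class FakeShaderClient:
    """Toy stand-in for a GL client; NOT Chromium's GLES2Implementation."""

    def __init__(self):
        self._next_id = 1      # client-side ids start at 1, so 0 is never issued
        self._live = set()     # ids that have been created and not yet deleted
        self.errors = []       # recorded GL-style errors

    def create_shader(self, context_lost=False):
        # Assumption for illustration: a caller whose compile step fails
        # because the context was lost ends up holding id 0.
        if context_lost:
            return 0
        shader_id = self._next_id
        self._next_id += 1
        self._live.add(shader_id)
        return shader_id

    def delete_shader(self, shader_id):
        # Zero is silently ignored: no assertion, no error.
        if shader_id == 0:
            return
        # A non-zero id that was never issued (or was already deleted)
        # is reported as an error rather than crashing the process.
        if shader_id not in self._live:
            self.errors.append("INVALID_VALUE")
            return
        self._live.remove(shader_id)
```

A caller that wants to catch a zero id in its own code can still assert on the id before calling `delete_shader`, which matches the suggestion above that the check belongs in client code.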
Project Member

Comment 36 by bugdroid1@chromium.org, Oct 17

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/382ffc533cb64b0946adfbc617858a6203207d03

commit 382ffc533cb64b0946adfbc617858a6203207d03
Author: Adrienne Walker <enne@chromium.org>
Date: Wed Oct 17 17:54:55 2018

gpu: silently ignore deleting program, shader, sync 0

This is according to the gpu spec, and should fix a crash during context
lost.

Bug: 878258
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Icabb95055de11c7a743144288898428ecd36bc90
Reviewed-on: https://chromium-review.googlesource.com/c/1285317
Reviewed-by: Antoine Labour <piman@chromium.org>
Commit-Queue: enne <enne@chromium.org>
Cr-Commit-Position: refs/heads/master@{#600468}
[modify] https://crrev.com/382ffc533cb64b0946adfbc617858a6203207d03/gpu/command_buffer/build_cmd_buffer_lib.py
[modify] https://crrev.com/382ffc533cb64b0946adfbc617858a6203207d03/gpu/command_buffer/client/gles2_implementation_impl_autogen.h
[modify] https://crrev.com/382ffc533cb64b0946adfbc617858a6203207d03/gpu/command_buffer/client/gles2_implementation_unittest.cc

Project Member

Comment 37 by bugdroid1@chromium.org, Oct 17

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f16c4808cd102369384466b1ee4c73193dbf86b7

commit f16c4808cd102369384466b1ee4c73193dbf86b7
Author: Jamie Madill <jmadill@chromium.org>
Date: Wed Oct 17 18:35:19 2018

Upgrade context lost expectation to fail.

GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash

This test was so flaky it was failing on the ANGLE CQ. Also affects
Intel and possibly AMD.

Tbr: kbr@chromium.org
Bug: 878258
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I74685596e4162d1d8244ac48f60fcbd12319159d
Reviewed-on: https://chromium-review.googlesource.com/c/1286892
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>
Cr-Commit-Position: refs/heads/master@{#600489}
[modify] https://crrev.com/f16c4808cd102369384466b1ee4c73193dbf86b7/content/test/gpu/gpu_tests/context_lost_expectations.py

Project Member

Comment 38 by bugdroid1@chromium.org, Oct 18

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8bd7e58308cc50ef63e533e699c35fe11cc49582

commit 8bd7e58308cc50ef63e533e699c35fe11cc49582
Author: Adrienne Walker <enne@chromium.org>
Date: Thu Oct 18 17:36:36 2018

Reenable GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash

This was flaky, but should be fixed.

Bug: 878258
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I3e037e41339ddff9d46ec0a7ca7679064a3322e0
Reviewed-on: https://chromium-review.googlesource.com/c/1285526
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: enne <enne@chromium.org>
Cr-Commit-Position: refs/heads/master@{#600813}
[modify] https://crrev.com/8bd7e58308cc50ef63e533e699c35fe11cc49582/content/test/gpu/gpu_tests/context_lost_expectations.py

Project Member

Comment 39 by bugdroid1@chromium.org, Oct 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a337f08fa698a27fd20c104f61e1bfee81ba37fe

commit a337f08fa698a27fd20c104f61e1bfee81ba37fe
Author: Christian Dullweber <dullweber@chromium.org>
Date: Fri Oct 19 10:05:50 2018

Revert "Reenable GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash"

This reverts commit 8bd7e58308cc50ef63e533e699c35fe11cc49582.

Reason for revert: Still flaky :( https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/166516

Original change's description:
> Reenable GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash
> 
> This was flaky, but should be fixed.
> 
> Bug: 878258
> Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
> Change-Id: I3e037e41339ddff9d46ec0a7ca7679064a3322e0
> Reviewed-on: https://chromium-review.googlesource.com/c/1285526
> Reviewed-by: Kenneth Russell <kbr@chromium.org>
> Commit-Queue: enne <enne@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#600813}

TBR=kbr@chromium.org,enne@chromium.org

Change-Id: I0dd172bd7db86982f9b1bf49e2a57c059958d348
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: 878258
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Reviewed-on: https://chromium-review.googlesource.com/c/1290950
Reviewed-by: Christian Dullweber <dullweber@chromium.org>
Commit-Queue: Christian Dullweber <dullweber@chromium.org>
Cr-Commit-Position: refs/heads/master@{#601098}
[modify] https://crrev.com/a337f08fa698a27fd20c104f61e1bfee81ba37fe/content/test/gpu/gpu_tests/context_lost_expectations.py

The revert of that re-enable wouldn't have affected this try job. The try job failed on macOS, but the re-enable only affected the Windows platform. There was a previous flaky expectation for this test on macOS.

From the failing shard:

https://chromium-swarm.appspot.com/task?id=40a456d46556fe10&refresh=10&show_raw=1

It looks like this DCHECK is the reason the test failed:

[16090:775:1019/015314.549645:FATAL:compositor.cc(618)] Check failed: false. 
0   Chromium Framework                  0x000000011001ab9f base::debug::StackTrace::StackTrace(unsigned long) + 31
1   Chromium Framework                  0x000000010ff1b7af logging::LogMessage::~LogMessage() + 223
2   Chromium Framework                  0x000000011267556c ui::Compositor::DidFailToInitializeLayerTreeFrameSink() + 76
3   Chromium Framework                  0x0000000111a2bc9f cc::LayerTreeHost::DidFailToInitializeLayerTreeFrameSink() + 159
4   Chromium Framework                  0x0000000111a934b0 cc::SingleThreadProxy::SetLayerTreeFrameSink(cc::LayerTreeFrameSink*) + 576
5   Chromium Framework                  0x0000000111a2b89b cc::LayerTreeHost::SetLayerTreeFrameSink(std::__1::unique_ptr<cc::LayerTreeFrameSink, std::__1::default_delete<cc::LayerTreeFrameSink> >) + 251
6   Chromium Framework                  0x0000000112673503 ui::Compositor::SetLayerTreeFrameSink(std::__1::unique_ptr<cc::LayerTreeFrameSink, std::__1::default_delete<cc::LayerTreeFrameSink> >) + 51
7   Chromium Framework                  0x000000010d930dd4 ui::HostContextFactoryPrivate::ConfigureCompositor(ui::Compositor*, scoped_refptr<viz::ContextProvider>, scoped_refptr<viz::RasterContextProvider>) + 1508
8   Chromium Framework                  0x000000010d7bd6e4 content::VizProcessTransportFactory::OnEstablishedGpuChannel(base::WeakPtr<ui::Compositor>, scoped_refptr<gpu::GpuChannelHost>) + 100
9   Chromium Framework                  0x000000010d7bed8e void base::internal::FunctorTraits<void (content::VizProcessTransportFactory::*)(base::WeakPtr<ui::Compositor>, scoped_refptr<gpu::GpuChannelHost>), void>::Invoke<void (content::VizProcessTransportFactory::*)(base::WeakPtr<ui::Compositor>, scoped_refptr<gpu::GpuChannelHost>), base::WeakPtr<content::VizProcessTransportFactory>, base::WeakPtr<ui::Compositor>, scoped_refptr<gpu::GpuChannelHost> >(void (content::VizProcessTransportFactory::*)(base::WeakPtr<ui::Compositor>, scoped_refptr<gpu::GpuChannelHost>), base::WeakPtr<content::VizProcessTransportFactory>&&, base::WeakPtr<ui::Compositor>&&, scoped_refptr<gpu::GpuChannelHost>&&) + 206
10  Chromium Framework                  0x000000010d2ddbed content::BrowserGpuChannelHostFactory::EstablishGpuChannel(base::OnceCallback<void (scoped_refptr<gpu::GpuChannelHost>)>) + 701
11  Chromium Framework                  0x000000010d7bd666 content::VizProcessTransportFactory::CreateLayerTreeFrameSink(base::WeakPtr<ui::Compositor>) + 422
12  Chromium Framework                  0x0000000112675500 ui::Compositor::RequestNewLayerTreeFrameSink() + 176
13  Chromium Framework                  0x0000000111a931cd cc::SingleThreadProxy::RequestNewLayerTreeFrameSink() + 205
14  Chromium Framework                  0x0000000111a975f7 base::internal::Invoker<base::internal::BindState<void (cc::SingleThreadProxy::*)(), base::WeakPtr<cc::SingleThreadProxy> >, void ()>::Run(base::internal::BindStateBase*) + 183
15  Chromium Framework                  0x000000010c4a626f void base::internal::CancelableCallbackImpl<base::RepeatingCallback<void ()> >::ForwardRepeating<>() + 95
16  Chromium Framework                  0x000000010c4a6337 base::internal::Invoker<base::internal::BindState<void (base::internal::CancelableCallbackImpl<base::RepeatingCallback<void ()> >::*)(), base::WeakPtr<base::internal::CancelableCallbackImpl<base::RepeatingCallback<void ()> > > >, void ()>::Run(base::internal::BindStateBase*) + 183
17  Chromium Framework                  0x000000010d8d14ba base::OnceCallback<void ()>::Run() && + 106


Again, this sounds like the same issue that danakj@ fixed in https://chromium-review.googlesource.com/1219992 and  Issue 882103 . Do we know why this is still crashing?

It should be safe to re-land the re-enabling of this test on Windows. I'll try to do that.

Cc: kylec...@chromium.org
Project Member

Comment 42 by bugdroid1@chromium.org, Oct 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0d08508e73f47cde5cfb592e16b644a5bc7e9d44

commit 0d08508e73f47cde5cfb592e16b644a5bc7e9d44
Author: Kenneth Russell <kbr@chromium.org>
Date: Fri Oct 19 20:00:26 2018

Reland "Reenable GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash"

This reverts commit a337f08fa698a27fd20c104f61e1bfee81ba37fe.

Reason for revert: this CL only re-enables the test on Windows; the
failure seen was on macOS, which is a different platform and had a
preexisting flaky expectation for this test. Investigation will
continue on the bug into the flakiness on that platform.

Original change's description:
> Revert "Reenable GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash"
> 
> This reverts commit 8bd7e58308cc50ef63e533e699c35fe11cc49582.
> 
> Reason for revert: Still flaky :( https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/166516
> 
> Original change's description:
> > Reenable GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash
> > 
> > This was flaky, but should be fixed.
> > 
> > Bug: 878258
> > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
> > Change-Id: I3e037e41339ddff9d46ec0a7ca7679064a3322e0
> > Reviewed-on: https://chromium-review.googlesource.com/c/1285526
> > Reviewed-by: Kenneth Russell <kbr@chromium.org>
> > Commit-Queue: enne <enne@chromium.org>
> > Cr-Commit-Position: refs/heads/master@{#600813}
> 
> TBR=kbr@chromium.org,enne@chromium.org
> 
> Change-Id: I0dd172bd7db86982f9b1bf49e2a57c059958d348
> No-Presubmit: true
> No-Tree-Checks: true
> No-Try: true
> Bug: 878258
> Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
> Reviewed-on: https://chromium-review.googlesource.com/c/1290950
> Reviewed-by: Christian Dullweber <dullweber@chromium.org>
> Commit-Queue: Christian Dullweber <dullweber@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#601098}

TBR=kbr@chromium.org,enne@chromium.org,dullweber@chromium.org

Change-Id: If1ca9d7ae9ef2122150edfa9d965e0a9cd5c442a
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: 878258
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Reviewed-on: https://chromium-review.googlesource.com/c/1292431
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#601268}
[modify] https://crrev.com/0d08508e73f47cde5cfb592e16b644a5bc7e9d44/content/test/gpu/gpu_tests/context_lost_expectations.py

This stack I think means that we gave a FrameSink to the compositor that could not be initialized. It's supposed to be initialized already since it's all one thread for the UI. Probably an OOPD issue?
OOP-D was enabled via fieldtrial_testing_config.json for Mac on August 27th, see https://crrev.com/c/1191122, so the timeline is right.
Blocking: 849302
Owner: fsam...@chromium.org
Thanks Dana and Kyle for figuring out the proximate root cause.

Fady, could you please take this since that change seems to have caused this instability? Or reassign to a more appropriate engineer? Thanks.

Owner: kylec...@chromium.org
I think kylechar@ or backer@ are more knowledgeable here.
kylechar@ could you please look into this and make some progress? This test stresses context loss handling and it's crucial to make it reliable again. Thanks.

Another recent trybot failure:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/205513
https://chromium-swarm.appspot.com/task?id=41b7718ea7fcbb10&refresh=10&show_raw=1

Same stack trace as above.

This is affecting other Chromium developers' productivity. The flaky suppression hasn't worked; it looks like due to timing issues, the test fails three times in a row.

Status: Started (was: Assigned)
Sorry, didn't see this was reassigned to me. I'll try to reproduce it, but I had a conversation with danakj@ a while ago. If I remember correctly, the issue is that ui::Compositor doesn't handle LayerTreeFrameSink::BindToClient() failing correctly. Either it shouldn't fail in the browser or ui::Compositor needs to handle the failure gracefully.
Right, it shouldn't fail. The GL context is bound before being given to the compositor. Anything else that can fail should be done ahead too.
I wasn't able to reproduce it locally but I think I know what's happening after looking at the code. We check that the worker context hasn't been lost at [1], create the AsyncLayerTreeFrameSink and pass it to ui::Compositor. The ui::Compositor LayerTreeHost calls LayerTreeFrameSink::BindToClient() which checks if the worker context is lost again at [2]. It is lost the second time we check it which means BindToClient() fails.

This is the actual problem: we can't guarantee that the context hasn't been lost between those two checks. The reason it happens frequently is that, with OOP-D, the AsyncLayerTreeFrameSink has two different message pipes to the GPU process: mojom::CompositorFrameSink and the GPU channel. When the GPU process dies, the mojom::CompositorFrameSink sees the connection error to the GPU process first and triggers the AsyncLayerTreeFrameSink context loss code. GpuChannelHost hasn't seen its connection error yet, so new context providers get created using the existing GPU channel (for a dead GPU process). A new AsyncLayerTreeFrameSink is created and given those context providers, which is then given to ui::Compositor, and LTFS::BindToClient() gets called. If GpuChannelHost sees the connection error at the right time, then the worker context will be lost at the second check.

We can handle context loss / GPU process restart in a smarter way with OOP-D to avoid this situation. We do a bunch of wasted work creating new LTFSs with dead context providers.

[1] https://cs.chromium.org/chromium/src/content/browser/compositor/viz_process_transport_factory.cc?l=411&rcl=163958795fc8e798f0959fd41628709f263898c7
[2] https://cs.chromium.org/chromium/src/cc/trees/layer_tree_frame_sink.cc?l=85&rcl=2975f65cb50278f165ad56bcf15afbaf77c15dcd
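
The two checks described above form a check-then-use race. A rough sketch of it as a toy Python model follows; `WorkerContext`, `bind_to_client`, `create_and_bind`, and the `between_checks` hook are hypothetical names for illustration, not Chromium code. The hook stands in for the GPU channel connection error arriving at an arbitrary moment.

```python
class WorkerContext:
    """Toy worker context that only tracks whether it has been lost."""

    def __init__(self):
        self.lost = False


def bind_to_client(ctx):
    # Models the second check (the one at [2]): binding fails if the
    # worker context is already lost.
    return not ctx.lost


def create_and_bind(ctx, between_checks=lambda: None):
    # Models the first check (the one at [1]): bail out early if the
    # worker context is already lost.
    if ctx.lost:
        return "retry"
    # Nothing prevents the context from being lost here, after the first
    # check passed and before the second one runs.
    between_checks()
    if not bind_to_client(ctx):
        return "bind failed"
    return "bound"
```

Passing a hook that sets `ctx.lost = True` reproduces the failing interleaving deterministically: the first check passes, and the bind then fails.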
Looks legit. There's no reason for LTFS to check the worker context there as long as it will hear about the loss and respond to it from the observer in another call stack, the same way it would for the compositor context.
 Issue 919987  has been merged into this issue.
Could this bug please be prioritized? Flakes in these tests continue to be noticed by pixel wranglers and sheriffs; see  Issue 919987  for an example.

Note that in  issue 919987  I've seen ui::Compositor::DidFailToInitializeLayerTreeFrameSink() in GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash in builds
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20FYI%20Retina%20Debug%20%28NVIDIA%29/5113 and
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20FYI%20Retina%20Debug%20%28NVIDIA%29/5116

However, in
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20FYI%20Retina%20Debug%20%28NVIDIA%29/5097 it happened either in the first test ContextLost_WebGLBlockedAfterJSNavigation or even before any test is run.

I've got https://crrev.com/c/1403323 which implements the solution suggested by danakj@. I've never actually been able to reproduce the failure locally, so I'll try running the optional Mac GPU test bot to see if it is still flaky.

I was unable to come up with a good way to solve the race around recreating browser AsyncLayerTreeFrameSinks. If https://crrev.com/c/1403323 works then it's probably not necessary.

Comment 58 by kylec...@chromium.org, Jan 17 (5 days ago)

Cc: sunn...@chromium.org
Project Member

Comment 59 by bugdroid1@chromium.org, Today (13 hours ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f7f20da9aff457b4ca6e42f670e0e30adc2db38d

commit f7f20da9aff457b4ca6e42f670e0e30adc2db38d
Author: kylechar <kylechar@chromium.org>
Date: Tue Jan 22 20:21:47 2019

Fix race binding LayerTreeFrameSink to client.

ui::Compositor expects that calling LayerTreeFrameSink::BindToClient()
will always be successful. However, BindToClient() can fail if the worker
context provider has encountered a GL error. Even if we check the worker
context provider hasn't encountered an error before passing it to
ui::Compositor, it's possible the error happens after the check but
before BindToClient() is called.

GpuCrash_GPUProcessCrashesExactlyOncePerVisitToAboutGpuCrash is failing
flakily on mac due to this. With OOP-D there are multiple message pipes
between the browser and GPU process which all get notified of the GPU
process crashing. This sets up the perfect conditions for the race to
occur.

Stop checking if the worker context provider has been lost in
BindToCurrentThread(). Instead, ensure that observers will always get
the OnContextLost() call even if AddObserver() was called after context
is lost.

We make OnContextLost() call happens in a new callstack to avoid
re-entrancy. This should be safe because the posted task has a reference
to context provider and we check that the observer is still observing
in the posted task.

Bug: 878258
Change-Id: If0db2fead55f86d86892db7a5dc257154590fe98
Reviewed-on: https://chromium-review.googlesource.com/c/1403323
Reviewed-by: Eric Karl <ericrk@chromium.org>
Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org>
Reviewed-by: danakj <danakj@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: kylechar <kylechar@chromium.org>
Cr-Commit-Position: refs/heads/master@{#624899}
[modify] https://crrev.com/f7f20da9aff457b4ca6e42f670e0e30adc2db38d/cc/trees/layer_tree_frame_sink.cc
[modify] https://crrev.com/f7f20da9aff457b4ca6e42f670e0e30adc2db38d/content/browser/renderer_host/compositor_impl_android.cc
[modify] https://crrev.com/f7f20da9aff457b4ca6e42f670e0e30adc2db38d/content/test/gpu/gpu_tests/context_lost_expectations.py
[modify] https://crrev.com/f7f20da9aff457b4ca6e42f670e0e30adc2db38d/ui/compositor/compositor.cc
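
The approach in the commit message above can be sketched as a toy Python model. This is an illustration under stated assumptions, not the code that landed: `TaskQueue`, `ContextProvider`, `Recorder`, and every method name here are hypothetical. It shows only the two properties the message describes: an observer added after the context is lost still gets an OnContextLost-style call, and that call is delivered from a posted task, which first checks that the observer is still registered.

```python
from collections import deque


class TaskQueue:
    """Minimal single-threaded task queue standing in for a task runner."""

    def __init__(self):
        self._tasks = deque()

    def post(self, task):
        self._tasks.append(task)

    def run_until_idle(self):
        while self._tasks:
            self._tasks.popleft()()


class ContextProvider:
    """Toy context provider; NOT the viz::ContextProvider interface."""

    def __init__(self, task_queue):
        self._queue = task_queue
        self._observers = []
        self._lost = False

    def add_observer(self, observer):
        self._observers.append(observer)
        if self._lost:
            # Late observer: notify from a fresh task, never re-entrantly
            # from inside add_observer().
            self._queue.post(lambda: self._notify_if_observing(observer))

    def remove_observer(self, observer):
        self._observers.remove(observer)

    def lose_context(self):
        if self._lost:
            return
        self._lost = True
        for observer in list(self._observers):
            observer.on_context_lost()

    def _notify_if_observing(self, observer):
        # The observer may have been removed before the task ran.
        if observer in self._observers:
            observer.on_context_lost()


class Recorder:
    """Observer that counts how many loss notifications it received."""

    def __init__(self):
        self.calls = 0

    def on_context_lost(self):
        self.calls += 1
```

With this shape, the binding step no longer needs its own lost-context check: a sink bound to an already-lost context simply hears about the loss on the next turn of the task queue.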
