context_lost_tests failed with FATAL:gpu_raster_buffer_provider.cc(192)] Check failed: sync_token.HasData()
Issue description

https://build.chromium.org/p/chromium.gpu/builders/Mac%2010.10%20Retina%20Release%20%28AMD%29/builds/9301

GpuCrash.GPUProcessCrashesExactlyOnce failed with:

[42461:21763:0606/201918:FATAL:gpu_raster_buffer_provider.cc(192)] Check failed: sync_token.HasData().
0 Chromium Framework 0x00000001058edb73 _ZN4base5debug10StackTraceC1Ev + 19
1 Chromium Framework 0x000000010590c357 _ZN7logging10LogMessageD2Ev + 71
2 Chromium Framework 0x0000000106839a6b _ZN2cc23GpuRasterBufferProvider22PlaybackOnWorkerThreadEPNS_16ResourceProvider17ScopedWriteLockGLERKN3gpu9SyncTokenEbPKNS_12RasterSourceERKN3gfx4RectESE_yfRKNS8_16PlaybackSettingsE + 235
3 Chromium Framework 0x000000010683988c _ZN2cc23GpuRasterBufferProvider16RasterBufferImpl8PlaybackEPKNS_12RasterSourceERKN3gfx4RectES8_yfRKNS2_16PlaybackSettingsE + 108
4 Chromium Framework 0x0000000106889e1a _ZN2cc12_GLOBAL__N_114RasterTaskImpl17RunOnWorkerThreadEv + 426
5 Chromium Framework 0x000000010a91f469 _ZN7content21CategorizedWorkerPool33RunTaskInCategoryWithLockAcquiredEN2cc12TaskCategoryE + 137
6 Chromium Framework 0x000000010a91e53c _ZN7content21CategorizedWorkerPool3RunERKNSt3__16vectorIN2cc12TaskCategoryENS1_9allocatorIS4_EEEEPN4base17ConditionVariableE + 156
7 Chromium Framework 0x00000001059638dd _ZN4base12SimpleThread10ThreadMainEv + 125
8 Chromium Framework 0x000000010595f578 _ZN4base12_GLOBAL__N_110ThreadFuncEPv + 104
9 libsystem_pthread.dylib 0x00007fff8d20905a _pthread_body + 131
10 libsystem_pthread.dylib 0x00007fff8d208fd7 _pthread_body + 0
11 libsystem_pthread.dylib 0x00007fff8d2063ed thread_start + 13

This is GPU rasterization, correct? The code needs to be made more robust to lost contexts. Unclear how often this is happening at this point. I only saw one instance in 200 runs on this machine.
,
Jun 7 2016
To clarify: the crash in #1 is slightly different: [4052:1676:0607/090944:FATAL:one_copy_raster_buffer_provider.cc(243)] Check failed: sync_token.HasData(). Full stdout attached.
,
Jun 7 2016
After discussion with ericrk@, raising to P1 because it's showing up often enough on the waterfalls. Not sure whether this is affecting the CQ, but it's likely.
,
Jun 7 2016
After discussion with ericrk@ it looks like https://codereview.chromium.org/1951193002/ was the cause of these crashes. This was already reverted in https://codereview.chromium.org/2046033002/ . The build above, https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Debug%20%28New%20Intel%29/builds/511 , did not contain the revert, but the next job, https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Debug%20%28New%20Intel%29/builds/512 , does. Closing as WontFix. Let's be careful to make sure these sorts of flaky failures are sorted out before re-landing.
,
Jun 8 2016
,
Jun 9 2016
Re-opening this so that I can keep track of it. I haven't been able to reproduce this on my macbook despite hundreds of runs of the test. I've stared at the code for too long and can't see how this bug could ever happen. +piman@
,
Jun 9 2016
I suspect what happens is that GenUnverifiedSyncTokenCHROMIUM fails because IsFenceSyncFlushed fails because the channel is lost (like most lost context things, this is fundamentally racy hence the flakiness). So *RasterBufferProvider::OrderingBarrier doesn't properly generate a token and the assert triggers on the worker thread when we want to wait for it. Note that the condition doesn't cause actual problems in prod (the channel is lost anyway), so it may just be a matter of fine-tuning checks. My suggestion is maybe to abort tasks if the token is invalid in OrderingBarrier (after DCHECK'ing the context is actually lost) - no point in running the worker threads if the GPU process is gone, might as well abort early and go through recovery as normal.
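To make the suggestion above concrete, here is a minimal standalone sketch of the "abort early if the token is empty" idea. It is not the actual cc:: code: SyncToken, FakeContext, RasterTask and OrderingBarrierAndSchedule are hypothetical stand-ins invented for illustration, and the real OrderingBarrier signature and scheduling path are not reproduced here. The only behavior taken from the discussion is that token generation can yield an empty token once the channel is lost, and that tasks should then not be scheduled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for gpu::SyncToken: empty until a fence is recorded.
struct SyncToken {
  uint64_t release_count = 0;
  bool HasData() const { return release_count != 0; }
};

// Hypothetical stand-in for the GL context / GPU channel.
class FakeContext {
 public:
  void LoseContext() { lost_ = true; }
  bool IsLost() const { return lost_; }

  // Models token generation failing once the channel is gone:
  // the returned token stays empty in that case.
  SyncToken GenUnverifiedSyncToken() {
    SyncToken token;
    if (!lost_)
      token.release_count = ++next_release_;
    return token;
  }

 private:
  bool lost_ = false;
  uint64_t next_release_ = 0;
};

// A raster task that waits on the token before playing back.
using RasterTask = std::function<void(const SyncToken&)>;

// Generates the token and only runs the tasks if it is valid.
// Returns the number of tasks that were run.
std::size_t OrderingBarrierAndSchedule(FakeContext& context,
                                       const std::vector<RasterTask>& tasks) {
  SyncToken token = context.GenUnverifiedSyncToken();
  if (!token.HasData()) {
    // An empty token is only expected when the context is lost
    // (stand-in for the DCHECK mentioned above).
    assert(context.IsLost());
    return 0;  // Abort early; normal lost-context recovery takes over.
  }
  for (const auto& task : tasks)
    task(token);  // Workers can now safely check token.HasData().
  return tasks.size();
}
```

With this shape the worker-side check on sync_token.HasData() can stay as it is, because a task never sees an empty token; the lost-context case is handled once, at the point where the token is generated.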
,
Jun 16 2016
Relanded the CL (https://codereview.chromium.org/1951193002/) - we prevent this DCHECK from triggering by never scheduling tasks if we detect that the context is lost while generating the sync token (RasterBufferProvider::OrderingBarrier).
Comment 1 by kbr@chromium.org
, Jun 7 2016