glGenSyncToken Context Lost on Large Multi-Canvas Resize
Reported by
mattsmcc...@gmail.com,
Oct 25 2016
Issue description
UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2900.0 Safari/537.36
Steps to reproduce the problem:
1. Open the html page. There should be 2 grey canvases, each with a red webgl-drawn triangle moving about the container.
2. The containers change size in a 100ms interval by 10px-by-10px increments between 2000px-by-2000px and 3000px-by-3000px.
3. After a minute or two on my system, the page will crash.
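The resize schedule in step 2 can be sketched roughly as follows. This is a hypothetical reconstruction, not the attached test page: the names `nextSize` and `startResizing` are illustrative, and setting the canvas dimensions directly (rather than resizing a container element) is an assumption.

```javascript
// Hypothetical reconstruction of the resize schedule described in step 2.
// The names below are illustrative and are not taken from the attached page.
const MIN_SIZE = 2000; // px
const MAX_SIZE = 3000; // px
const STEP = 10;       // px per tick

// Returns the next { size, direction } pair, reversing at either bound.
function nextSize(size, direction) {
  let next = size + direction * STEP;
  if (next > MAX_SIZE || next < MIN_SIZE) {
    direction = -direction;
    next = size + direction * STEP;
  }
  return { size: next, direction };
}

// Browser-only part: resize every canvas on a fixed 100 ms timer.
// Assumption: the canvases are resized directly via width/height.
function startResizing(canvases) {
  let state = { size: MIN_SIZE, direction: 1 };
  return setInterval(() => {
    state = nextSize(state.size, state.direction);
    for (const canvas of canvases) {
      canvas.width = state.size;
      canvas.height = state.size;
    }
  }, 100);
}
```

Each assignment to `canvas.width` or `canvas.height` reallocates the WebGL drawing buffer, so with two canvases this loop requests on the order of twenty multi-megapixel reallocations per second.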
What is the expected behavior?
Canvases will resize indefinitely without losing their WebGL contexts.
What went wrong?
When the page crashes a glGenSyncTokenCHROMIUM warning is raised, followed by a CONTEXT_LOST_WEBGL warning for each canvas.
After reloading the browser the "Rats! WebGL hit a snag." info bar is displayed, and all calls to <canvas>.getContext('webgl') return null until the "Rats!" Reload button is selected.
Did this work before? No
Does this work in other browsers? Yes
Chrome version: 56.0.2900.0 Channel: canary
OS Version: 10.0
Flash Version: Shockwave Flash 23.0 r0
* The 1st line of the attached console log tells me that on the most recent interval, I attempted to set the canvas size to 2770px-by-2770px, but the canvas was instead resized from 2769px-by-2769px to 0px-by-0px.
* When performing this test on an empty or single canvas, I only get the context lost warning, not the genSyncToken warning.
* The use of the term sync leads me to believe there is some threading race condition check that is not being met, rather than a resource overflow exception.
* The chrome://gpu page for my test environment can be found here: https://gist.github.com/mmccartn/87d6ac9001fb5bd8d0372e44e05b855d
,
Oct 25 2016
I wonder if there's a recently-introduced resource leak in the communication of the DrawingBuffer's texture to the compositor. The glGenSyncTokenCHROMIUM warning/error is happening because the GPU process is crashing, and some code in the renderer process isn't responding well to that.
,
Oct 25 2016
Actually, I'm able to reproduce it on stable 54.0.2840.71 as well.
,
Oct 26 2016
Stable Windows 48.0.2564.109 (64-bit) reproduces the same issue, except the glGenSyncTokenCHROMIUM warning is missing. Thanks for confirming my suspicion that the GPU process is crashing. When the page crashes, I see that the memory usage of the GPU process immediately drops to 0, but it retains a PID, which led me to believe it was still alive in some sense.
,
Nov 11 2016
Do you need any more info to help get this assigned?
,
Nov 14 2016
Another version of the test case which uses requestAnimationFrame instead of setInterval would be helpful. We are swamped and haven't had time to investigate this yet, but the fact that this animates on a fixed time interval, rather than allowing the browser to provide feedback if it can't keep up, is a red flag.
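A minimal sketch of what such a requestAnimationFrame-driven variant might look like. `resizeStep` is a hypothetical callback that applies one resize, and the `raf` parameter is an assumption added so the loop can be driven by a stub outside a browser; neither comes from the attached test case.

```javascript
// Sketch only: `resizeStep` is a hypothetical callback that applies one resize.
// `raf` defaults to the browser's requestAnimationFrame; it is a parameter so
// the loop can be exercised with a stub scheduler.
function startRafResizing(
  resizeStep,
  intervalMs = 100,
  raf = (cb) => requestAnimationFrame(cb)
) {
  let last = null;
  let stopped = false;

  function frame(now) {
    if (stopped) return;
    if (last === null) last = now;
    // Resize at most once per frame, and only once the interval has elapsed.
    if (now - last >= intervalMs) {
      last = now;
      resizeStep();
    }
    raf(frame);
  }

  raf(frame);
  // Returns a function that stops the loop on the next frame.
  return () => { stopped = true; };
}
```

Because the next resize is only scheduled from inside a frame callback, the browser can slow the loop down when it falls behind, which a fixed `setInterval` timer does not allow.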
,
Nov 15 2016
Xida, do you think you could take a look at this? I have a feeling it might be related to recent work in DrawingBuffer, and might indicate a leak of GPU resources.
,
Nov 16 2016
Investigating.
,
Nov 16 2016
Some initial investigation: I ran the test page on Linux ToT, and it crashes within 5 seconds. In the task manager the GPU process uses 60MB and the renderer process uses 18MB, so it should not be an OOM crash. Ken, Kai, could either of you confirm this behavior? I then tried to find a good build to compare against. I went back to Chrome 49.0.2574.0 and it behaves the same way (crashes in about 5 seconds). I will make a debug build and attach a stack trace to see what's going on.
,
Nov 16 2016
It doesn't crash when I attach a debugger to it. The GPU process is at ~700 for 10 mins now and there is no crash...
,
Nov 16 2016
A few more updates: I observed the console output on both ToT and Chrome 49, and they seem to behave the same way. Both canvases' sizes were set to 0*0 at the time of the crash, and when we refresh the page, the gl instance is null.
,
Nov 16 2016
OK, I think I know what's going on. When running the html file, the function resizeContainers gets triggered after 1000ms. Somewhere during that function there is a WebGL context lost event (indicated by the console output), and then gl.drawingBufferWidth and gl.drawingBufferHeight become 0. However, the animate() function just keeps running even though gl is already null, and then it crashes. When the page is refreshed, the code path goes into here: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/modules/webgl/WebGLRenderingContextBase.cpp?sq=package:chromium&rcl=1479284732&l=656 and the gl context is never successfully created. Ken, does the above analysis make sense? This doesn't seem to be related to any recent change, as I can repro it with Chrome 49. I think the reason is that the WebGL context lost event is not handled properly in the script.
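For reference, a minimal sketch of how a script might handle context loss, assuming the standard `webglcontextlost` and `webglcontextrestored` canvas events. `draw` and `initResources` are hypothetical callbacks and are not functions from the attached test page; this illustrates the script-side handling mentioned above and is not a confirmed fix for the crash.

```javascript
// Sketch only: `draw` is a hypothetical per-frame render callback, and
// `initResources` a hypothetical function that (re)creates shaders and buffers.
function installContextLossHandling(canvas, gl, draw, initResources) {
  let frameId = null;

  function frame() {
    // isContextLost() is a standard WebGL method; skip drawing while lost.
    if (gl.isContextLost()) return;
    draw(gl);
    frameId = requestAnimationFrame(frame);
  }

  canvas.addEventListener('webglcontextlost', (event) => {
    // preventDefault() signals that the page wants the context restored.
    event.preventDefault();
    if (frameId !== null) cancelAnimationFrame(frameId);
    frameId = null;
  });

  canvas.addEventListener('webglcontextrestored', () => {
    // GPU resources do not survive a context loss, so recreate them first.
    initResources(gl);
    frameId = requestAnimationFrame(frame);
  });

  initResources(gl);
  frameId = requestAnimationFrame(frame);
}
```

With handlers like these the animation loop stops while the context is lost and resumes only after the browser restores it, instead of continuing to issue draw calls against a lost context.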
,
Nov 16 2016
Thanks Xida for confirming that the test fails with Chrome 49. If the bug is that old it's not related to recent DrawingBuffer changes. I suspect that we're leaking textures presented to the compositor because this test case resizes the canvas on a fixed time interval rather than using requestAnimationFrame. We should really figure out why the textures aren't being reclaimed promptly.
,
Nov 17 2017
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. If you change it back, also remove the "Hotlist-Recharge-Cold" label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Nov 17 2017
,
Jul 25
Comment 1 by kainino@chromium.org, Oct 25 2016
Status: Untriaged (was: Unconfirmed)