Starred by 9 users
Status: WontFix
Owner:
Closed: Jun 2013
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug



Long WebGL shader compile/link time
Reported by pjcozzi...@gmail.com, Feb 7 2012
A complex WebGL fragment shader takes ~5,300 ms to link in WebGL (on Linux and Windows without ANGLE). With ANGLE, the browser tab hangs. In desktop OpenGL, the whole program compiles and links almost instantaneously.

Ken Russell, kbr@chromium.org, has a simple test case to reproduce.  The console window shows the time it takes to call linkProgram in ms.
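
For anyone trying to reproduce the measurement, a minimal sketch of timing linkProgram might look like the following. The helper name timeLinkProgram and the injectable clock are assumptions for illustration and are not taken from the test case. The LINK_STATUS query is included because an implementation may defer the actual link work until the status is read, so timing the linkProgram call alone could under-report.

```javascript
// Hypothetical helper (not from the original test case).
// `gl` is expected to be a WebGLRenderingContext and `program` a WebGLProgram
// with both shaders already compiled and attached.
// `now` is an injectable clock returning milliseconds.
function timeLinkProgram(gl, program, now = () => performance.now()) {
  const start = now();
  gl.linkProgram(program);
  // Reading LINK_STATUS waits for the link to finish, so the elapsed time
  // covers the whole link rather than just issuing the command.
  const linked = gl.getProgramParameter(program, gl.LINK_STATUS);
  const elapsedMs = now() - start;
  // Only fetch the info log on failure; it is usually empty on success.
  const log = linked ? "" : gl.getProgramInfoLog(program);
  return { linked, elapsedMs, log };
}
```

A caller could then log the result with something like console.log(timeLinkProgram(gl, program).elapsedMs).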

By commenting out parts of the shader, it looks like functions with 
loops in them create quite a compile/link performance hit.  Some 
functions have nested for loops with four iterations.  I found similar 
performance both with and without shader validation in Chrome, 
assuming  "--disable-glsl-translator --use-gl=desktop" is still valid 
in 18.0.1.

Also see the WebGL Dev List discussion:  http://groups.google.com/group/webgl-dev-list/browse_thread/thread/7ea743969189af53

Perhaps related to ANGLE bug: http://code.google.com/p/angleproject/issues/detail?id=259
 
Comment 1 by kbr@chromium.org, Feb 7 2012
Cc: kbr@chromium.org zmo@chromium.org
Labels: -Area-Undefined Area-Internals Internals-Graphics Feature-GPU-Internals
Here's Patrick's test case. I haven't run it yet.

Comment 2 by kbr@chromium.org, Feb 7 2012
Cc: vangelis@chromium.org
The test case is too large. Patrick, can you please invest the time to reduce it to something smaller?

A cursory glance through the shader indicates to me that this might be another instance of http://code.google.com/p/angleproject/issues/detail?id=146 , which is a known problem in the GLSL -> HLSL shader translation phase needing work.

Comment 3 by kbr@chromium.org, Feb 7 2012
Status: Untriaged
I've confirmed that the problem occurs in Chrome on Linux (TOT Chromium, 19.0.1031.0 r120523) and seems to disappear when --disable-glsl-translator is used to sidestep ANGLE's shader translator. The problem should be easier to isolate given that. Here's a version of the test case which works on Linux with and without that flag.

Comment 4 by kbr@chromium.org, Feb 7 2012
Labels: -OS-Windows OS-All
Comment 5 by kbr@chromium.org, Feb 7 2012
I didn't realize that the test case provided by the submitter wasn't intended to be attached to the bug report. Anyone investigating this, please contact me for the test case.

Comment 6 by zmo@chromium.org, Feb 7 2012
Owner: zmo@chromium.org
Status: Assigned
I'll have a look.  Ken, can you email me the test case?
Comment 7 by kbr@chromium.org, Feb 7 2012
Emailed separately.

I'm seeing very strange behavior where the same version of the web browser sometimes demonstrates the long link times and sometimes doesn't. Instrumenting the code with measurements made the problem disappear. Perhaps this is in fact a memory stomp in the shader translator which mysteriously causes steps in the shader translator to run differently and longer? Seems very unlikely. Another possibility might be a driver level issue, which also seems unlikely.

Comment 8 by kbr@chromium.org, Feb 7 2012
We should run the test case with ASAN to catch memory stomps definitively.

I also submitted this issue to Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=725467

I tested this again on an AMD Radeon HD 5870.  The tab hangs in both Chrome Beta (18) and Canary (19).  The Chrome Task Manager shows the GPU Process pegged at 50% CPU (dual-core).
Project Member Comment 11 by bugdroid1@chromium.org, Mar 10 2013
Labels: -Area-Internals -Internals-Graphics -Feature-GPU-Internals Cr-Internals-GPU-Internals Cr-Internals-Graphics Cr-Internals
Comment 12 by psula...@gmail.com, Mar 22 2013
Has anyone come to a conclusion on this issue?  I noticed that a simple 5x5 convolution kernel takes a huge amount of time to compile/link compared to a version I unroll by hand.  As I keep increasing the size of the kernel, the problem grows exponentially, which makes it impossible to use large kernels (like 20x20) with WebGL.  Compile/link takes more than several seconds when ANGLE is used and 200 ms if native OpenGL is used.  It seems that ANGLE should not unroll loops when the iteration count is large (e.g. 4000), which is standard practice for compilers.
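
As an illustration of the hand-unrolled variant mentioned here, the following is a minimal sketch that generates a loop-free fragment shader for an odd-sized convolution kernel. It is not the commenter's shader: the function name and the GLSL identifiers u_image, u_texel and v_uv are hypothetical, and the weights are baked into the source as literals purely to keep the example short. Whether this actually links faster depends on the shader translator and driver in use.

```javascript
// Hypothetical generator for an unrolled size x size convolution shader.
// `weights` is a row-major array of size * size numbers.
function unrolledConvolutionShader(size, weights) {
  if (!Number.isInteger(size) || size < 1 || size % 2 === 0) {
    throw new RangeError("size must be a positive odd integer");
  }
  if (!Array.isArray(weights) || weights.length !== size * size) {
    throw new RangeError("weights must contain size * size entries");
  }
  const half = (size - 1) / 2;
  const taps = [];
  for (let y = 0; y < size; y++) {
    for (let x = 0; x < size; x++) {
      // toFixed keeps a decimal point so GLSL treats the values as floats.
      const w = weights[y * size + x].toFixed(6);
      const dx = (x - half).toFixed(1);
      const dy = (y - half).toFixed(1);
      taps.push(
        `  sum += ${w} * texture2D(u_image, v_uv + u_texel * vec2(${dx}, ${dy}));`
      );
    }
  }
  return [
    "precision mediump float;",
    "uniform sampler2D u_image;", // assumed input texture
    "uniform vec2 u_texel;",      // assumed 1.0 / texture size
    "varying vec2 v_uv;",         // assumed texture coordinate
    "void main() {",
    "  vec4 sum = vec4(0.0);",
    ...taps,
    "  gl_FragColor = sum;",
    "}",
  ].join("\n");
}
```

For example, unrolledConvolutionShader(5, new Array(25).fill(1 / 25)) would produce a 5x5 box blur with 25 explicit texture taps and no loop for the translator to unroll.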
Comment 13 by kbr@chromium.org, Mar 22 2013
psulatyc: please provide a complete, self-contained test case, as well as about:gpu output. Several changes have been made to the shader translator, and we are still looking for test cases which could be used to file bugs against Microsoft's HLSL shader compiler, believed at the present time to be the cause of the problem.

Comment 14 by psula...@gmail.com, Mar 22 2013
Thanks for the immediate response!  I'll see if I can do that this weekend.  I have a real simple shader that highlights the problem.  I just have to create some simple infrastructure around it so that you can have a self-contained case.
The attached WebGL program also shows a timeout/"something went wrong" around linkProgram.  I suspect this is the same issue.

Chrome Version 27.0.1453.116 m, 64-bit Win7

The fragment shader traverses a data structure with a 400-iteration outer loop and 8-iteration inner loop.  Works fine on the same GPU in Windows FF.  Also works fine in Linux FF, Linux Chrome, and desktop Safari on MacOSX.  Page loads and displays in just a few seconds if I start chrome with "--use-gl=desktop".

Error logged is: "[0624/140748:ERROR:gpu_watchdog_thread.cc(209)] The GPU process hung. Terminating after 10000 ms."

Sometimes I get failure at getProgramParameter, and sometimes I get the "Aw Snap!" tab.

To see the link complete without timing out, try setting "max_bvh_iterations" in the shader to something small, like 50 (or even 10).  If your testing machine is quite fast, the current setting of "400" for max iterations may not time out, so you could try pumping it up to 800 or higher.
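
To try several values without editing the shader by hand, a small helper along these lines could rewrite the limit in the shader source before it is compiled. This is only a sketch: the helper name is hypothetical, and it assumes the limit is declared either as "#define max_bvh_iterations 400" or as "const int max_bvh_iterations = 400;", which may not match how the attached shader actually declares it.

```javascript
// Hypothetical helper: returns a copy of `source` with the first integer
// literal that directly follows `name` (optionally after "=") replaced.
// `name` is assumed to be a plain identifier, so it is not regex-escaped.
function withIterationLimit(source, limit, name = "max_bvh_iterations") {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const pattern = new RegExp(`(\\b${name}\\b\\s*=?\\s*)\\d+`);
  if (!pattern.test(source)) {
    throw new Error(`no integer declaration of ${name} found`);
  }
  // A replacer function avoids ambiguity between "$1" and the digits of limit.
  return source.replace(pattern, (match, prefix) => prefix + limit);
}
```

Combined with a timing measurement around linkProgram, stepping the limit through values such as 10, 50, 400 and 800 should show roughly where the link starts to exceed the watchdog timeout on a given machine.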

I've attached the result of about:gpu and a .zip file containing the .html and an associated .js support file.

As a bonus, if the page loads correctly you should see a nice caffeine molecule.
Attachments: link_timeout.zip (8.2 KB), gpu.htm (24.5 KB)
Comment 16 by kbr@chromium.org, Jun 26 2013
Cc: bajones@chromium.org
bajones / zmo: could one of you please try the test case in Comment 15?

The renderer process definitely shouldn't crash if the context is lost.

Re: comment 15 - Chrome "Version 29.0.1547.2 canary" also shows the problem.
Comment 18 by zmo@chromium.org, Jun 26 2013
Yeah, I can reproduce the watchdog timeout gpu crash and the renderer crash.

Thanks for the test case.  I'll do some digging.
Comment 19 by zmo@chromium.org, Jun 27 2013
"exception on device is detected", the browser crashed ... :(
Comment 20 by zmo@chromium.org, Jun 27 2013
Cc: apatrick@chromium.org
So the browser crash is an Aura thing; no need to worry about it for now.

I tried to catch the renderer crash on a debug build, but was unsuccessful.

On a debug build, the watchdog thread is not on, so it won't kill the GPU process while linking takes a long time.

However, with max_bvh_iterations = 300 or more, it links fine but rendering is pure red...

Do we do something different between debug and release builds that is related to shader compile/link?
Comment 21 by zmo@chromium.org, Jun 27 2013
Cc: geoffl...@chromium.org
Status: WontFix
Tested with ANGLE on DX11, and the slow compile is gone.

As advised by geofflang, this is a DX9 issue, and since we are moving to DX11 in Chrome 30 soon, I will mark this as WontFix.
Just to clarify, we will continue to use DX9 on Windows XP and Windows Vista even after we use DX11 on other Windows versions.
Comment 23 Deleted
Components: -Internals>Graphics Internals>GPU
Moving old issues out of Internals>Graphics to delete this obsolete component (see crbug.com/685425 for details)