Windows 7 x64 Debug GPU FYI bot failing WebglConformance.conformance_uniforms_gl_uniform_arrays
Issue description

First failing build: https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20x64%20Debug%20%28NVIDIA%29/builds/843

The ANGLE revision stayed the same, so it's unlikely to be ANGLE. Regression range: https://chromium.googlesource.com/chromium/src/+log/624c53d136d76327615167859e5e015515286d99..d4dcbd342dd54f55383daf8bc44b2c9d97fe0d0b

Possibly related to the VS 2015 change? There seems to be a timeout in this test, possibly an ASSERT dialog. I have not tried to repro yet.
Mar 10 2016
I will test. If you want to test locally you can run:

set GYP_MSVS_VERSION=2015
gclient runhooks
ninja ...
Mar 10 2016
Any advice on how to build/run this test locally? Also, I see a lot of GL_ERROR output - I assume that is normal?
Mar 10 2016
Found it:

python ../content/test/gpu/run_gpu_test.py webgl_conformance --show-stdout --browser=debug_x64 -v "--extra-browser-args=--enable-logging=stderr --js-flags=--expose-gc" --output-dir c:\users\brucedawson\documents\tmpdkcls2telemetry --output-format=json
Mar 10 2016
Running webgl_conformance seems to take tens of minutes. Is there a way to run just WebglConformance.conformance_uniforms_gl_uniform_arrays? Without that, it's pretty unwieldy to iterate on the problem.
Mar 10 2016
The reason I couldn't find gl_uniform_arrays in a code search is that it is listed as gl-uniform-arrays. Once I found it, I modified three files in third_party\webgl\src. In the first two I deleted all listed files except for the reference to the next file; in the last one I deleted all listed tests except for gl-uniform-arrays.html:

sdk/tests/00_test_list.txt
sdk/tests/conformance/00_test_list.txt
sdk/tests/conformance/uniforms/00_test_list.txt

Now I can repro the failure in a timely manner. It would be helpful to have (and document) a less clunky way of running a single sub-test.
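For illustration, the three trimmed files might end up looking something like this (a sketch only - it assumes the test-list parser accepts // comments, and the exact surviving lines, such as any --min-version annotations, depend on the file contents at that revision):

// sdk/tests/00_test_list.txt - keep only the reference to the next list
conformance/00_test_list.txt

// sdk/tests/conformance/00_test_list.txt - keep only the uniforms list
uniforms/00_test_list.txt

// sdk/tests/conformance/uniforms/00_test_list.txt - keep only the one test
gl-uniform-arrays.html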
Mar 10 2016
The test seems to be running normally (alternately printing the vertex shader and pixel shader on the web page) and then times out before it completes - no assert dialog.

1) Does anybody know where the timeout is set so I can try increasing it? It appears to be a network timeout.
2) Does anybody know how long the VS 2013 version takes to run this test? I'm currently building a VS 2013 version to compare.
Mar 10 2016
Apologies for the difficulty running individual tests. The way to do this is to pass the --story-filter argument to the harness:

run_gpu_test.py webgl_conformance --story-filter=gl_uniform_arrays

The timeouts are probably deep inside Telemetry; if you can provide a stack trace we can figure out how to hardwire them to some larger value. The VS 2013 version takes about 20 seconds to run the test.
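For reference, combining this with the repro command found earlier should give a single-test invocation along these lines (untested; the extra flags are just carried over unchanged):

python ../content/test/gpu/run_gpu_test.py webgl_conformance --show-stdout --browser=debug_x64 --story-filter=gl_uniform_arrays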
Mar 10 2016
I'm not sure whether the VS 2015 version is running much slower, or if it is iterating more times than it should. Making progress...
The timeout callstack is shown below. Temporarily hardwiring the timeout to a larger value for test purposes could be handy.
File "third_party\catapult\telemetry\telemetry\internal\story_runner.py", line 84, in _RunStoryAndProcessErrorIfNeeded
state.RunStory(results)
File "content\test\gpu\gpu_tests\gpu_test_base.py", line 122, in RunStory
RunStoryWithRetries(DesktopGpuSharedPageState, self, results)
File "content\test\gpu\gpu_tests\gpu_test_base.py", line 72, in RunStoryWithRetries
super(cls, shared_page_state).RunStory(results)
File "third_party\catapult\telemetry\telemetry\page\shared_page_state.py", line 323, in RunStory
self._current_page.Run(self)
File "third_party\catapult\telemetry\telemetry\page\__init__.py", line 84, in Run
shared_state.page_test.RunNavigateSteps(self, current_tab)
File "third_party\catapult\telemetry\telemetry\page\page_test.py", line 182, in RunNavigateSteps
page.RunNavigateSteps(action_runner)
File "content\test\gpu\gpu_tests\webgl_conformance.py", line 143, in RunNavigateSteps
'webglTestHarness._finished', timeout_in_seconds=180)
File "third_party\catapult\telemetry\telemetry\page\action_runner.py", line 186, in WaitForJavaScriptCondition
self._tab.WaitForJavaScriptExpression(condition, timeout_in_seconds)
File "third_party\catapult\telemetry\telemetry\internal\browser\web_contents.py", line 103, in WaitForJavaScriptExpression
self._inspector_backend.GetCurrentConsoleOutputBuffer())
File "third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_backend.py", line 35, in inner
inspector_backend._ConvertExceptionFromInspectorWebsocket(e)
File "third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_backend.py", line 32, in inner
return func(inspector_backend, *args, **kwargs)
File "third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_backend.py", line 177, in GetCurrentConsoleOutputBuffer
return self._console.GetCurrentConsoleOutputBuffer(timeout)
File "third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_console.py", line 37, in GetCurrentConsoleOutputBuffer
self._EnableConsoleOutputStream()
File "third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_console.py", line 50, in _EnableConsoleOutputStream
self._inspector_websocket.SyncRequest({'method': 'Console.enable'})
File "third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_websocket.py", line 110, in SyncRequest
res = self._Receive(timeout)
File "third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_websocket.py", line 149, in _Receive
data = self._socket.recv()
File "third_party\catapult\telemetry\third_party\websocket-client\websocket.py", line 596, in recv
opcode, data = self.recv_data()
File "third_party\catapult\telemetry\third_party\websocket-client\websocket.py", line 606, in recv_data
frame = self.recv_frame()
File "third_party\catapult\telemetry\third_party\websocket-client\websocket.py", line 637, in recv_frame
self._frame_header = self._recv_strict(2)
File "third_party\catapult\telemetry\third_party\websocket-client\websocket.py", line 746, in _recv_strict
bytes = self._recv(shortage)
File "third_party\catapult\telemetry\third_party\websocket-client\websocket.py", line 732, in _recv
raise WebSocketTimeoutException(e.message)
TimeoutException:
Mar 10 2016
Hmm, this seems really messed up. Somehow we got a first timeout while waiting for the JS condition to become true, and then a WebSocketTimeoutException when Telemetry tried to contact DevTools to fetch the console log.
Mar 10 2016
Look up the stack: you can probably hardwire it at this stack frame or above:
File "third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\inspector_websocket.py", line 110, in SyncRequest
res = self._Receive(timeout)
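A throwaway local hack might look roughly like this (a sketch only - it assumes SyncRequest receives a timeout argument, as the stack trace suggests, and the real signature in that Telemetry revision may differ):

  # inspector_websocket.py - temporary debugging override, not for landing
  def SyncRequest(self, request, timeout=60):
    # Clamp the timeout to at least ten minutes so a slow debug build can
    # finish before the websocket read gives up.
    timeout = max(timeout, 600)
    ...  # rest of the method unchanged; _Receive(timeout) now waits longer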
Mar 10 2016
Ken: the first timeout when waiting for the JS condition is at content\test\gpu\gpu_tests\webgl_conformance.py, line 143, in RunNavigateSteps:

'webglTestHarness._finished', timeout_in_seconds=180)
Mar 10 2016
Thanks. Bruce, please try increasing that timeout.
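Based on the stack trace, that should be a one-line change in webgl_conformance.py's RunNavigateSteps, roughly as follows (the surrounding call is inferred from the trace above, and 300 is just an example value):

  action_runner.WaitForJavaScriptCondition(
      'webglTestHarness._finished', timeout_in_seconds=300)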
Mar 11 2016
The VS 2015 debug version is noticeably slower than the VS 2013 debug version. On my Z840 machine, the VS 2013 64-bit debug version completes the 18 (I think) iterations of this test in about 200 s. When I raised the timeout (thanks!) the VS 2015 64-bit debug test passed in 268.7 s. A 34% slowdown is worth investigating, but since it only affects debug builds it is not a blocker; I opened crbug.com/593983 to track it.

It's curious that both runtimes exceed the 180 s timeout, which makes it hard to know what to set the timeout to. The VS 2015 test passed with the timeout set to 240, but 300 would give more breathing room. crrev.com/1785713003 raises the timeout to 300. Let me know what you think.
Mar 11 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/87908413585f8cc838fe2eae1581dbcc767ca49f

commit 87908413585f8cc838fe2eae1581dbcc767ca49f
Author: brucedawson <brucedawson@chromium.org>
Date: Fri Mar 11 19:48:05 2016

Increase GPU test timeout to avoid fragility

VS 2015 64-bit debug builds were hitting test timeouts on WebglConformance.conformance_uniforms_gl_uniform_arrays. It turns out that VS 2015 64-bit debug builds are, in this scenario at least, about 34% slower than VS 2013's debug builds. This means that VS 2013 is also close to hitting this timeout, at least on slower machines. This change increases the timeout from 180 to 300.

Bug 593983 covers investigating why VS 2015's builds are slower, but since this issue only occurs in debug builds it is not an immediate blocker.

TBR=kbr@chromium.org
BUG=593710,440500,593983

Review URL: https://codereview.chromium.org/1785713003
Cr-Commit-Position: refs/heads/master@{#380708}

[modify] https://crrev.com/87908413585f8cc838fe2eae1581dbcc767ca49f/content/test/gpu/gpu_tests/webgl_conformance.py
Mar 11 2016
Fixed. Investigating *why* VS 2015 debug builds are slower is tracked in crbug.com/593983.
Mar 12 2016
I lowered the test iteration count to two and grabbed ETW traces (tinyurl.com/etwcentral) of the VS 2013 and VS 2015 builds (both 64-bit debug) running the tests. When grouped by Randomascii Exclusive (module and function) I noticed that gles2_utils.dll used almost twice as much CPU time in the VS 2015 case - see the attached screenshot of the two profiles, with VS 2013 at the top. The four top VS 2013 functions map neatly to four identical functions in the VS 2015 profile. However, the VS 2015 profile also shows nine other functions using equal or larger amounts of CPU time - functions that simply do not show up in the VS 2013 profile. Either these functions are new or they were previously inlined; the latter seems most likely. Since most of them are in the std::string class, it's not clear whether there is anything we can do to improve them.
Mar 12 2016
Thanks for figuring this out Bruce. Sorry I couldn't be of any help!
Mar 12 2016
One thought: the --story-filter argument, as a way to filter down to specific sub-tests, is not very discoverable. Early on I ran:

python ../content/test/gpu/run_gpu_test.py

and it gave me a list of the top-level tests. That seems like an ideal place to mention --story-filter. Thoughts? Should I implement this, file a bug, or something else?
Mar 14 2016
Bruce: sorry for the difficulty - but the --story-filter option is listed under:

run_gpu_test.py help run

This harness is likely going to change soon, so I don't think it's worth implementing a more specialized help function.
Comment 1 by jmad...@chromium.org, Mar 10 2016
Owner: brucedaw...@chromium.org
Status: Assigned (was: Untriaged)