oes-vertex-array-object.html flaky on ANGLE's GL backend
Issue description

WebGL conformance test conformance/extensions/oes-vertex-array-object.html became flaky just recently. Here are a few example failures:

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_optional_gpu_tests_rel/13944
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_optional_gpu_tests_rel/13947

On the flakiness dashboard it looks like this may have started after https://chromium-review.googlesource.com/c/chromium/src/+/1389615 landed.
,
Jan 9
All recent failures:

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_optional_gpu_tests_rel/13950
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_optional_gpu_tests_rel/13947
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_optional_gpu_tests_rel/13944
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_optional_gpu_tests_rel/13943

Failure mode:

[87/469] gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_extensions_oes_vertex_array_object failed unexpectedly 6.0965s:

Traceback (most recent call last):
  _RunGpuTest at content/test/gpu/gpu_tests/gpu_integration_test.py:155
    self.RunActualGpuTest(url, *args)
  RunActualGpuTest at content/test/gpu/gpu_tests/webgl_conformance_integration_test.py:199
    getattr(self, test_name)(test_path, *args[1:])
  _RunConformanceTest at content/test/gpu/gpu_tests/webgl_conformance_integration_test.py:288
    self._CheckTestCompletion()
  _CheckTestCompletion at content/test/gpu/gpu_tests/webgl_conformance_integration_test.py:284
    self.fail(self._WebGLTestMessages(self.tab))
  fail at .swarming_module/lib/python2.7/unittest/case.py:410
    raise self.failureException(msg)
AssertionError: Drawing with VAO that has the color array disabled should pass at (25, 25) expected: 128,128,128,255 was 0,0,0,255
FAIL Drawing with VAO that has the color array disabled should pass at (25, 25) expected: 128,128,128,255 was 0,0,0,255
Drawing with VAO that has the color array disabled should pass at (20, 20) expected: 128,128,128,255 was 0,0,0,255
FAIL Drawing with VAO that has the color array disabled should pass at (20, 20) expected: 128,128,128,255 was 0,0,0,255
Drawing with VAO that has the color array disabled should pass at (15, 15) expected: 128,128,128,255 was 0,0,0,255
FAIL Drawing with VAO that has the color array disabled should pass at (15, 15) expected: 128,128,128,255 was 0,0,0,255
Drawing with VAO that has the color array disabled should pass at (10, 10) expected: 128,128,128,255 was 0,0,0,255
FAIL Drawing with VAO that has the color array disabled should pass at (10, 10) expected: 128,128,128,255 was 0,0,0,255

The failures are happening in webgl_conformance_gl_passthrough_tests on both NVIDIA and Intel GPUs.

geofflang@ since these failures are intermittent I have a feeling that there's a bug in ANGLE's state management on the OpenGL backend. I actually saw a flake of this test in the tryjobs for my CL but thought it might be unrelated. How could we get to the bottom of this? I wasn't immediately able to reproduce locally, even just trying to rerun the failing shard.
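For comparing failures across the builds linked above, a small parser for the per-pixel messages may help. This is only a sketch: the regular expression is inferred from the log text quoted in this comment, and `parse_pixel_failures` is a hypothetical helper, not part of the GPU test harness.

```python
import re

# Matches the per-pixel messages quoted above, e.g.
# "FAIL <name> at (25, 25) expected: 128,128,128,255 was 0,0,0,255".
# The format is inferred from this log only (assumption).
PIXEL_FAILURE = re.compile(
    r"FAIL (?P<name>.+?) at \((?P<x>\d+), (?P<y>\d+)\) "
    r"expected: (?P<expected>\d+,\d+,\d+,\d+) was (?P<actual>\d+,\d+,\d+,\d+)"
)


def parse_pixel_failures(log_text):
    """Return (name, (x, y), expected_rgba, actual_rgba) for each FAIL entry."""
    failures = []
    for match in PIXEL_FAILURE.finditer(log_text):
        expected = tuple(int(v) for v in match.group("expected").split(","))
        actual = tuple(int(v) for v in match.group("actual").split(","))
        point = (int(match.group("x")), int(match.group("y")))
        failures.append((match.group("name"), point, expected, actual))
    return failures


# Two of the messages from the log above, joined the way the harness prints them.
sample = (
    "FAIL Drawing with VAO that has the color array disabled should pass "
    "at (25, 25) expected: 128,128,128,255 was 0,0,0,255 "
    "FAIL Drawing with VAO that has the color array disabled should pass "
    "at (20, 20) expected: 128,128,128,255 was 0,0,0,255"
)
failures = parse_pixel_failures(sample)
```

In the log above, every sampled point reads back opaque black instead of the expected gray, which is easier to confirm across many runs once the messages are structured like this.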
,
Jan 9
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c5b459baa54d6f183d1d4c5b50fcb388035cd15f

commit c5b459baa54d6f183d1d4c5b50fcb388035cd15f
Author: Kenneth Russell <kbr@chromium.org>
Date: Wed Jan 09 01:20:08 2019

Suppress oes-vertex-array-object on Linux/passthrough.

conformance/extensions/oes-vertex-array-object.html is occasionally
rendering incorrect results for one sub-test on this configuration.

Bug: 920033
No-Try: True
Change-Id: I6db494ed4aff0f259e76c76c31048ad5e4097c7d
Reviewed-on: https://chromium-review.googlesource.com/c/1400818
Reviewed-by: James Darpinian <jdarpinian@chromium.org>
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#620978}

[modify] https://crrev.com/c5b459baa54d6f183d1d4c5b50fcb388035cd15f/content/test/gpu/gpu_tests/webgl_conformance_expectations.py
,
Jan 9
Suppression landed. Downgrading to P2 bug.
,
Jan 9
It will be difficult without a reliable repro. If it's a specific sub-section of the test that fails, we may be able to analyze the code and try to make a repro or a speculative fix. The fact that it fails only on Linux may also be a clue.
,
Jan 9
,
Jan 9
The test fails intermittently with:

--use-gl=angle --use-angle=gl

with or without the passthrough command decoder, on both Windows and Linux. (NVIDIA GPUs in both, but this has been seen on the Intel bots too, so I don't think it's an NVIDIA driver bug.)

To reproduce, just load sdk/tests/conformance/extensions/oes-vertex-array-object.html and reload the browser over and over again. At some point, a few of the tests will fail. Have seen ~3 different places in the test where the failures happened.

Doesn't happen with the native GL driver, nor with ANGLE's D3D backend.

Have looked through ContextGL, StateManagerGL and VertexArrayGL looking for places where dirty bits might not be being propagated correctly to the driver. In StateManagerGL::syncState, in the gl::State::DIRTY_BIT_PROGRAM_EXECUTABLE case, it looked strange that propagateProgramToVAO was only being called if program was either null, or didn't have a compute shader attached - but now I realize that's the common case. Couldn't see any other obvious missing state syncing in VertexArrayGL::syncClientSideData or syncDrawState.

The most common failure mode is:

FAIL Drawing with VAO that has the color array disabled should pass at (25, 25) expected: 128,128,128,255 was 0,0,0,255
FAIL Drawing with VAO that has the color array disabled should pass at (20, 20) expected: 128,128,128,255 was 0,0,0,255
FAIL Drawing with VAO that has the color array disabled should pass at (15, 15) expected: 128,128,128,255 was 0,0,0,255
FAIL Drawing with VAO that has the color array disabled should pass at (10, 10) expected: 128,128,128,255 was 0,0,0,255
PASS Drawing with VAO that has the color array disabled should pass
PASS Drawing with VAO that has the color array disabled should pass

The attached version of the conformance test is cut down a bit. The failure only ever seems to happen if at least runUnboundDeleteTests() is included, and more often the more tests are included.

I'm wondering at this point if the bug could be related to the deletion of VAOs; could that leave ANGLE's context in a broken state? The check "if (mVertexArray && mVertexArray->id() == vertexArray)" in State::removeVertexArrayBinding looked a little suspicious - what if the previously-deleted VAO's name was returned from a subsequent allocation? I tried taking out handle reuse in HandleAllocator::allocate() - no change.

Geoff, Jamie, could I ask for some help tracking this down?
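To spell out the name-reuse concern, here is a toy model. It is not ANGLE's code: `ReusingAllocator` and the `cached_color_enabled` dictionary are hypothetical stand-ins for a handle allocator that recycles names and for any per-VAO state keyed only by name. Since taking out handle reuse made no difference, this illustrates the worry, not a confirmed cause.

```python
class ReusingAllocator:
    """Hands out integer names and recycles released ones (hypothetical model)."""

    def __init__(self):
        self._next = 1
        self._free = []

    def allocate(self):
        # Prefer a previously released name, the behavior under suspicion.
        if self._free:
            return self._free.pop()
        name = self._next
        self._next += 1
        return name

    def release(self, name):
        self._free.append(name)


allocator = ReusingAllocator()

# Hypothetical cache of "color array enabled" state, keyed only by VAO name.
cached_color_enabled = {}

old_vao = allocator.allocate()
cached_color_enabled[old_vao] = True

# Delete the VAO but (the simulated bug) forget to drop its cache entry.
allocator.release(old_vao)

# A brand-new VAO receives the recycled name and inherits the stale entry,
# although a fresh VAO should start with the color array disabled.
new_vao = allocator.allocate()
stale_state = cached_color_enabled.get(new_vao, False)
```

If something like this were the cause, disabling reuse in the allocator would be expected to make the flake disappear, which is not what was observed.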
,
Jan 10
Yuly, interested in looking at this? Looking at the code that Ken mentioned would be good. We should also try marking the vertex array as dirty every draw call to see if it's a synchronization edge case.
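The "mark dirty every draw" experiment can be pictured with a toy model. This is a sketch only: `ToyVertexArray` is hypothetical and far simpler than VertexArrayGL, and the missed dirty bit is simulated with the `mark_dirty=False` argument.

```python
class ToyVertexArray:
    """Front-end state plus the state last sent to a pretend driver."""

    def __init__(self):
        self.color_array_enabled = False   # front-end (tracked) state
        self.driver_color_enabled = False  # what the pretend driver last saw
        self.dirty = False

    def set_color_array(self, enabled, mark_dirty=True):
        self.color_array_enabled = enabled
        if mark_dirty:
            self.dirty = True

    def draw(self, force_dirty=False):
        # force_dirty models "mark the vertex array as dirty every draw call".
        if force_dirty:
            self.dirty = True
        if self.dirty:
            # Sync the front-end state to the driver only when flagged dirty.
            self.driver_color_enabled = self.color_array_enabled
            self.dirty = False
        return self.driver_color_enabled


vao = ToyVertexArray()
vao.set_color_array(True)
vao.draw()

# Simulated bug: the state changes but the dirty bit is never set.
vao.set_color_array(False, mark_dirty=False)
stale_result = vao.draw()                    # driver still has the array enabled
forced_result = vao.draw(force_dirty=True)   # forcing a sync picks up the change
```

If forcing a sync on every draw makes the flake go away in the real code, that points at a missed dirty bit; if the flake persists, the stale state is more likely coming from somewhere else.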
,
Today
(7 hours ago)
Could we please make progress on this bug? It sounds like a general bug in ANGLE's OpenGL backend and we can't advocate to partners to pick up that backend until it's 100% reliable from our standpoint. Thanks.
,
Today
(6 hours ago)
I reduced this a bit further (attached). Note that you need to actually restart the browser between trials to repro the flake. Could be a dirty bits thing. I'll try a bisect.
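Because each trial needs a fresh browser process, a small launcher can take over the restarting. This is a sketch under assumptions: the Chrome path, test URL and profile directory are placeholders, `build_command` and `run_trials` are hypothetical helpers, and the script does not detect pass or fail on its own (the page still has to be inspected). Only `--use-gl=angle` and `--use-angle=gl` come from this thread; `--user-data-dir` is added so that each launch is its own process and is not handed off to an already running browser.

```python
import subprocess


def build_command(chrome_path, test_url, profile_dir):
    """Build the Chrome command line for one trial on ANGLE's GL backend."""
    return [
        chrome_path,
        "--use-gl=angle",
        "--use-angle=gl",
        f"--user-data-dir={profile_dir}",
        test_url,
    ]


def run_trials(chrome_path, test_url, profile_dir, trials, seconds_per_trial=15):
    """Launch the browser, let the test run, then shut it down; repeat."""
    for _ in range(trials):
        proc = subprocess.Popen(build_command(chrome_path, test_url, profile_dir))
        try:
            proc.wait(timeout=seconds_per_trial)
        except subprocess.TimeoutExpired:
            # Still running after the time budget: end this trial.
            proc.terminate()
            proc.wait()


# Placeholder arguments, for illustration only; nothing is launched here.
example_command = build_command(
    "chrome",
    "file:///path/to/oes-vertex-array-object.html",
    "/tmp/angle-repro-profile",
)
```

Calling `run_trials` with a real binary path and the reduced test page would then repeat the restart-and-load cycle as many times as requested.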
,
Today
(5 hours ago)
I went back to Wed Nov 21 2018 https://crrev.com/c/1352591 and the test was still flaking. I then had trouble building the code because of a breaking step in copying the compiler DLL. We might have to just diagnose the test rather than finding the culprit. But bisecting Chrome instead of ANGLE or using other tricks might work as well.
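One of the "other tricks" for bisecting a flake is to call a revision good only after many clean runs. The sketch below assumes a known-good and a known-bad endpoint, which this thread has not yet established (the November revision still flaked). `bisect_first_bad` and `fake_run` are hypothetical, and the revisions are plain integers for illustration.

```python
def is_bad(revision, run_once, trials):
    """Treat a revision as bad if any one of `trials` runs fails."""
    return any(not run_once(revision) for _ in range(trials))


def bisect_first_bad(revisions, run_once, trials=20):
    """Return the first bad revision.

    Assumes `revisions` is ordered oldest to newest, that revisions[0] is
    good and that revisions[-1] is bad.
    """
    lo, hi = 0, len(revisions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid], run_once, trials):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Deterministic stand-in for a flaky test: revisions >= 5 fail on every
# third run overall, earlier revisions always pass.
run_counter = {"n": 0}


def fake_run(revision):
    run_counter["n"] += 1
    return not (revision >= 5 and run_counter["n"] % 3 == 0)


first_bad = bisect_first_bad(list(range(10)), fake_run, trials=20)
```

A "good" verdict here is only probabilistic: if a bad revision fails a single run with probability p, it survives all the trials with probability (1 - p)^trials, so the trial count has to be chosen with the observed flake rate in mind.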
Comment 1 by kbr@chromium.org, Jan 9
Labels: -Type-Bug Type-Bug-Regression