Issue 722880

Starred by 2 users

Issue metadata

Status: Fixed
Owner: jiawei.s...@intel.com
Closed: Jul 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug

Blocked on:
issue 721783

Blocking:
issue 82385




webgl_conformance_tests on NVIDIA GPU on Windows (with patch) on Windows-2008ServerR2-SP1 fail on 32-bit clang builds

Project Member Reported by thakis@chromium.org, May 16 2017

Issue description

https://codereview.chromium.org/2870543003/ reliably fails win_chromium_rel_ng, with webgl_conformance_tests failing every time. The output is a bit hard to read, but I think this is one of the errors:

[4/859] gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_attribs_gl_disabled_vertex_attrib failed unexpectedly 13.9510s:
  
  Traceback (most recent call last):
    _RunGpuTest at content\test\gpu\gpu_tests\gpu_integration_test.py:73
      self.RunActualGpuTest(url, *args)
    RunActualGpuTest at content\test\gpu\gpu_tests\webgl_conformance_integration_test.py:203
      getattr(self, test_name)(test_path, *args[1:])
    _RunConformanceTest at content\test\gpu\gpu_tests\webgl_conformance_integration_test.py:217
      self._CheckTestCompletion()
    _CheckTestCompletion at content\test\gpu\gpu_tests\webgl_conformance_integration_test.py:213
      self.fail(self._WebGLTestMessages(self.tab))
    fail at c:\b\depot_tools\python276_bin\lib\unittest\case.py:412
      raise self.failureException(msg)
  AssertionError: should be green
  at (0, 0) expected: 0,255,0,255 was 0,0,0,0
  FAIL should be green
  at (0, 0) expected: 0,255,0,255 was 0,0,0,0
  *** Error compiling VERTEX_SHADER '[object WebGLShader]':
  *** Error compiling FRAGMENT_SHADER '[object WebGLShader]':
  Error in compiling shader
  should be green
  at (0, 0) expected: 0,255,0,255 was 0,0,0,0
  FAIL should be green
  at (0, 0) expected: 0,255,0,255 was 0,0,0,0
  *** Error compiling VERTEX_SHADER '[object WebGLShader]':
  *** Error compiling FRAGMENT_SHADER '[object WebGLShader]':
  Error in compiling shader
  should be green
  at (0, 0) expected: 0,255,0,255 was 0,0,0,0
  FAIL should be green
  at (0, 0) expected: 0,255,0,255 was 0,0,0,0
  *** Error compiling VERTEX_SHADER '[object WebGLShader]':
  *** Error compiling FRAGMENT_SHADER '[object WebGLShader]':
  Error in compiling shader
  should be green
  at (0, 0) expected: 0,255,0,255 was 0,0,0,0
  FAIL should be green
  at (0, 0) expected: 0,255,0,255 was 0,0,0,0
  *** Error compiling VERTEX_SHADER '[object WebGLShader]':
  *** Error compiling FRAGMENT_SHADER '[object WebGLShader]':
  Error in compiling shader
  should be green


kbr@, can you briefly describe what this test checks? Can you think of anything compiler-specific in these tests? (Do these tests happen to use dEQP, and could this be related to issue 722345? That one is on Linux and this one is on Windows, so probably not.) We don't run these tests on any current clang/win bots. Could we just add them, or do they require special setup on a bot?
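
Since the output above is hard to read, here is a minimal sketch of how the repeated pixel-mismatch lines could be condensed into counts. The helper name `summarize_pixel_failures` and the regular expression are illustrative assumptions, not part of the GPU test harness. The sample text is copied from the log in this issue.

```python
import re
from collections import Counter

# Matches lines such as "at (0, 0) expected: 0,255,0,255 was 0,0,0,0".
PIXEL_RE = re.compile(r"at \((\d+), (\d+)\) expected: ([\d,]+) was ([\d,]+)")


def summarize_pixel_failures(log_text):
    """Count each distinct (position, expected RGBA, actual RGBA) mismatch.

    Hypothetical helper for reading the log; not a harness API.
    """
    counts = Counter()
    for match in PIXEL_RE.finditer(log_text):
        x, y, expected, actual = match.groups()
        key = (
            (int(x), int(y)),
            tuple(int(v) for v in expected.split(",")),
            tuple(int(v) for v in actual.split(",")),
        )
        counts[key] += 1
    return counts


# Excerpt copied from the failure output quoted in this issue.
sample = """\
AssertionError: should be green
at (0, 0) expected: 0,255,0,255 was 0,0,0,0
FAIL should be green
at (0, 0) expected: 0,255,0,255 was 0,0,0,0
*** Error compiling VERTEX_SHADER '[object WebGLShader]':
*** Error compiling FRAGMENT_SHADER '[object WebGLShader]':
"""

for (pos, expected, actual), n in summarize_pixel_failures(sample).items():
    print(f"{n}x at {pos}: expected {expected}, got {actual}")
```

An actual value of 0,0,0,0 (fully transparent black) on every check suggests that nothing was drawn at all, rather than that a wrong color was drawn.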
 

Comment 1 by kbr@chromium.org, May 16 2017

Cc: geoffl...@chromium.org jmad...@chromium.org
Looking more deeply into the logs, for example:
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.win%2Fwin_chromium_rel_ng%2F445423%2F%2B%2Frecipes%2Fsteps%2Fwebgl_conformance_tests_on_NVIDIA_GPU_on_Windows__with_patch__on_Windows-2008ServerR2-SP1%2F0%2Fstdout

The GPU process is crashing in multiple places. As a result, the WebGL context is lost, which is why the tests are failing.

A couple of the symbolized backtraces from test failures:

[3/859] gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_attribs_gl_bindAttribLocation_repeated passed 0.1360s
Backtrace:
	(No symbol) [0x659F54F3]
	(No symbol) [0x65AF27B7]
	(No symbol) [0x65A410F6]
	(No symbol) [0x65A3845F]
	(No symbol) [0x659FEC35]
	scoped_refptr<net::X509Certificate>::operator-> [0x6754459D+35]
	gpu::gles2::GLES2DecoderImpl::DoDrawElements [0x68D2E029+1051]
	gpu::gles2::GLES2DecoderImpl::HandleDrawElements [0x68D0712E+32]
	gpu::gles2::GLES2DecoderImpl::DoCommandsImpl<0> [0x68D21D01+225]
	gpu::CommandParser::ProcessCommands [0x68D4FBFD+65]
	gpu::CommandExecutor::PutChanged [0x68CFDB45+487]
	gpu::CommandBufferService::Flush [0x68CFED24+42]
	gpu::GpuCommandBufferStub::OnAsyncFlush [0x67C93AFE+440]
	IPC::MessageT<GpuCommandBufferMsg_AsyncFlush_Meta,std::tuple<int,unsigned int,std::vector<ui::LatencyInfo,std::allocator<ui::LatencyInfo> >,std::vector<gpu::SyncToken,std::allocator<gpu::SyncToken> > >,void>::Dispatch<gpu::GpuCommandBufferStub,gpu::GpuCom [0x67C93911+235]
	gpu::GpuCommandBufferStub::OnMessageReceived [0x67C929D6+2558]
	gpu::GpuChannel::HandleMessageHelper [0x67C896DF+45]
	gpu::GpuChannel::HandleMessageOnQueue [0x67C84A92+292]
	base::Callback<void __cdecl(void),0,0>::Run [0x668348D5+41]
	base::debug::TaskAnnotator::RunTask [0x67435BB4+452]



[4/859] gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_attribs_gl_disabled_vertex_attrib failed unexpectedly 24.1340s:
...
  Symbolized minidump:
  Last event: cf4.8a8: Access violation - code c0000005 (first/second chance not available)
    debugger time: Mon May 15 21:57:25.116 2017 (UTC - 7:00)
  ChildEBP RetAddr  Args to Child              
  0058e2bc 65af27b7 0058e61c 07a97e70 00009274 libglesv2!gl::ComputeVertexAttributeStride+0x10
  0058e30c 65a410f6 0058e61c 07a97e70 00009274 libglesv2!rx::StreamingVertexBufferInterface::storeDynamicAttribute+0x109
  0058e364 65a3845f 0058e61c 07a23624 03cff2e4 libglesv2!rx::VertexDataManager::storeCurrentValue+0xd4
  0058e3b4 659fec35 0058e61c 07a94ce0 00f30b78 libglesv2!rx::StateManager11::updateCurrentValueAttribs+0xd9
  0058e494 65a07838 0058e61c 07a94ce0 00000004 libglesv2!rx::Renderer11::applyVertexBuffer+0x7d
  0058e5d0 65a7f062 0058e61c 07a25120 00000004 libglesv2!rx::Renderer11::genericDrawElements+0x182
  0058e5fc 6599e3f6 0058e61c 00000004 00000006 libglesv2!rx::Context11::drawElements+0x24
  *** WARNING: Unable to verify checksum for chrome_child.dll
  0058e638 68d2e029 00000004 00000006 00001403 libglesv2!gl::Context::drawElements+0x7e
  0058e67c 68d0712e 69fd35b8 00000000 00000004 chrome_child!gpu::gles2::GLES2DecoderImpl::DoDrawElements+0x41b
  0058e6a0 68d21d01 00000000 081806c4 659e219a chrome_child!gpu::gles2::GLES2DecoderImpl::HandleDrawElements+0x20
  0058e790 68d4fbfd 00000014 081805f8 00000044 chrome_child!gpu::gles2::GLES2DecoderImpl::DoCommandsImpl<0>+0xe1
  0058e7bc 68cfdb45 00000014 02ed6788 0058e7dc chrome_child!gpu::CommandParser::ProcessCommands+0x41
  0058e8e4 68cfed24 06486160 0647db38 0058ea18 chrome_child!gpu::CommandExecutor::PutChanged+0x1e7
  0058e8f4 67c93afe 000001c2 06486ce4 00000000 chrome_child!gpu::CommandBufferService::Flush+0x2a
  0058ea18 67c93911 000001c2 0000000e 0058ea4c chrome_child!gpu::GpuCommandBufferStub::OnAsyncFlush+0x1b8
  0058eaa8 67c929d6 064dd6d0 0647db38 0647db38 chrome_child!IPC::MessageT<GpuCommandBufferMsg_AsyncFlush_Meta,std::tuple<int,unsigned int,std::vector<ui::LatencyInfo,std::allocator<ui::LatencyInfo> >,std::vector<gpu::SyncToken,std::allocator<gpu::SyncToken> > >,void>::Dispatch<gpu::GpuCommandBufferStub,gpu::GpuCommandBufferStub,void,void (__thiscall gpu::GpuCommandBufferStub::*)(int,unsigned int,std::vector<ui::LatencyInfo,std::allocator<ui::LatencyInfo> > const &,std::vector<gpu::SyncToken,std::allocator<gpu::SyncToken> > const &)>+0xeb
  0058ec10 67c896df 064dd6d0 063c1f28 0647db38 chrome_child!gpu::GpuCommandBufferStub::OnMessageReceived+0x9fe
  0058ec24 67c84a92 064dd6d0 02ed5dc8 0058ec54 chrome_child!gpu::GpuChannel::HandleMessageHelper+0x2d
  0058ed00 668348d5 064dd5e0 064dd5e0 16b6dffd chrome_child!gpu::GpuChannel::HandleMessageOnQueue+0x124


It looks like there are either codegen or calling convention problems preventing ANGLE from working correctly.
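
The unsymbolized addresses in the first backtrace can be compared against the RetAddr column of the symbolized minidump. Below is a minimal sketch of that comparison. The function names and regular expressions are illustrative assumptions, and the sample lines are copied from the two traces above. It relies on the usual debugger convention that the RetAddr on one row lies inside the function named on the next row. In the traces above this holds for `DoDrawElements`: `68d2e029` appears as a RetAddr one row above it, and `0x68D2E029` is its address in the first backtrace.

```python
import re

# Frames like "(No symbol) [0x65AF27B7]" or "name [0x68D2E029+1051]".
BACKTRACE_RE = re.compile(r"\[0x([0-9A-Fa-f]+)(?:\+\d+)?\]")
# Rows like "ChildEBP RetAddr args... module!symbol+0xNN".
MINIDUMP_RE = re.compile(
    r"^\s*[0-9a-f]{8} ([0-9a-f]{8}) .*? (\w+)!(.+?)\+0x[0-9a-f]+\s*$"
)


def backtrace_addresses(text):
    """Return the frame addresses of a '[0xADDR]' style backtrace."""
    return [int(m.group(1), 16) for m in BACKTRACE_RE.finditer(text)]


def minidump_frames(text):
    """Return (ret_addr, module, symbol) for each symbolized row."""
    frames = []
    for line in text.splitlines():
        m = MINIDUMP_RE.match(line)
        if m:
            frames.append((int(m.group(1), 16), m.group(2), m.group(3)))
    return frames


def label_backtrace(backtrace_text, minidump_text):
    """Attach a (module, symbol) pair to each backtrace address when known."""
    frames = minidump_frames(minidump_text)
    # The RetAddr on row i points into the function named on row i + 1.
    owner = {frames[i][0]: frames[i + 1][1:] for i in range(len(frames) - 1)}
    return [(a, owner.get(a)) for a in backtrace_addresses(backtrace_text)]


# Lines copied from the two traces in this comment.
BACKTRACE = """\
(No symbol) [0x659F54F3]
(No symbol) [0x65AF27B7]
(No symbol) [0x65A410F6]
"""
MINIDUMP = """\
0058e2bc 65af27b7 0058e61c 07a97e70 00009274 libglesv2!gl::ComputeVertexAttributeStride+0x10
0058e30c 65a410f6 0058e61c 07a97e70 00009274 libglesv2!rx::StreamingVertexBufferInterface::storeDynamicAttribute+0x109
0058e364 65a3845f 0058e61c 07a23624 03cff2e4 libglesv2!rx::VertexDataManager::storeCurrentValue+0xd4
"""

for addr, where in label_backtrace(BACKTRACE, MINIDUMP):
    print(f"0x{addr:08X}", "!".join(where) if where else "(unmatched)")
```

Matching addresses only suggest that the first backtrace passes through the same libglesv2 frames as the minidump. They are not proof that both traces come from the same crash.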

Comment 2 by kbr@chromium.org, May 16 2017

Also, these tests need a real GPU to be effective, which is why they run on physical hardware and are triggered by win_chromium_rel_ng (and its mac_ and linux_ counterparts).

Comment 3 by thakis@chromium.org, May 16 2017

Is ANGLE not used by any other tests? https://build.chromium.org/p/chromium.fyi/console?category=win%20clang has many test suites passing (browser_tests, unit_tests, etc.).

Comment 4 by kbr@chromium.org, May 16 2017

Those tests are all running with software rendering. The fact that the VMs running those tests can't boot the GPU process was the main reason for bringing up the physical GPU hardware.

Note that you can trigger the GPU tests (as defined by the test suites listed in src/content/test/gpu/generate_buildbot_json.py) from any waterfall builder that can invoke its tests on Swarming.

I believe the second crash (gl::ComputeVertexAttributeStride) should be fixed by https://chromium-review.googlesource.com/c/505928/. This will be rolled into Chrome today.

Comment 6 by thakis@chromium.org, May 16 2017

Is it ok if we add these tests to ~10 fairly slow-cycling bots? Compared to the CQ that should require negligible resources, right?

Comment 7 by kbr@chromium.org, May 16 2017

Yes, that should be fine. It would be ideal if you could generate those bots' JSON files from that generate_buildbot_json.py script, as was recently done for the client.v8.fyi waterfall, so they don't get out of sync. Would you be willing to do that?

Also, in https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromium_tests/chromium_gpu_fyi.py, for example, we use the 'serialize_tests' property so that several of the waterfall bots don't trigger all of the GPU tests in parallel, which reduces load on the physical hardware. If cycle time isn't critical for your ~10 bots, could we do the same on them?

Comment 8 by thakis@chromium.org, May 16 2017

The first stack is probably here https://cs.chromium.org/chromium/src/gpu/command_buffer/service/gles2_cmd_decoder.cc?q=DoDrawElements+package:%5Echromium$&dr=CSs&l=10291 given that feature_info_ is a scoped_refptr. But from what I can see, that's always allocated here https://cs.chromium.org/chromium/src/gpu/ipc/service/gpu_command_buffer_stub.cc?dr=CSs&l=577 and then passed through, so I'm not sure how it'd end up corrupted.

Comment 9 by thakis@chromium.org, May 16 2017

Blockedon: 721783
Owner: jiawei.s...@intel.com
Status: Fixed (was: Untriaged)
Looks like https://chromium-review.googlesource.com/c/505928/ fixed this.
