v8 roll causing some WebGL conformance tests to time out.
Issue description

Some WebGL2 conformance tests are consistently timing out on the FYI waterfall. Examples:

Win7 x64 Release (NVIDIA) times out on WebglConformance_deqp_functional_gles3_shaderoperator_common_fucntions:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20x64%20Release%20%28NVIDIA%29/builds/6228

Linux Debug (New Intel) times out on WebglConformance_deqp_data_gles2_shaders_swizzles:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Linux%20Debug%20%28New%20Intel%29/builds/3277

Android Release (Nexus 5X) times out on WebglConformance_deqp_data_gles2_shaders_swizzles:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Android%20Release%20%28Nexus%205X%29/builds/1567

Android Release (Nexus 6P) times out on WebglConformance_deqp_data_gles2_shaders_swizzles:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Android%20Release%20%28Nexus%206P%29/builds/1322

The common factor appears to be this v8 roll: https://codereview.chromium.org/2225833002
,
Aug 8 2016
I would guess that there aren't good trybots for covering this because the roll made it through the CQ.

The most common failure (WebglConformance_deqp_data_gles2_shaders_swizzles) is running the test on this page:
https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles2/shaders/swizzles.html?webglVersion=1&quiet=0

To run it via command line, you can run:

./content/test/gpu/run_gpu_test.py webgl_conformance --browser=exact --browser-executable=%BROWSER_NAME% --webgl-conformance-version=1.0.4 --story-filter=deqp_data_gles2_shaders_swizzles

where %BROWSER_NAME% is "Debug", "Release", "Canary", etc.
,
Aug 8 2016
Kevin, could you take a look at this? You've been looking at other issues in these tests. Feel free to assign back to me if needed.
,
Aug 8 2016
The command line above will run the WebGL 1.0 tests, not the WebGL 2.0 tests. To run those:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=exact --browser-executable=%BROWSER_NAME% --webgl-conformance-version=2.0.0 --test-filter=WebglConformance_deqp_functional_gles3_shaderoperator_common_fucntions

(sorry for the typo in the test name, but it's correct)

To run the WebGL 1.0 tests:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=exact --browser-executable=%BROWSER_NAME% --webgl-conformance-version=1.0.4 --test-filter=deqp_data_gles2_shaders_swizzles

I think this is higher priority than P3. Raising it to P1.

Good work, Geoff, tracking it down to that particular V8 roll. I think it is warranted to disable the V8 autoroller and roll V8 back to the revision before that roll. That is the procedure used when regressions make it into the tree.
,
Aug 8 2016
Disabled the v8 auto-roll and submitted a CL to roll back the v8 revision here https://codereview.chromium.org/2227553004/
,
Aug 8 2016
CC danno. I generally don't support rolling back V8 because of FYI bots. This basically brings all critical V8 development to a full stop (as we can only permit a certain number of critical CLs per roll and per canary). When V8 is rolled back, we need the necessary tools (e.g. trybots or machines with the right hardware) and guidance to reproduce the problems and quickly find a V8-side culprit. E.g., will this reproduce on a random developer workstation? If littledan doesn't find a solution during PST working hours, more help from the gpu team would be very welcome.
,
Aug 8 2016
Assuming you have a release build, a better command line to use might be, for the WebGL 2.0 tests:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=release --webgl-conformance-version=2.0.0 --test-filter=WebglConformance_deqp_functional_gles3_shaderoperator_common_fucntions

or for the WebGL 1.0 tests:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=release --webgl-conformance-version=1.0.4 --test-filter=deqp_data_gles2_shaders_swizzles
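
For anyone scripting these runs, here is a minimal sketch of how the two release-build command lines could be assembled. The helper name `build_conformance_command` is hypothetical and not part of Chromium; only the script path and flags quoted in this thread are used, and the sketch prints the commands rather than executing them.

```python
import shlex

# Script path and flag names are copied from the commands quoted in this thread.
RUNNER = "./content/test/gpu/run_gpu_integration_test.py"


def build_conformance_command(version, test_filter, browser="release"):
    """Return the argv list for one WebGL conformance run (hypothetical helper)."""
    return [
        RUNNER,
        "webgl_conformance",
        f"--browser={browser}",
        f"--webgl-conformance-version={version}",
        f"--test-filter={test_filter}",
    ]


# The test name below keeps the "fucntions" typo because that is the real name.
webgl2 = build_conformance_command(
    "2.0.0",
    "WebglConformance_deqp_functional_gles3_shaderoperator_common_fucntions",
)
webgl1 = build_conformance_command("1.0.4", "deqp_data_gles2_shaders_swizzles")

if __name__ == "__main__":
    # Print only; actually running these requires a Chromium checkout and build.
    print(shlex.join(webgl2))
    print(shlex.join(webgl1))
```

Returning an argv list instead of a single string means the result could be passed straight to `subprocess.run` from the root of a checkout without extra quoting.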
,
Aug 8 2016
I understand the predicament this puts the V8 team in, but please understand the situation our team is in as well. Our tests have randomly started to time out, they run only on the FYI waterfall, and we don't know how high priority an issue this is for the V8 team to investigate. You have my full support and I stand ready to help you try to figure out what happened.
,
Aug 8 2016
,
Aug 9 2016
Update: sunnyps@ is actively trying to reproduce this on Windows (thanks Sunny). We've also discussed, and he's investigated, some other assertion failures in the memory allocator during V8's compilation which seem unrelated, so we're ignoring those for the moment and focusing on the timeouts.
,
Aug 9 2016
As far as I can tell the issue manifests prior to the above-mentioned roll, so I bisected on Chromium and then V8. The issue actually appears to have come from a previous V8 roll, https://codereview.chromium.org/2223473002. The commit specifically at issue is https://codereview.chromium.org/2180273002 (author cc'd), which causes at least a factor-of-two slowdown on 'swizzles' on my workstation. This is also backed up by that commit's message, which suggests it's a risk for performance regression. jkummerow, do you want to take a look?
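
As an illustration of the bisection idea only (the thread does not say which tool was used), here is a minimal binary-search sketch. The revision numbers, timings, and the 1.5x slowdown threshold are all made up; it assumes the first revision is known good, the last is known bad, and the slowdown persists once introduced.

```python
def first_bad_revision(revisions, is_bad):
    """Return the first revision for which is_bad() is true.

    Assumes revisions[0] is good, revisions[-1] is bad, and every revision
    after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Hypothetical data: test runtime in seconds per (made-up) revision number.
timings = {rev: 60 for rev in range(100, 106)}
timings.update({rev: 130 for rev in range(106, 110)})
baseline = timings[100]


def is_slow(rev):
    # Treat anything more than 1.5x the known-good baseline as a regression.
    return timings[rev] > 1.5 * baseline


culprit = first_bad_revision(sorted(timings), is_slow)
```

In practice `is_slow` would build the revision and time the 'swizzles' test, so each probe is expensive; halving the range each time keeps the number of builds logarithmic in the number of candidate commits.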
,
Aug 9 2016
+cc jkummerow@
,
Aug 9 2016
,
Aug 9 2016
Thanks for your help and the bisection. Assigning to jkummerow. I'll hold off on reverting that CL until it's clear whether any other CLs build upon it.
,
Aug 9 2016
Running an experiment to compare the optional_gpu trybots on https://codereview.chromium.org/2224283002/ with the same on a whitespace CL: https://codereview.chromium.org/1339923005/ - though I couldn't tell from this bug whether the optional gpu_tests will show the issue.
,
Aug 9 2016
Thanks Kevin and Dan. kbr@, I am fully in support of what machenbach@ said. WebGL-related reverts are currently V8's main source of reverts. I am wondering what changed in the past 3 months? We already added a bunch of WebGL trybots to our roller because of this. Maybe it makes sense to provide us with a special trybot which runs a few tests that V8 is susceptible to or that V8 broke in the past?
,
Aug 9 2016
Revert in CQ: https://codereview.chromium.org/2222273003/
,
Aug 9 2016
We should create a separate artifact, e.g. a doc, to discuss what we could do better in terms of webgl and v8.
,
Aug 9 2016
The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/d9d719e7a89dfcb90365404574f452009f2cb409

commit d9d719e7a89dfcb90365404574f452009f2cb409
Author: hablich <hablich@chromium.org>
Date: Tue Aug 09 07:17:27 2016

Revert of [KeyedLoadIC] Support Smi "handlers" for element loads (patchset #5 id:80001 of https://codereview.chromium.org/2180273002/ )

Reason for revert:
Times out webgl errors: https://bugs.chromium.org/p/chromium/issues/detail?id=635545

Original issue's description:
> [KeyedLoadIC] Support Smi "handlers" for element loads
>
> This is an experiment as far as performance is concerned. If Smi-configured
> element loading directly from the dispatcher stub is fast enough, then we
> can stop compiling LoadFastElementStubs (and drop the corresponding code).
>
> Committed: https://crrev.com/c9308147b341596de2733039223918a6202afa5f
> Cr-Commit-Position: refs/heads/master@{#38377}

BUG= chromium:635545
TBR=ishell@chromium.org,jkummerow@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.

Review-Url: https://codereview.chromium.org/2222273003
Cr-Commit-Position: refs/heads/master@{#38473}

[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/code-stub-assembler.cc
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/code-stub-assembler.h
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/field-index-inl.h
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/field-index.h
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/ic/handler-compiler.cc
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/ic/handler-compiler.h
[delete] https://crrev.com/e7609ecb014a46e1472939a43e1336c84007614f/src/ic/handler-configuration.h
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/ic/ic.cc
,
Aug 9 2016
On Win and Mac my experiment was not really conclusive, but on Linux the revert of the aforementioned CL seems to improve the runtime from 15 to 7 minutes: https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_optional_gpu_tests_rel/builds/1904 I checked, and it was 7 minutes before the recent v8 roll. Only the last several builds took 15 minutes on the webgl2_conformance_test.
,
Aug 9 2016
,
Aug 9 2016
hablich@, machenbach@: I understand your concern about the WebGL tests being a major source of V8 reverts. I hope you appreciate our predicament, where our team's tests run the most JavaScript by volume of any tests on Chromium's waterfall, so we must aggressively triage and remove flakiness that is introduced in order to keep our tests running reliably. I think there are several steps that might be taken to improve visibility of problems before they reach the Chromium tree. Let's discuss offline and/or via a Google Doc.
,
Aug 10 2016
Thanks kbr! And thanks for your team's help to investigate this issue!
Comment 1 by hablich@chromium.org, Aug 8 2016
Owner: littledan@chromium.org
Status: Assigned (was: Available)