Issue 635545 link

Starred by 1 user

Issue metadata

Status: Duplicate
Owner:
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 1
Type: Bug-Regression

Blocking:
issue 596622
issue 619264



v8 roll causing some WebGL conformance tests to time out.

Project Member Reported by geoffl...@chromium.org, Aug 8 2016

Issue description

Some WebGL2 conformance tests are consistently timing out on the FYI waterfall.

Examples:
Win7 x64 Release (NVIDIA) times out on WebglConformance_deqp_functional_gles3_shaderoperator_common_fucntions : https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20x64%20Release%20%28NVIDIA%29/builds/6228

Linux Debug (New Intel) times out on WebglConformance_deqp_data_gles2_shaders_swizzles : https://build.chromium.org/p/chromium.gpu.fyi/builders/Linux%20Debug%20%28New%20Intel%29/builds/3277

Android Release (Nexus 5X) times out on WebglConformance_deqp_data_gles2_shaders_swizzles : https://build.chromium.org/p/chromium.gpu.fyi/builders/Android%20Release%20%28Nexus%205X%29/builds/1567

Android Release (Nexus 6P) times out on WebglConformance_deqp_data_gles2_shaders_swizzles : https://build.chromium.org/p/chromium.gpu.fyi/builders/Android%20Release%20%28Nexus%206P%29/builds/1322

The common factor appears to be this v8 roll: https://codereview.chromium.org/2225833002
 
Cc: machenb...@chromium.org
Owner: littledan@chromium.org
Status: Assigned (was: Available)
Dan, the timeout seems to occur deterministically after the roll. Can you please bisect? See https://www.chromium.org/developers/testing/webgl-conformance-tests for more information.

@geofflang are there trybots to easily repro this?
I would guess that there aren't good trybots for covering this because the roll made it through the CQ.  The most common failure (WebglConformance_deqp_data_gles2_shaders_swizzles) is running the test on this page: https://www.khronos.org/registry/webgl/sdk/tests/deqp/data/gles2/shaders/swizzles.html?webglVersion=1&quiet=0

To run it via command line, you can run:

./content/test/gpu/run_gpu_test.py webgl_conformance --browser=exact --browser-executable=%BROWSER_NAME% --webgl-conformance-version=1.0.4 --story-filter=deqp_data_gles2_shaders_swizzles

where %BROWSER_NAME% is "Debug", "Release", "Canary", etc
Cc: bakkot@chromium.org
Kevin, could you take a look at this? You've been looking at other issues in these tests. Feel free to assign back to me if needed.

Comment 4 by kbr@chromium.org, Aug 8 2016

Labels: -Type-Bug -Pri-3 OS-All Pri-1 Type-Bug-Regression
The command line above will run the WebGL 1.0 tests, not the WebGL 2.0 tests. To run those:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=exact --browser-executable=%BROWSER_NAME% --webgl-conformance-version=2.0.0 --test-filter=WebglConformance_deqp_functional_gles3_shaderoperator_common_fucntions

(sorry for the typo in the test name, but it's correct)

To run the WebGL 1.0 tests:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=exact --browser-executable=%BROWSER_NAME% --webgl-conformance-version=1.0.4 --test-filter=deqp_data_gles2_shaders_swizzles

I think this is higher priority than P3. Raising it to P1. Good work Geoff tracking it down to that particular V8 roll. I think it is warranted to disable the V8 autoroller and roll V8 back to the revision before that roll. That is the procedure used when regressions make it into the tree.

Comment 5 by kbr@chromium.org, Aug 8 2016

Blocking: 596622 619264
Disabled the v8 auto-roll and submitted a CL to roll back the v8 revision here https://codereview.chromium.org/2227553004/
Cc: danno@chromium.org
CC danno. I generally don't support rolling back V8 because of FYI bots. A rollback effectively halts all critical V8 development (as we can only permit a certain number of critical CLs per roll and per canary).

When V8 is rolled back we need the necessary tools (e.g. trybots or machines with the right hardware) and guidance to reproduce the problems and quickly find a V8-side culprit. E.g. will this reproduce on a random developer workstation?

If littledan doesn't find a solution during PST time, more help from the gpu team would be very welcome.

Comment 8 by kbr@chromium.org, Aug 8 2016

Assuming you have a release build, a better command line to use might be, for the WebGL 2.0 tests:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=release --webgl-conformance-version=2.0.0 --test-filter=WebglConformance_deqp_functional_gles3_shaderoperator_common_fucntions

or for the WebGL 1.0 tests:

./content/test/gpu/run_gpu_integration_test.py webgl_conformance --browser=release --webgl-conformance-version=1.0.4 --test-filter=deqp_data_gles2_shaders_swizzles
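
For anyone scripting repeated runs, the two command lines above differ only in the conformance version and the test filter. A minimal sketch of assembling them, assuming the script path and flags exactly as quoted in this comment (the names `build_command`, `RUNNER` and `VERSIONS` are hypothetical and not part of the Chromium tree):

```python
import shlex

# Script path and flag names are copied from the commands quoted in this thread.
# RUNNER, VERSIONS and build_command are hypothetical helper names.
RUNNER = "./content/test/gpu/run_gpu_integration_test.py"

# WebGL major version -> conformance suite version used in this thread.
VERSIONS = {1: "1.0.4", 2: "2.0.0"}


def build_command(webgl_major, test_filter, browser="release"):
    """Return the argv list for one webgl_conformance run."""
    if webgl_major not in VERSIONS:
        raise ValueError(f"unsupported WebGL major version: {webgl_major}")
    return [
        RUNNER,
        "webgl_conformance",
        f"--browser={browser}",
        f"--webgl-conformance-version={VERSIONS[webgl_major]}",
        f"--test-filter={test_filter}",
    ]


# The test name keeps its original misspelling ("fucntions").
cmd = build_command(
    2, "WebglConformance_deqp_functional_gles3_shaderoperator_common_fucntions"
)
print(shlex.join(cmd))
```

The printed line should match the WebGL 2.0 command above; passing `1` and `deqp_data_gles2_shaders_swizzles` gives the WebGL 1.0 one.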

Comment 9 by kbr@chromium.org, Aug 8 2016

I understand the predicament this puts the V8 team in, but please understand the situation our team is in as well. Our tests have randomly started to time out, they run only on the FYI waterfall, and we don't know how high priority an issue this is for the V8 team to investigate.

You have my full support and I stand ready to help you try to figure out what happened.

Labels: Hotlist-PixelWrangler

Comment 11 by kbr@chromium.org, Aug 9 2016

Update: sunnyps@ is actively trying to reproduce this on Windows (thanks Sunny). We've also discussed, and he's investigated, some other assertion failures in the memory allocator during V8's compilation which seem unrelated, so we're ignoring those for the moment and focusing on the timeouts.

Comment 12 by bakkot@google.com, Aug 9 2016

As far as I can tell the issue manifests prior to the above-mentioned roll, so I bisected on Chromium and then V8.

The issue actually appears to have been from a previous V8 roll, https://codereview.chromium.org/2223473002. The commit specifically at issue is https://codereview.chromium.org/2180273002 (author cc'd), which causes at least a factor-of-two slowdown on 'swizzles' on my workstation. This is also backed up by that commit's message, which suggests it's a risk for performance regression.

jkummerow, do you want to take a look?
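
A timing-based bisection like the one described above could be automated with `git bisect run`, which treats exit code 0 as good, 125 as "skip this revision", and other codes from 1 to 127 as bad. The following is only a sketch of the pass/fail decision, not the script actually used here: the helper names, the baseline value and the 2x threshold are assumptions taken from the "factor-of-two" observation.

```python
import subprocess
import time

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def verdict(elapsed_s, baseline_s, factor=2.0):
    """Classify one timed run.

    elapsed_s is None when the run hit the timeout, which counts as bad.
    A run is bad when it takes at least `factor` times the baseline.
    """
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    if elapsed_s is None or elapsed_s >= factor * baseline_s:
        return BAD
    return GOOD


def time_command(argv, timeout_s):
    """Run argv and return wall-clock seconds, or None if it timed out."""
    start = time.monotonic()
    try:
        subprocess.run(argv, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start
```

A bisect script would build the browser at the current revision (returning `SKIP` if the build fails), time the 'swizzles' test with `time_command`, and exit with `verdict(...)`, using a baseline measured once on a known-good revision.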

Comment 13 by bakkot@google.com, Aug 9 2016

Cc: jkummerow@chromium.org
+cc jkummerow@
Cc: ishell@chromium.org bmeu...@chromium.org
Owner: jkummerow@chromium.org
Thanks for your help and the bisection. Assigning to jkummerow.

I'll hold off on reverting that CL until it's clear whether any other CLs build upon it.
Running an experiment to compare the optional_gpu trybots on https://codereview.chromium.org/2224283002/ with the same on a whitespace CL:
https://codereview.chromium.org/1339923005/ - though I couldn't tell from this bug whether the optional gpu_tests will show the issue.
Thanks Kevin and Dan. 

kbr@ I fully support what machenbach@ said. WebGL related reverts are currently V8's main source of reverts. I am wondering what changed in the past 3 months? We already added a bunch of webgl trybots to our roller because of this. Maybe it makes sense to provide us with a special trybot which runs a few tests that V8 is susceptible to or V8 broke in the past? 
We should create a separate artifact, e.g. a doc, to discuss what we could do better in terms of webgl and v8.
Comment 20 by bugdroid1@chromium.org, Aug 9 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/d9d719e7a89dfcb90365404574f452009f2cb409

commit d9d719e7a89dfcb90365404574f452009f2cb409
Author: hablich <hablich@chromium.org>
Date: Tue Aug 09 07:17:27 2016

Revert of [KeyedLoadIC] Support Smi "handlers" for element loads (patchset #5 id:80001 of https://codereview.chromium.org/2180273002/ )

Reason for revert:
Times out webgl errors: https://bugs.chromium.org/p/chromium/issues/detail?id=635545

Original issue's description:
> [KeyedLoadIC] Support Smi "handlers" for element loads
>
> This is an experiment as far as performance is concerned. If Smi-configured
> element loading directly from the dispatcher stub is fast enough, then we
> can stop compiling LoadFastElementStubs (and drop the corresponding code).
>
> Committed: https://crrev.com/c9308147b341596de2733039223918a6202afa5f
> Cr-Commit-Position: refs/heads/master@{#38377}

BUG= chromium:635545 
TBR=ishell@chromium.org,jkummerow@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.

Review-Url: https://codereview.chromium.org/2222273003
Cr-Commit-Position: refs/heads/master@{#38473}

[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/code-stub-assembler.cc
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/code-stub-assembler.h
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/field-index-inl.h
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/field-index.h
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/ic/handler-compiler.cc
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/ic/handler-compiler.h
[delete] https://crrev.com/e7609ecb014a46e1472939a43e1336c84007614f/src/ic/handler-configuration.h
[modify] https://crrev.com/d9d719e7a89dfcb90365404574f452009f2cb409/src/ic/ic.cc

Regarding win and mac my experiment was not really conclusive, but on linux, the revert of the aforementioned CL seems to improve the runtime from 15 to 7 minutes:
https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_optional_gpu_tests_rel/builds/1904

I checked and it was 7 minutes before the recent v8 roll. Only the last several builds took 15 minutes on the webgl2_conformance_test.
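
As a rough cross-check, the 15-minute versus 7-minute step times quoted here are in line with the "at least factor-of-two" slowdown bisected earlier in this thread. A tiny sketch of the arithmetic (the `slowdown` helper is hypothetical, and the figures cover the whole webgl2_conformance_test step rather than a single test):

```python
def slowdown(before_min, after_min):
    """Ratio of the slower runtime to the faster one."""
    return after_min / before_min


# 7 minutes before the roll (and after the revert), 15 minutes with the CL in.
ratio = slowdown(7, 15)
print(f"{ratio:.2f}x")  # prints "2.14x"
```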
Mergedinto: 634996
Status: Duplicate (was: Assigned)

Comment 23 by kbr@chromium.org, Aug 9 2016

hablich@, machenbach@: I understand your concern about the WebGL tests being a major source of V8 reverts. I hope you appreciate our predicament, where our team's tests run the most JavaScript by volume of any tests on Chromium's waterfall, so we must aggressively triage and remove flakiness that is introduced in order to keep our tests running reliably.

I think there are several steps that might be taken to improve visibility of problems before they reach the Chromium tree. Let's discuss offline and/or via a Google Doc.

Thanks kbr! And thanks for your team's help to investigate this issue!