
Issue 795942


Issue metadata

Status: Duplicate
Merged: issue 797368
Owner:
Closed: Jul 20
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug

Blocked on:
issue 797368




Flaky pixel_test and trace_test timeouts on Mac Intel GPU bots

Project Member Reported by ynovikov@chromium.org, Dec 18 2017

Issue description

Observed this on the mac_chromium_rel_ng trybot and the Mac Release (Intel) GPU.FYI bot. On mac_chromium_rel_ng these are real flakes, i.e. the next build for the same CL/patchset was green. Strangely, pixel_test timeouts are observed only on the trybot, not on the GPU.FYI bot.

https://ci.chromium.org/buildbot/chromium.gpu.fyi/Mac%20Release%20%28Intel%29/10227
TraceTest_WebGLGreenTriangle_AA_NoAlpha
https://ci.chromium.org/buildbot/chromium.gpu.fyi/Mac%20Release%20%28Intel%29/10208
TraceTest_2DCanvasWebGL
https://ci.chromium.org/buildbot/chromium.gpu.fyi/Mac%20Release%20%28Intel%29/10159
TraceTest_WebGLGreenTriangle_AA_Alpha

https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/614398
TraceTest_2DCanvasWebGL
https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/614409
TraceTest_WebGLGreenTriangle_NoAA_Alpha

https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/614319
Pixel_WebGLGreenTriangle_NonChromiumImage_AA_Alpha
https://ci.chromium.org/buildbot/tryserver.chromium.mac/mac_chromium_rel_ng/614351
Pixel_WebGLTransparentGreenTriangle_NoAlpha_ImplicitClear

Not sure how to proceed with this. The tests that time out seem random to me, so I don't see much point in marking them flaky.
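To make the pattern in the links above easier to see, here is a minimal Python sketch that tallies the listed timeouts by bot and by suite. The tuples are copied from this report; the helper name `suite_of` and the grouping by name prefix are illustrative choices, not part of any Chromium tooling.

```python
from collections import Counter

# (bot, build number, timed-out test) tuples copied from the links in this report.
TIMEOUTS = [
    ("Mac Release (Intel)", 10227, "TraceTest_WebGLGreenTriangle_AA_NoAlpha"),
    ("Mac Release (Intel)", 10208, "TraceTest_2DCanvasWebGL"),
    ("Mac Release (Intel)", 10159, "TraceTest_WebGLGreenTriangle_AA_Alpha"),
    ("mac_chromium_rel_ng", 614398, "TraceTest_2DCanvasWebGL"),
    ("mac_chromium_rel_ng", 614409, "TraceTest_WebGLGreenTriangle_NoAA_Alpha"),
    ("mac_chromium_rel_ng", 614319,
     "Pixel_WebGLGreenTriangle_NonChromiumImage_AA_Alpha"),
    ("mac_chromium_rel_ng", 614351,
     "Pixel_WebGLTransparentGreenTriangle_NoAlpha_ImplicitClear"),
]


def suite_of(test_name):
    """Return the suite prefix of a test name, e.g. 'Pixel' or 'TraceTest'."""
    return test_name.split("_", 1)[0]


# How many timeouts each bot saw per suite, and how often each test repeats.
by_bot_and_suite = Counter((bot, suite_of(test)) for bot, _, test in TIMEOUTS)
by_test = Counter(test for _, _, test in TIMEOUTS)

if __name__ == "__main__":
    for (bot, suite), count in sorted(by_bot_and_suite.items()):
        print(f"{bot:22} {suite:10} {count}")
    print("most repeated test:", by_test.most_common(1)[0])
```

With only seven samples this is descriptive rather than statistical: Pixel timeouts appear only on the trybot, and only one test name repeats across the listed builds.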

Looks like this may have been going on for a long time; see issue 748110. That could be a different failure mode, though; there is not enough info on that bug.
 

Comment 1 by kbr@chromium.org, Dec 19 2017

Components: Blink>WebGL
One configuration difference between the trybots and waterfall bots (which I think should be eliminated, but it's not an easy decision) is that the trybots run with dcheck_always_on=true and the waterfall bots don't.

The trace_test failures are just reflections of the pixel_test failures.

It's suspicious that all of the failures are of the WebGL pixel tests. Let me check that test's source code and see if there's any obvious race condition.

Comment 2 by kbr@chromium.org, Dec 19 2017

Cc: perezju@chromium.org nedngu...@google.com
The tests' specifications are in src/content/test/gpu/gpu_tests/pixel_test_pages.py, which refer to web pages in src/content/test/data/gpu/. They are all of the form:

<script src="pixel_webgl_util.js"></script>

<script>
var main = makeMain(true, true);
</script>
</head>
<body onload="main()">
...
</body>


So it depends on the onload handler working reliably, as well as requestAnimationFrame.

It also depends on Telemetry's script_to_evaluate_on_commit feature working as expected -- the script has to be committed at just the right time, before the page's onload handler is run.

There's no good reason that the WebGL tests in this directory should fail and for example the similarly structured Canvas2D tests shouldn't.

Ned, Juan: are we relying on brittle / fragile primitives that are inherently racy?
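The ordering dependency described in this comment can be illustrated with a toy Python model. This is not Telemetry or Chromium code: the event names `inject` and `onload`, the function `run_page`, and the assumption that a report made before the harness exists is lost (so the test waits until it times out) are all hypothetical simplifications.

```python
def run_page(events):
    """Replay page lifecycle events in order and report the test outcome.

    'inject' stands for the harness script being evaluated on commit;
    'onload' stands for the page's onload handler reporting its result.
    Both names are hypothetical labels for this toy model.
    """
    harness_installed = False
    result_seen = False
    for event in events:
        if event == "inject":
            harness_installed = True
        elif event == "onload":
            # Assumption: the page can only report to a harness that already
            # exists; a report made earlier is lost and never retried.
            if harness_installed:
                result_seen = True
        else:
            raise ValueError(f"unknown event: {event}")
    return "passed" if result_seen else "timed out"


if __name__ == "__main__":
    print(run_page(["inject", "onload"]))  # expected ordering
    print(run_page(["onload", "inject"]))  # injection arrives too late
```

Under this model only the `inject`-then-`onload` ordering completes, which is why a racy injection primitive would show up as a timeout rather than as an image mismatch.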

Comment 3 by kbr@chromium.org, Dec 19 2017

Status: Available (was: Unconfirmed)
This is how script_to_evaluate_on_commit is implemented by Telemetry:
https://github.com/catapult-project/catapult/blob/e3b4c57dcbbb729558d5d41fc2154d6f6a69ceae/telemetry/telemetry/internal/backends/chrome_inspector/inspector_page.py#L49

Not sure how those primitives are then evaluated by Chrome itself, or whether there is potential for races.
Cc: pfeldman@chromium.org
Pavel may know how 'Page.addScriptToEvaluateOnLoad' is implemented.
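For readers unfamiliar with the DevTools protocol, the following Python sketch shows roughly what a 'Page.addScriptToEvaluateOnLoad' request looks like on the wire. It is an illustration, not Telemetry's implementation (see the linked inspector_page.py for that): the function name and the explicit `request_id` argument are hypothetical, and the `scriptSource` parameter name is taken from the DevTools protocol definition as best I recall it.

```python
import json


def build_add_script_request(request_id, script_source):
    """Serialize a DevTools 'Page.addScriptToEvaluateOnLoad' request.

    request_id: integer the client uses to match the eventual response.
    script_source: JavaScript text to be evaluated in each newly loaded page.
    """
    message = {
        "id": request_id,
        "method": "Page.addScriptToEvaluateOnLoad",
        # Assumed parameter name, per the DevTools protocol definition.
        "params": {"scriptSource": script_source},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Hypothetical harness script, used only to show the message shape.
    print(build_add_script_request(1, "window.harnessReady = true;"))
```

The request only registers the script; whether it is guaranteed to run before the page's own onload handler depends on how Chrome handles it, which is the question raised in this comment.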

Comment 6 by kbr@chromium.org, Dec 20 2017

Components: Platform>DevTools>Platform
Any DevTools folks still around who might be able to comment on the robustness of Page.addScriptToEvaluateOnLoad?

Comment 7 by piman@chromium.org, Feb 5 2018

Is this P1? If so we need an owner.

Comment 8 by kbr@chromium.org, Feb 5 2018

Blockedon: 797368
Labels: -Pri-1 Pri-2
Owner: kbr@chromium.org
Status: Assigned (was: Available)
I think this is likely a duplicate of Issue 797368. We've been seeing random test failures mostly on the Mac Intel bots which are likely either a renderer process hang (on the IO thread) or a browser process hang. Need to make more changes to the Telemetry harness and probably base's child process teardown code so that we can get a symbolized minidump when this happens.

Blockedon: -797368
Mergedinto: 797368
Status: Duplicate (was: Assigned)
Per resolution of Issue 797368 I don't see evidence of these timeouts on the trybots any more. Duplicating this into the other bug, which has been closed as WontFix.

Blockedon: 797368
