Components: Blink>SVG
Labels: -OS-Linux -OS-Windows OS-All
Owner: pdr@chromium.org
Status: Assigned (was: Untriaged)
Summary: Some css3/filter/effect-*-hw.html are flaky (was: css3/filters/effect-hue-rotate-hw.html fails on Win7 and Linux)
Other css3/filters/effect-*-hw.html tests are also flaky, and not only on the platforms where the tests were rebaselined (http://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=css3%2Ffilters%2Feffect-):
css3/filters/effect-brightness-clamping-hw.html
css3/filters/effect-brightness-hw.html
css3/filters/effect-hue-rotate-hw.html
css3/filters/effect-saturate-hw.html
css3/filters/effect-sepia-hw.html
pdr@ can you take a look?
The difference between the images is definitely what I would expect to see when the optimization was and wasn't running. I'll take a look and see if I can understand why it's being flakily applied. At worst, maybe we need a layout tests setting to run in a consistent mode.
This is extremely odd. Running locally, I get four flaky failures:
effect-sepia-hw
effect-hue-rotate-hw
effect-saturate-hw
effect-contrast-hw
effect-contrast-hw isn't on the flakiness dashboard, but should be. I can't repro effect-brightness-hw or effect-brightness-clamping-hw.
Printf debugging suggests that my optimization in the GL renderer is running every time, and every time there is a single color filter at the root of the DAG. It's possible that this is just exposing preexisting flakiness, and it's not clear to me yet where this is coming from.
It does appear to be my change. If I comment out the render pass skipping optimization and rebaseline all the tests, then they pass consistently through a number of retries.
The flakes appear similar to the rebaselines, where the cause is slightly different texture sampling of a bitmap that's being stretched (whereas before the optimization the render pass was sampled 1:1, texel to screen pixel). The images are not perceptually different when there are failures.
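To illustrate why a stretched bitmap is more fragile than a 1:1 blit, here is a minimal sketch of linear texture filtering on a single row of texels. It is not cc or Mesa code: the function names, the 1D simplification, and the clamp-to-edge addressing are all assumptions made for the example.

```python
import math

def sample_linear(texels, u):
    """Linearly filter a 1D row of texels at normalized coordinate u in [0, 1]."""
    n = len(texels)
    # Move to texel space; texel centers sit at i + 0.5.
    x = u * n - 0.5
    i0 = math.floor(x)
    frac = x - i0
    # Clamp-to-edge addressing (an assumption for this sketch).
    a = texels[min(max(i0, 0), n - 1)]
    b = texels[min(max(i0 + 1, 0), n - 1)]
    return a * (1.0 - frac) + b * frac

def draw(texels, out_width):
    """Sample once per output pixel center and round to 8-bit values."""
    return [round(sample_linear(texels, (px + 0.5) / out_width))
            for px in range(out_width)]

row = [0, 255, 0, 255]
one_to_one = draw(row, 4)   # every sample lands exactly on a texel center
stretched = draw(row, 6)    # samples land between texels and get blended
```

In the 1:1 case the output reproduces the texels exactly, so small differences in how coordinates are computed have no visible effect. In the stretched case the output depends on the interpolation weights, and a sample near a rounding boundary (for example `u = 0.25` here) can land on either side of it after a tiny perturbation, which would be enough to fail a pixel-exact comparison.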
With my patch, the texture uniforms appear to be identical when the tests pass and fail. I am not able to discern any different behavior inside of cc between passing and failing tests. Skia filters aren't used. These all go through the "color matrix" shader path in GLRenderer and don't use Ganesh.
Another test: I resized the reference.png to be 256x256 so that the render pass and the tile quad would be the same size. I rebaselined all the tests. With that, the flakes disappeared. So, it does appear to be somewhat texture sampling related.
A final test: I took my patch and rebaselined all the tests with the results from desktop gl. I ran all the tests 10 times both in batches and singly and there were zero failures.
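The repeated-run check above can be sketched as a small helper that runs a test several times and labels it as passing, failing, or flaky. This is only an illustration: `classify` and `run_once` are hypothetical names, and `run_once` stands in for whatever actually launches the layout test and reports success.

```python
def classify(run_once, runs=10):
    """Run a test callable `runs` times and report 'pass', 'fail', or 'flaky'.

    `run_once` is a hypothetical callable that returns True when the
    test passes and False when it fails.
    """
    results = [bool(run_once()) for _ in range(runs)]
    if all(results):
        return "pass"
    if not any(results):
        return "fail"
    # A mix of passes and failures across identical runs means flaky.
    return "flaky"

# Example with a fake test that alternates between passing and failing.
outcomes = iter([True, False] * 5)
label = classify(lambda: next(outcomes), runs=10)
```

A test only counts as stable here if every run agrees, which matches the "zero failures in 10 runs" criterion used above.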
Therefore, so far as I can tell, this appears to be a texture sampling flakiness bug in mesa. T_T
Possible solutions:
(1) add whitespace to reference.png so that it's 256x256. (And hope in the future that this would be the tile size that a layer would choose for such a layer size.)
(2) add hella plumbing to turn off this optimization only for layout tests, somehow. (And hope that no other changes end up tickling this particular bug.)
(3) Fix Mesa. (Mostly kidding. I don't think it's worth the time.)
Thoughts?
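As a rough sketch of what option (1) amounts to, the helper below pads a pixel grid out to 256x256. It works on a plain list of rows rather than a real PNG, and the opaque white fill and the choice to pad on the right and bottom are assumptions; `pad_to_tile` is a hypothetical name.

```python
def pad_to_tile(pixels, tile=256, fill=(255, 255, 255, 255)):
    """Return a tile x tile copy of `pixels`, padded right and bottom with `fill`.

    `pixels` is a list of rows, each row a list of RGBA tuples.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    if height > tile or width > tile:
        raise ValueError("image is larger than the tile")
    # Extend each existing row to the tile width.
    padded = [list(row) + [fill] * (tile - width) for row in pixels]
    # Append blank rows until the tile height is reached.
    padded += [[fill] * tile for _ in range(tile - height)]
    return padded

small = [[(0, 0, 0, 255)] * 3 for _ in range(2)]
padded = pad_to_tile(small)
```

Padding, unlike scaling, leaves the original pixels untouched, so the existing image content keeps its 1:1 mapping once the image and the tile are the same size.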
I suspect it's because these tests are the only ones that create graphics layers (so don't just raster with Skia software), only have a color filter (so don't use Ganesh), and are not reference filters (which skip the optimization in my patch). There are a couple of others (grayscale?) that fit into that category but aren't flaky, and I'm not sure why.
Thanks so much for looking into this! It's definitely possible that Mesa is buggy, but generally it tends to be deterministic, rather than random, so I'm surprised at Mesa-induced flakiness.
There is a project underway to switch layout tests to use SwiftShader instead of Mesa, which should help to isolate the Mesa-specific problems. If this flakiness really is Mesa-specific, it should go away with the switch.
Comment 1 by bugdroid1@chromium.org, Jun 21 2016