False rejects caused by webgl_conformance_tests.
Issue description

Unrelated CL: https://chromium-review.googlesource.com/c/chromium/src/+/1381231/2
Build that failed: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/211883
Next build succeeded: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/211940
Failing test: gpu_tests.webgl_conformance_integration_test.WebGLConformanceIntegrationTest.WebglConformance_conformance_ogles_GL_not_not_001_to_004

In the period from Dec. 18 to Dec. 19, the top two sources of CQ false rejects due to flaky tests were:

"""
webgl_conformance_tests on Intel GPU on Mac (with patch) on Mac-10.12.6
webgl_conformance_tests on ATI GPU on Mac Retina (with patch) on Mac-10.13.6
"""

https://datastudio.google.com/c/reporting/12dYEpcepJ5_6ZOhprbd5GpDNooiUJONV/page/AYfX
[Modify the link to the relevant date range, and then specify Build Failure Type: TEST_FAILURE]

Note that we recently turned off "retry with patch" for WebGL conformance tests. As noted here: https://docs.google.com/document/d/1O9nzVMA6rEe2-rhjsni_8wS9Lw4OnFpK2NE0wWB5VaY/edit#heading=h.ef0u5ezii5be, retries exponentially reduce false rejects, and sublinearly increase the rate at which flakiness is introduced to tip of tree.

For now, let's let the GPU team deal with the flakiness in these test suites.
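To make the "retries exponentially reduce false rejects" claim concrete, here is a minimal sketch. It assumes each attempt of a flaky test fails independently with the same probability, which may not hold for real GPU flakes. The function name and the 5% flake rate are hypothetical and chosen only for illustration. The sketch does not model the sublinear increase in flakiness reaching tip of tree.

```python
def false_reject_probability(flake_rate: float, attempts: int) -> float:
    """Chance that a correct CL is rejected by one flaky test.

    Assumption (not from the linked doc): every attempt flakes
    independently with probability `flake_rate`, and the CL is rejected
    only if all `attempts` runs fail.
    """
    if not 0.0 <= flake_rate <= 1.0:
        raise ValueError("flake_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return flake_rate ** attempts


if __name__ == "__main__":
    # Hypothetical 5% flake rate: each extra attempt multiplies the
    # false-reject probability by another factor of 0.05.
    for attempts in (1, 2, 3):
        print(attempts, false_reject_probability(0.05, attempts))
```

Under these assumptions, going from no retries to one retry cuts the false-reject rate by a factor of 1/flake_rate. This is the trade-off being weighed against the reduced visibility of the flakes.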
Dec 20
Thanks for the update, Ken. For now, let's keep the current behavior [no retries].

> In my opinion it would have been a mistake to institute any more retries which would have made these failures less visible.

If we reach a state where the failures are equally visible [e.g. FindIt catches these errors even when we retry], then I think we should reconsider instituting retries. In that case, we'll still be filing bugs on the errors, but they won't cause a significant amount of false-rejects for the rest of Chrome devs.
Dec 20
I didn't want to interject into the prior discussion, but I would like to offer my vote against flaky retries. I often diagnose and report upstream errors that are not in my code but affect me. I'd much rather have those errors be visible and immediately problematic than have any kind of mode that allows people to ignore them. It makes a more convincing case for reverting problematic CLs when you can say "this breaks my CQ" than "this makes my CQ take an extra hour to detect flaky tests and sometimes makes it fail".
Dec 20
> I'd much rather have those errors be visible and immediately problematic than have any kind of mode that allows people to ignore them.

These should be immediately visible to the teams that care [e.g. your team, GPU team, etc.].

> that allows people to ignore them

There are hundreds of Chrome devs. Most of them would like to not be impacted when flakiness is introduced into a test suite that they know nothing about.

> It makes a more convincing case for reverting problematic CLs when you can say "this breaks my CQ" than "this makes my CQ take an extra hour to detect flaky tests and sometimes makes it fail".

That's a policy change we can make. Please see https://docs.google.com/document/d/1O9nzVMA6rEe2-rhjsni_8wS9Lw4OnFpK2NE0wWB5VaY/edit#heading=h.ef0u5ezii5be for more details.
Dec 20
> That's a policy change we can make.

It is, but I expect it won't be the popular option. Most devs don't seem to want flakes to block the CQ. It would probably be more popular if we could show that the flakes are new and that people will actively triage and resolve them. Both things are probably true for the GPU test suites, but not true generally for test suites elsewhere. (This is, of course, something that we're hoping to change, and eventually get to the point where we are dealing w/ flakes better everywhere).
Comment 1 by kbr@chromium.org, Dec 19
Cc: chanli@chromium.org st...@chromium.org
Components: -Internals>GPU Tools>Test>FindIt Blink>WebGL Internals>GPU>Internals
Mergedinto: 916544
Owner: kbr@chromium.org
Status: Duplicate (was: Untriaged)