Reduce flakiness of webkit_layout_test retries.
Issue description

Summary: webkit_layout_test retries are much more likely to fail because the driver sends "#READY" before it is actually ready.

There are ~80,000 tests in webkit_layout_tests. Some of these are flaky. When a flaky test fails, it is exceedingly likely to fail on retries. Let's look at an example: http/tests/devtools/console-xhr-logging.js.

In this CQ run [CL is unrelated]:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/89223

The test fails during a run of the full test suite. It is subsequently retried 3 times [in isolation]. Each of the retries fails. The recipe then rolls tip of tree, rebuilds, recompiles, and starts a new swarming job, which only runs this test. The test once again fails.

The test passes when the same CL is retried on the CQ:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/89261

This pattern is very common -- for two more examples, see https://crbug.com/888734#c1. This suggests that tests are much more likely to fail when run in isolation.

The overall structure of run_web_tests.py has the following format.

    tests_to_run = FindListOfTestsToRun()
    sharded_tests_to_run = Shard(tests_to_run)
    failures = []
    for shard in sharded_tests_to_run:
        driver = StartDriver()
        for test in shard:
            success = driver.Run(test)
            if not success:
                failures.append(test)
        driver.Stop()
    for test in failures:
        for retry in range(retry_count):
            driver = StartDriver()
            driver.Run(test)
            driver.Stop()

We see that tests run in isolation and retried failures share one common feature: they are run immediately after starting the Driver(). This suggests that there are non-deterministic races that occur when first starting the test Driver. Let's look at how that happens:

* driver.py() starts the binary and waits for #READY. It then sends data down to the driver.
* The driver sends #READY very early in the app lifecycle -- the run loop has not been run a single time! This means that many singletons have not been created, and tasks posted by earlier initialization steps have never had a chance to run.
* The driver starts running tests immediately after sending #READY -- again, before initialization has actually finished!

I'm going to change the driver so that #READY is sent at a more reasonable time. I expect this to significantly reduce failures on retries and when tests are run in isolation.
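To make the observation about driver start-up concrete, here is a minimal Python sketch of the loop structure described above. FakeDriver and run_all are hypothetical names used only for illustration; this is not the real run_web_tests.py code. The sketch records, for every test, whether it was the first test run on a freshly started driver.

```python
class FakeDriver:
    """Hypothetical stand-in for the real driver; it only records ordering."""

    def __init__(self, log):
        self.log = log
        self.tests_run = 0

    def run(self, test):
        # Record whether this test ran immediately after driver start.
        self.log.append((test, self.tests_run == 0))
        self.tests_run += 1
        # Assumption for the sketch: tests named "flaky*" fail on the first pass.
        return not test.startswith("flaky")


def run_all(shards, retry_count=3):
    """Return (first_pass_log, retry_log) as lists of (test, ran_first)."""
    log = []
    failures = []
    for shard in shards:
        driver = FakeDriver(log)  # one driver per shard, reused across tests
        for test in shard:
            if not driver.run(test):
                failures.append(test)
    first_pass_len = len(log)
    for test in failures:
        for _ in range(retry_count):
            driver = FakeDriver(log)  # a fresh driver for every retry
            driver.run(test)
    return log[:first_pass_len], log[first_pass_len:]
```

In this model every retry entry has ran_first set to True, while in the first pass only the first test of each shard does. That matches the observation that retries and isolated runs are exactly the cases exposed to start-up races.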
,
Sep 25
Great find!
,
Sep 26
CL is up: https://chromium-review.googlesource.com/c/chromium/src/+/1244016

The CL causes some webkit_layout_tests to fail. These appear to be actual bugs in the layout tests themselves. For example, I can now reproduce deterministic layout test failures using a tip of tree build of blink_tests.

"""
$ cat ./test_list
html/details_summary/details-add-child-1.html
external/wpt/cookies/samesite/form-post-blank.html
$ third_party/blink/tools/run_web_tests.py -t gn --driver-logging --test-list=./test_list -j 1 --order=none
"""

external/wpt/cookies/samesite/form-post-blank.html always fails when run after html/details_summary/details-add-child-1.html. See https://chromium-review.googlesource.com/c/chromium/src/+/1244016#message-ee9411d622b5d76beaa1d59a98f6a067b765d2b3 for more details.
,
Sep 26
,
Sep 26
,
Sep 28
I've been trying to land https://chromium-review.googlesource.com/c/chromium/src/+/1246723

Unfortunately, it is exposing that many tests depend on the non-deterministic ordering of IPCs during the early lifecycle of the content shell -- see https://bugs.chromium.org/p/chromium/issues/detail?id=889952#c5.
,
Sep 28
You should feel fully empowered to disable tests if that is what it takes to land this change.
,
Oct 2
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/026da0aaf6c13516addf2fb7744d501919f1dac9

commit 026da0aaf6c13516addf2fb7744d501919f1dac9
Author: erikchen <erikchen@chromium.org>
Date: Tue Oct 02 19:28:34 2018

Uniform initialization for WebContents in layout tests.

This CL begins the process of removing initialization races from Blink layout tests. Prior to this CL, a newly created main_window_ used a different navigation mechanism than a reused main_window_. The latter would use a navigation with ui::PAGE_TRANSITION_LINK [with an already constructed renderer]. The former would spawn a renderer during the ui::PAGE_TRANSITION_TYPED navigation to the layout test URL. This caused races between IPCs that occur during renderer construction, and IPCs that occur as a result of loading the layout test.

This CL doesn't fix these races yet -- it simply converts initialization to use the same mechanism for both cases [using ui::PAGE_TRANSITION_LINK with an already constructed renderer].

Change-Id: I5a21bfdd9a47425f1eb9579235da5ffb4fbc19f7
Bug: 889036, 889952
Reviewed-on: https://chromium-review.googlesource.com/1257683
Reviewed-by: Avi Drissman <avi@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#595944}

[modify] https://crrev.com/026da0aaf6c13516addf2fb7744d501919f1dac9/content/shell/browser/layout_test/blink_test_controller.cc
,
Oct 3
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/0908efa5c766779e14e3d0a5472b9d0058ce9e85

commit 0908efa5c766779e14e3d0a5472b9d0058ce9e85
Author: erikchen <erikchen@chromium.org>
Date: Wed Oct 03 19:32:13 2018

Flush WidgetInputHandler messages before running layout tests.

Layout test initialization is non-deterministic. WidgetInputHandler::OnFocus() races against the contents of the layout test, which may also attempt to set/unset focus -- the ordering between the two is non-deterministic. This CL adds a FlushForTesting() message to WidgetInputHandler to make the ordering explicit.

Bug: 889036, 889952
Change-Id: I26adca82915e75a9941c93b60a555f0c16084014
Reviewed-on: https://chromium-review.googlesource.com/c/1255782
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Reviewed-by: Avi Drissman <avi@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#596321}

[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/browser/renderer_host/render_widget_host_impl.cc
[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/browser/renderer_host/render_widget_host_impl.h
[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/common/input/input_handler.mojom
[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/public/browser/render_widget_host.h
[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/renderer/input/widget_input_handler_impl.cc
[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/renderer/input/widget_input_handler_impl.h
[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/shell/browser/layout_test/blink_test_controller.cc
[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/test/mock_widget_input_handler.cc
[modify] https://crrev.com/0908efa5c766779e14e3d0a5472b9d0058ce9e85/content/test/mock_widget_input_handler.h
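The idea of flushing before starting the test can be illustrated with a toy model. The sketch below is not the Mojo API; FakePipe and its methods are hypothetical. It assumes a single in-order message queue and shows why a flush marker sent on the same queue guarantees that OnFocus is handled before the test URL is loaded.

```python
from collections import deque


class FakePipe:
    """Hypothetical in-order message pipe; a toy model, not real Mojo."""

    def __init__(self):
        self._queue = deque()
        self.handled = []

    def send(self, name):
        self._queue.append(("msg", name))

    def flush(self, on_done):
        # The flush marker travels on the same FIFO queue as ordinary
        # messages, so its callback runs only after everything sent earlier.
        self._queue.append(("flush", on_done))

    def pump(self):
        # Deliver queued items in order until the queue is empty.
        while self._queue:
            kind, payload = self._queue.popleft()
            if kind == "msg":
                self.handled.append(payload)
            else:
                payload()


def start_test_after_flush():
    pipe = FakePipe()
    pipe.send("OnFocus")
    # Only load the test once the flush has been acknowledged.
    pipe.flush(lambda: pipe.send("LoadTestURL"))
    pipe.pump()
    return pipe.handled
```

In this model the handled order is always OnFocus followed by LoadTestURL, because the load is not even queued until the flush callback fires.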
,
Oct 3
,
Oct 4
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/6764c9cf066c5f69f28a4bf39d81dabde12f13a8

commit 6764c9cf066c5f69f28a4bf39d81dabde12f13a8
Author: erikchen <erikchen@chromium.org>
Date: Thu Oct 04 19:17:00 2018

Reduce non-determinism in blink layout tests.

Explicit calls to focus/activate after loading the URL will cause races with the layout test, which may also modify the focus/activation of the web page. This CL changes the logic so that the layout test runner first focuses the RenderWidgetHost, waits for the state to synchronize with the renderer, and then proceeds to load the test URL.

This CL only targets the fresh content::Shell case. Reducing non-determinism during reuse of content::Shells will happen in a future CL.

Change-Id: Id64771f5bd86e8784c1ebd92df2b337454642c21
Bug: 889036
Reviewed-on: https://chromium-review.googlesource.com/c/1259724
Reviewed-by: Avi Drissman <avi@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#596798}

[modify] https://crrev.com/6764c9cf066c5f69f28a4bf39d81dabde12f13a8/content/shell/browser/layout_test/blink_test_controller.cc
,
Oct 4
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/9066671848a61b27b85e353cf113da9eba8ad56f

commit 9066671848a61b27b85e353cf113da9eba8ad56f
Author: Erik Chen <erikchen@chromium.org>
Date: Thu Oct 04 20:06:47 2018

Reduce non-determinism for content::Shell reuse in layout tests.

Previously, focusing the RenderWidgetHost would race with loading the test URL. This CL first focuses the RenderWidgetHost, waits for the state to synchronize with the renderer, and then loads the test URL.

Change-Id: I6e78d14d65a8a1f3f8e7743d3b805fba878595f8
Bug: 889036
Reviewed-on: https://chromium-review.googlesource.com/c/1262116
Commit-Queue: Erik Chen <erikchen@chromium.org>
Reviewed-by: Avi Drissman <avi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#596818}

[modify] https://crrev.com/9066671848a61b27b85e353cf113da9eba8ad56f/content/shell/browser/layout_test/blink_test_controller.cc
,
Oct 4
I have a CL which resets content shell between each test run: https://chromium-review.googlesource.com/c/chromium/src/+/1259722 It passes all tests, but I don't plan on landing it right now because it makes the tests take 2-2.5X longer. + jochen, dcheng -- we should discuss long term strategy here in more detail at plaintext. I'm hoping that all my changes will reduce flakiness enough that we can continue to reuse renderers for faster layout tests.
,
Oct 4
Erik: what if we create 3 tabs at once, then use the next tab for the next test, and kill and refresh the first tab while the test is running in the 2nd tab? Basically, with this I hope we can create a new tab in parallel with the running tests, to save the cost of instantiating a new render process. I don't know much about the content shell, so I apologize in advance if this idea sounds silly.
,
Oct 4
I like the core idea of increasing parallelism, but we'll need to refine the approach a bit. Layout tests need to be in the foreground tab to function properly. We could theoretically create multiple windows to simultaneously run different layout tests...but that could interfere with tests that require focus, activation, fullscreen, etc.

We currently achieve parallelism by spawning multiple drivers per swarming task -- see the "--jobs" flag. This has comparable load on the physical device, but avoids state leaking between tests running in parallel. We can probably make this more efficient. If you want to discuss further -- please spawn a new crbug. I'd like to keep this crbug focused on flakiness reductions.
,
Oct 4
Erik: I don't mean to run tests in parallel. I mean we run tests and create new renderer processes in parallel, so the cost of renderer creation doesn't add to the tests' wall time.
,
Oct 4
Oh! I misunderstood. Yes, that could work. Note that I expect that we're CPU constrained on each swarming task [we're already running many jobs in parallel], so if we're doing the same amount of work, I would expect the total time to be similar. That being said, it's worth trying out if we decide it's necessary to restart content::Shell between test runs.
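For reference, the pre-warming idea discussed above could be sketched roughly as follows. This is only an illustration under assumptions: run_with_prewarm, create_shell and run_test are hypothetical names, and the real content shell lifecycle is not modelled. While test N runs on the current shell, a background worker is already creating the shell for test N+1.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_prewarm(tests, create_shell, run_test):
    """Run each test on a fresh shell, creating the next shell in the background.

    create_shell() and run_test(shell, test) are caller-supplied callables
    (hypothetical hooks for this sketch).
    """
    if not tests:
        return []
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        next_shell = pool.submit(create_shell)
        for index, test in enumerate(tests):
            shell = next_shell.result()  # wait for the pre-warmed shell
            if index + 1 < len(tests):
                # Start creating the following shell while this test runs.
                next_shell = pool.submit(create_shell)
            results.append(run_test(shell, test))
    return results
```

As noted in the comment above, if the machine is already CPU constrained this overlaps work rather than removing it, so the total time may end up similar.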
,
Oct 4
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/b6e7beea57f50829ad8c8948a2513cf47c889a0f

commit b6e7beea57f50829ad8c8948a2513cf47c889a0f
Author: erikchen <erikchen@chromium.org>
Date: Thu Oct 04 21:35:30 2018

Flush test control interface before starting layout test.

The test control is responsible for synchronizing certain renderer state (such as Focus). The test runner must wait for this state to synchronize before launching the layout test, as otherwise there will be races.

Change-Id: I6a01ae66ff4b6040184b86035576a32968010499
Bug: 889036
Reviewed-on: https://chromium-review.googlesource.com/c/1262375
Reviewed-by: Avi Drissman <avi@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#596863}

[modify] https://crrev.com/b6e7beea57f50829ad8c8948a2513cf47c889a0f/content/shell/browser/layout_test/blink_test_controller.cc
[modify] https://crrev.com/b6e7beea57f50829ad8c8948a2513cf47c889a0f/content/shell/common/layout_test.mojom
[modify] https://crrev.com/b6e7beea57f50829ad8c8948a2513cf47c889a0f/content/shell/renderer/layout_test/layout_test_render_frame_observer.cc
[modify] https://crrev.com/b6e7beea57f50829ad8c8948a2513cf47c889a0f/content/shell/renderer/layout_test/layout_test_render_frame_observer.h
,
Oct 6
> We could theoretically create multiple windows to simultaneously run different
> layout tests...but that could interfere with tests that require focus,
> activation, fullscreen, etc.

Do you mean multiple windows per content_shell? We currently do create multiple windows, one for each job/child process.

We already run the bots under a lot of memory (and CPU) pressure, and I expect that that alone is contributing to flakiness. I would be pretty reluctant to add even more processes running concurrently on the machine.
,
Oct 10
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/10175580ca11afa28314a5851c2868d9d9906084

commit 10175580ca11afa28314a5851c2868d9d9906084
Author: Daniel Cheng <dcheng@chromium.org>
Date: Wed Oct 10 20:50:08 2018

Use the builtin FlushForTesting rather than defining new Mojo messages.

Bug: 889036, 889952
Change-Id: If4f52524e2378589c457f7bb0b41511d8356c05b
Reviewed-on: https://chromium-review.googlesource.com/c/1272024
Reviewed-by: Dmitry Gozman <dgozman@chromium.org>
Reviewed-by: Erik Chen <erikchen@chromium.org>
Commit-Queue: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#598491}

[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/browser/renderer_host/render_widget_host_impl.cc
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/browser/renderer_host/render_widget_host_impl.h
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/common/input/input_handler.mojom
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/public/browser/render_widget_host.h
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/renderer/input/widget_input_handler_impl.cc
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/renderer/input/widget_input_handler_impl.h
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/shell/browser/layout_test/blink_test_controller.cc
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/shell/browser/layout_test/blink_test_controller.h
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/shell/common/layout_test.mojom
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/shell/renderer/layout_test/layout_test_render_frame_observer.cc
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/shell/renderer/layout_test/layout_test_render_frame_observer.h
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/test/mock_widget_input_handler.cc
[modify] https://crrev.com/10175580ca11afa28314a5851c2868d9d9906084/content/test/mock_widget_input_handler.h
,
Oct 16
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/5524af6018c603cd943c8f76ade1e407a5459ad7

commit 5524af6018c603cd943c8f76ade1e407a5459ad7
Author: erikchen <erikchen@chromium.org>
Date: Tue Oct 16 19:21:27 2018

Add functionality to restart content shell between test runs.

Many layout tests will flakily fail when run in a clean content shell, but will deterministically fail or pass when run in a reused content shell. Unfortunately, we cannot restart content shell between all tests because that significantly increases run time. [2X on Windows, 3X on Linux, 5X on macOS].

This CL adds the functionality to restart content shell between all test runs, but then only enables it for tests with --gtest_repeat or --repeat-each with a repeat count > 1.

Effects:
* For a local developer, this will only have an effect when using --gtest_repeat or --repeat-each. In this case, the developer is already trying to tease out some type of non-determinism between tests, and it's important to use a clean content shell. With this CL, flaky tests can be easily reproduced with --gtest_repeat, whereas they could not before. See https://bugs.chromium.org/p/chromium/issues/detail?id=894527#c5 for examples.
* On the CQ, this will only have an effect on 'retry without patch' and 'retry with patch', which retry failing tests looking for flakiness. This will help the CQ better detect tests that are flaky on ToT to reduce false rejects.

Change-Id: I4c04e382e733f8e1b40f4a6dde78292a41126a1b
Bug: 875884, 889036
Reviewed-on: https://chromium-review.googlesource.com/c/1273810
Reviewed-by: Aleks Totic <atotic@chromium.org>
Reviewed-by: Emil A Eklund <eae@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Reviewed-by: Avi Drissman <avi@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#600077}

[modify] https://crrev.com/5524af6018c603cd943c8f76ade1e407a5459ad7/content/shell/browser/layout_test/blink_test_controller.cc
[modify] https://crrev.com/5524af6018c603cd943c8f76ade1e407a5459ad7/content/shell/common/layout_test/layout_test_switches.cc
[modify] https://crrev.com/5524af6018c603cd943c8f76ade1e407a5459ad7/content/shell/common/layout_test/layout_test_switches.h
[modify] https://crrev.com/5524af6018c603cd943c8f76ade1e407a5459ad7/third_party/blink/tools/blinkpy/web_tests/port/base.py
[modify] https://crrev.com/5524af6018c603cd943c8f76ade1e407a5459ad7/third_party/blink/tools/blinkpy/web_tests/run_webkit_tests.py
,
Oct 18
dpranke from https://chromium-review.googlesource.com/c/chromium/src/+/1273810#message-6fb4cdeb0abfee75b15f107d6f9fe4b4f105a09a
"""
What's the relative performance difference between --reset-shell-between-tests and completely restarting content_shell? Maybe the latter isn't that much slower, and we should only do that?
"""

Super rough metrics from 1 CQ run of all tests:
"""
ToT. Usually reuses content shell + renderer between test runs.
Max shard duration:
linux: 14
win7: 17
macOS: 16

Restart content shell between every test run.
Max shard duration:
linux: 23
win7: 31
macOS: 63

Restart renderer between every test run.
Max shard duration:
linux: 30
win7: 30
macOS: 78
"""

This suggests that content shell restart has comparable performance to renderer restart, and is simpler. I'll migrate over to that mechanism.

I also plan to add support for content shell restart between retries [when running all tests]. This will reduce the impact of flakiness on false rejects.
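One way to read these numbers is as a slowdown relative to ToT. The sketch below only performs that division on the figures quoted above; the dictionary and function names are illustrative, and the durations are in whatever unit the original measurement used.

```python
# Max shard durations as quoted in the comment above (single CQ run).
BASELINE = {"linux": 14, "win7": 17, "macos": 16}
RESTART_SHELL = {"linux": 23, "win7": 31, "macos": 63}
RESTART_RENDERER = {"linux": 30, "win7": 30, "macos": 78}


def slowdown(variant, baseline=BASELINE):
    """Return the per-platform ratio of variant duration to baseline duration."""
    return {platform: variant[platform] / baseline[platform] for platform in baseline}


if __name__ == "__main__":
    for label, variant in (("restart shell", RESTART_SHELL),
                           ("restart renderer", RESTART_RENDERER)):
        ratios = slowdown(variant)
        print(label, {p: f"{r:.2f}x" for p, r in ratios.items()})
```

On these figures, restarting content shell costs roughly 1.6x on linux, 1.8x on win7 and 3.9x on macOS, while restarting the renderer costs roughly 2.1x, 1.8x and 4.9x. These are single-run numbers, so they are only a rough guide.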
,
Oct 18
Interesting that MacOS is so much worse than Windows; I would've expected the opposite. Any theories as to why that might be?
,
Oct 23
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/d3fdbbf10ab38c1bd81ebbe12aab5cd374bb2a81

commit d3fdbbf10ab38c1bd81ebbe12aab5cd374bb2a81
Author: erikchen <erikchen@chromium.org>
Date: Tue Oct 23 16:55:14 2018

Restart content shell before test retries when running all tests.

If we're retrying a test, then it's because we think it might be flaky and rerunning it might provide a different result. Restarting content shell prevents state from leaking from previous tests.

This CL does not apply to '--gtest_repeat'. That will be modified to use this same restart mechanism in a future CL.

This CL slightly changes the implementation of "batch-size". Previously, the implementation relied on the assumption that "batch-size" never changed and restarted content shell after completion of the test. This CL makes "batch-size" a variable. In order to handle the case where "batch-size" is reduced [e.g. from 5 to 1, when 2 tests have been run in the current batch], this CL moves the restart of content shell to occur before the test is run, rather than after.

Bug: 889036
Change-Id: Ibe28474033ea1fb1e67b9c4b8320bdefe95331ad
Reviewed-on: https://chromium-review.googlesource.com/c/1288974
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#601989}

[modify] https://crrev.com/d3fdbbf10ab38c1bd81ebbe12aab5cd374bb2a81/third_party/blink/tools/blinkpy/web_tests/controllers/layout_test_runner.py
[modify] https://crrev.com/d3fdbbf10ab38c1bd81ebbe12aab5cd374bb2a81/third_party/blink/tools/blinkpy/web_tests/controllers/single_test_runner.py
[modify] https://crrev.com/d3fdbbf10ab38c1bd81ebbe12aab5cd374bb2a81/third_party/blink/tools/blinkpy/web_tests/port/driver.py
[modify] https://crrev.com/d3fdbbf10ab38c1bd81ebbe12aab5cd374bb2a81/third_party/blink/tools/blinkpy/web_tests/port/test.py
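The "batch-size" change described in this commit can be sketched as follows. This is a simplified model, not the actual driver.py code: BatchingRunner and its attributes are hypothetical. It shows why checking the limit before a test runs lets a reduced batch size take effect immediately.

```python
class BatchingRunner:
    """Hypothetical model of restart-before-test batching."""

    def __init__(self, batch_size):
        self.batch_size = batch_size  # may be changed between tests
        self.tests_in_batch = 0
        self.events = []              # records "restart" markers and test names

    def run(self, test):
        # Check the limit before running, so that lowering batch_size
        # mid-batch triggers a restart ahead of the very next test.
        if self.batch_size and self.tests_in_batch >= self.batch_size:
            self.events.append("restart")
            self.tests_in_batch = 0
        self.events.append(test)
        self.tests_in_batch += 1


def example():
    runner = BatchingRunner(batch_size=5)
    runner.run("t1")
    runner.run("t2")
    runner.batch_size = 1  # e.g. switching to retries, as in the commit message
    runner.run("t3")
    runner.run("t4")
    return runner.events
```

With the example from the commit message (batch size reduced from 5 to 1 after 2 tests), the model restarts before t3 and again before t4. A check placed after each test, comparing against the old limit, would have let t3 run in the reused shell.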
,
Oct 30
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/build/+/b85d6c559797ed37a9586f17e9c880f896611ae9

commit b85d6c559797ed37a9586f17e9c880f896611ae9
Author: erikchen <erikchen@chromium.org>
Date: Tue Oct 30 14:38:16 2018

Repeat gtests and webkit_layout_tests 10 times in 'retry with patch'.

This will reduce false rejects caused by flakiness in test suites. This is particularly important for webkit_layout_tests, as it uses a slower [but more reliable] retry mechanism when --gtest_repeat is set.

Bug: 889036
Change-Id: I64d66ecd7f3123b6fd37a059bcc23f9ea995ed0e
Reviewed-on: https://chromium-review.googlesource.com/c/1294489
Auto-Submit: Erik Chen <erikchen@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: John Budorick <jbudorick@chromium.org>

[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipes/chromium_trybot.expected/invalid_results.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.expected/nonzero_exit_code_no_gtest_output.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.expected/enable_retry_with_patch_invalid_test_results.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipes/chromium_trybot.expected/swarmed_webkit_tests_interrupted.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/steps.py
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipes/chromium_trybot.expected/swarmed_webkit_tests_unexpected_error.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.expected/disable_deapply_patch_affected_files.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/tests/steps/local_gtest_test.expected/android.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/tests/steps/local_gtest_test.expected/retry.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.expected/enable_retry_with_patch_recipes.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipes/chromium_trybot.expected/swarming_test_failure.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/tests/steps/local_gtest_test.expected/basic.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipe_modules/chromium_tests/tests/api/run_tests_on_tryserver.expected/basic.json
[modify] https://crrev.com/b85d6c559797ed37a9586f17e9c880f896611ae9/scripts/slave/recipes/chromium_trybot.expected/swarmed_layout_tests_too_many_failures_for_retcode.json
,
Nov 26
According to go/top-cq-flakes, webkit_layout_tests failures are no longer very flaky. If a build fails with a webkit_layout_test failure, the retry almost always also has a webkit_layout_test failure. webkit_layout_tests (with patch) is ranked at #12 [down from top 3], and webkit_layout_tests (retry with patch) isn't present at all.

In the last week, there was a single-digit number of instances where webkit_layout_tests failures passed on retry. I have not yet investigated all of them. The first two I looked at had the same cause: a test was recently enabled which flakes <1% of the time, but fails 100% of the time when run standalone. When we retry the whole build, the test passes since it's run in a batch and not standalone. I'm disabling that test. https://bugs.chromium.org/p/chromium/issues/detail?id=908517#c3

Overall, I'm quite happy with the low rate of test suite flakiness in webkit_layout_tests. Individual tests are still flaky, but they no longer cause CLs to be incorrectly marked as failures. I'm closing this bug.
,
Nov 26
Woo!
Comment 1 by dpranke@chromium.org, Sep 25