Layout tests seem more flaky with site-per-process
Issue description

This is a follow-up to the issues raised in https://groups.google.com/a/chromium.org/d/topic/chromium-dev/cIycVUIowzU/discussion

I looked at the flakiness dashboard for the site_per_process_webkit_layout_tests step:
1) https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=site_per_process_webkit_layout_tests&sortColumn=slowest
and compared it with the flakiness dashboard for webkit_layout_tests:
2) https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=webkit_layout_tests&showFlaky=true&builder=chromium.linux%3ALinux%20Tests

AFAICT, the dashboard reports 3958 flaky tests with site-per-process and only 185 flaky tests without site-per-process - please see the attached results.
,
Aug 16
Many tests seem to be slow (and possibly falling off a timeout cliff with site-per-process?):
$ cat ~/scratch/layout-tests-flaky-with-site-per-process | grep '[[:space:]][0-9][0-9]s$' | wc -l
84
$ cat ~/scratch/layout-tests-flaky-with-site-per-process | grep '[[:space:]][5-9]s$' | wc -l
617
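The same bucket counts can be reproduced without grep. This is only a minimal sketch, assuming each line of the saved dashboard dump ends with whitespace followed by an integer number of seconds and `s` (which is what the grep patterns above imply). The sample lines and the `count_by_slowest_run` helper are hypothetical, not part of any Chromium tooling:

```python
import re

# Assumed line shape: "<test name><whitespace><seconds>s", inferred from the
# grep patterns in the comment above. The sample data below is made up.
SLOWEST_RUN = re.compile(r"\s(\d+)s$")


def count_by_slowest_run(lines, low, high=None):
    """Count lines whose trailing slowest-run value lies in [low, high]."""
    count = 0
    for line in lines:
        match = SLOWEST_RUN.search(line.rstrip("\n"))
        if match is None:
            continue  # Line does not end in a "<N>s" duration.
        seconds = int(match.group(1))
        if seconds >= low and (high is None or seconds <= high):
            count += 1
    return count


sample = [
    "fast/dom/a.html\t12s",
    "http/tests/b.html\t7s",
    "external/wpt/c.html\t2s",
    "external/wpt/d.html\t5s",
]
two_digit = count_by_slowest_run(sample, 10, 99)   # roughly [0-9][0-9]s$
five_to_nine = count_by_slowest_run(sample, 5, 9)  # roughly [5-9]s$
print(two_digit, five_to_nine)
```

Using numeric bounds instead of character classes makes it easy to try other cutoffs, for example `count_by_slowest_run(lines, 3)` for everything at or above 3 seconds.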
,
Aug 16
FWIW, I assume that I am comparing apples-to-apples here (e.g. despite the fact that the first URL doesn't explicitly specify builder= or showFlaky=, both URLs open dashboards that seem to be restricted to 1) "chromium.linux:Linux Tests" waterfall bot and 2) flaky tests).
,
Aug 16
+stgao / liaoyuke - you should be able to confirm this pretty easily with infra data, right?

Back In The Day, I had a rule of thumb that any test that took longer than 1 second to run should probably be marked as Slow, because the variance in test times we'd see on the bots might lead to some timeouts. (And this was definitely true for things slower than 2-3 seconds).

I expect that we've had a lot of tests either get added or get slower where we haven't done this, and it's possible that site-per-process takes an already oversubscribed bot (due to running too many content_shells in parallel) and *way* oversubscribes it, making things too slow. We could test this by either increasing the timeout values on the tests, or reducing the amount of parallelism (with --jobs values < 8) and seeing if that helped.
,
Aug 16
Below are data for flaky tests that caused retries of CQ builds/attempts from May 1 to July 31. I haven't checked those hidden flakes yet -- the tests passed after 2 retries. And I haven't checked the data for Aug yet.

Based on the data, my preliminary conclusion is: site_per_process_webkit_layout_tests has fewer flake occurrences, but more tests of low flakiness.

1. No special filtering:
test_target                           total_flake_occurrences  distinct_flaky_tests
webkit_layout_tests                   6242                     1791
site_per_process_webkit_layout_tests  4322                     3034

2. Ignore tests that had only one flake occurrence:
test_target                           total_flake_occurrences  distinct_flaky_tests
webkit_layout_tests                   5308                     857
site_per_process_webkit_layout_tests  1619                     331

3. Ignore tests that had more than one flake occurrence:
test_target                           total_flake_occurrences  distinct_flaky_tests
site_per_process_webkit_layout_tests  2703                     2703
webkit_layout_tests                   934                      934

For those who would like to play with the data, here is the query to start with. I can handle any permission issues there.
https://pantheon.corp.google.com/bigquery?project=findit-for-me&folder&organizationId=433637338589&j=bquxjob_2f70f9c5_16540abfd9b&page=queryresults
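As a sanity check on the three breakdowns: subtracting the single-occurrence tests (breakdown 3) from the unfiltered counts (breakdown 1) should leave the tests with more than one occurrence (breakdown 2). A small sketch of that arithmetic, using only the numbers quoted above; the `multi_occurrence` name is just for illustration:

```python
# Counts copied from breakdowns 1 and 3 above, stored as
# (total_flake_occurrences, distinct_flaky_tests).
unfiltered = {
    "webkit_layout_tests": (6242, 1791),
    "site_per_process_webkit_layout_tests": (4322, 3034),
}
single_occurrence = {
    "webkit_layout_tests": (934, 934),
    "site_per_process_webkit_layout_tests": (2703, 2703),
}


def multi_occurrence(target):
    """Derive (total, distinct) for tests that flaked more than once."""
    total, distinct = unfiltered[target]
    single_total, single_distinct = single_occurrence[target]
    return total - single_total, distinct - single_distinct


for target in unfiltered:
    total, distinct = multi_occurrence(target)
    print(f"{target}: {total} occurrences across {distinct} tests "
          f"({total / distinct:.1f} per test)")
```

For each target the derived pair is 5308 occurrences over 857 tests without site-per-process and 1619 occurrences over 331 tests with it, which is consistent with the preliminary conclusion: site-per-process has many more tests that flaked exactly once, but fewer repeat offenders.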
,
Aug 16
A handful of tests with slowest_run >= 10s are already present in third_party/WebKit/LayoutTests/SlowTests, but most aren't - let me put together a CL that fixes this.

# Missing from SlowTests:
$ for i in `cat ~/scratch/layout-tests-flaky-with-site-per-process | grep '[[:space:]][0-9][0-9]s$' | cut -f 1`; do if ! grep -q "$i" third_party/WebKit/LayoutTests/SlowTests; then echo $i; fi; done | wc -l
72

# Already present in SlowTests:
$ for i in `cat ~/scratch/layout-tests-flaky-with-site-per-process | grep '[[:space:]][0-9][0-9]s$' | cut -f 1`; do if grep -q "$i" third_party/WebKit/LayoutTests/SlowTests; then echo $i; fi; done | wc -l
12

WIP CL: https://crrev.com/c/1178226
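The two loops above can also be expressed as a single pass that splits the slow tests into "missing" and "already present". This is a rough sketch under stated assumptions: `split_by_slowtests` is a hypothetical helper, the sample SlowTests line is made up, and a plain substring check only approximates `grep -q "$i"` (grep treats the test name as a regular expression, so `.` matches any character there):

```python
def split_by_slowtests(test_names, slowtests_text):
    """Return (missing, present) relative to the SlowTests file contents.

    Uses a plain substring check, so "a/b.html" also counts as present
    when only "x/a/b.html" is listed, much like an unanchored grep.
    """
    missing, present = [], []
    for name in test_names:
        if name in slowtests_text:
            present.append(name)
        else:
            missing.append(name)
    return missing, present


# Hypothetical data; the real inputs would be the first column of the
# dashboard dump and third_party/WebKit/LayoutTests/SlowTests.
slowtests_text = "crbug.com/123 fast/dom/a.html [ Slow ]\n"
missing, present = split_by_slowtests(
    ["fast/dom/a.html", "http/tests/b.html"], slowtests_text)
print(len(missing), len(present))
```

Reading the SlowTests file once and checking every name against it avoids spawning one grep per test, which matters little for 84 names but helps if the cutoff is lowered.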
,
Aug 16
We should think about automating keeping SlowTests up to date, just like we have w/ flaky tests.
,
Aug 17
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/81862268bd1fca446fcad20865eaa9ab64fad14f

commit 81862268bd1fca446fcad20865eaa9ab64fad14f
Author: Lukasz Anforowicz <lukasza@chromium.org>
Date: Fri Aug 17 16:01:51 2018

Use flakiness dashboard snapshot to identify layout tests slower than 3s

This CL has been put together by
1. Going to flakiness dashboard for site_per_process_webkit_layout_tests (see the links at the top of https://crbug.com/874695)
2. Grabbing flaky tests with slowest_run >= 3 and adding them to SlowTests (unless they've been already present)

Bug: 874695
Change-Id: I96c7befd3a654b1be8291921c51d539b5f6fbfb8
Reviewed-on: https://chromium-review.googlesource.com/1178226
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Commit-Queue: Łukasz Anforowicz <lukasza@chromium.org>
Cr-Commit-Position: refs/heads/master@{#584088}

[modify] https://crrev.com/81862268bd1fca446fcad20865eaa9ab64fad14f/third_party/WebKit/LayoutTests/SlowTests
,
Aug 17
Status update:
- I am still struggling with https://crrev.com/c/1178465 - having a test in SlowTests is not sufficient to get rid of flaky timeouts. I've opened issue 875430 to track this aspect of the problem.
- I've opened issue 875419 to follow up on dpranke@'s suggestion from #c8 to automate updating LayoutTests/SlowTests.
,
Aug 20
One source of flakiness has been identified in issue 834185. Let's revisit after this issue gets fixed.
Comment 1 by lukasza@chromium.org, Aug 16
$ cat ~/scratch/layout-tests-flaky-with-site-per-process | cut -f 1 | cut -f 1-2 -d '/' | sort | uniq -c | sort -n | tail -10
     17 virtual/layout_ng
     23 css3/filters
     48 virtual/gpu
     56 virtual/video-surface-layer
    138 virtual/threaded
    158 virtual/outofblink-cors
    167 virtual/outofblink-cors-ns
    307 virtual/layout_ng_experimental
    736 http/tests
   1939 external/wpt
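For anyone who wants to slice the list by a different directory depth, the pipeline above amounts to counting tests by their first two path components. A minimal sketch, where `top_prefixes` and the sample test names are hypothetical:

```python
from collections import Counter


def top_prefixes(test_names, depth=2, limit=10):
    """Count tests per leading `depth` path components, largest last."""
    counts = Counter("/".join(name.split("/")[:depth]) for name in test_names)
    # Sort ascending by count and keep the tail, like `sort -n | tail`.
    return sorted(counts.items(), key=lambda item: item[1])[-limit:]


# Made-up test names standing in for the first column of the dump.
sample = [
    "external/wpt/a.html",
    "external/wpt/b.html",
    "http/tests/c.html",
]
for prefix, count in top_prefixes(sample):
    print(count, prefix)
```

Passing `depth=3` would break down large buckets such as external/wpt further, which may help when looking for a common cause.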