site_per_process_webkit_layout_tests flaky on linux_chromium_rel_ng
Issue description
Twice in a row, different tests in site_per_process_webkit_layout_tests failed on an unrelated change. Failed runs:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng/60195
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng/60288
Marking as P1 since it's affecting the CQ.
,
Apr 4 2018
dpranke@ - is there a way to compare flakiness rates between 1) the site_per_process_webkit_layout_tests step and 2) the webkit_layout_tests step? Depending on the result of that comparison, we can either try some site-per-process-wide fixes or just fall back to the normal sheriffing process of opening bugs and disabling flaky tests.
,
Apr 4 2018
Not easily. We can point you at the BQ tables for you to run queries, I think, but that'll take some hand holding. +wylieb@ - maybe you can help lukasza@ figure out how we'd look at this?
,
Apr 4 2018
I added your @google account to the test-results-hrd table. You probably know best what constitutes a flake or a failure for your specific needs. My suggestion would be to look at the failure rate, since that's much more obvious to calculate. If your tests are flaking often, then the tests with the highest failure rate might just be your flaky tests. Take a peek at the schema, and reach out to me with any questions about it.
Some tidbits that have helped me:
1. Flip the bit for standard SQL (you can find the switch under 'Show Options' after you compose a query).
2. Use _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 day) to make your queries a bit faster.
Here's the storage we use (you should have access): https://bigquery.cloud.google.com/results/test-results-hrd
Here's a tiny query to get you started:
SELECT
step_name,
run.name as test_name,
run.actual as results
FROM `test-results-hrd.events.test_results`
WHERE
step_name like '%site_per_process_webkit_layout_tests%'
AND ARRAY_LENGTH(run.actual) = 4
AND _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 day)
,
Apr 5 2018
I've run the following query twice - 1) once filtering for step_name like 'webkit_layout_tests%' and 2) once filtering for step_name like 'site_per_process_webkit_layout_tests%'
SELECT
COUNT(*),
(ARRAY_LENGTH(run.actual) > 1)
FROM `test-results-hrd.events.test_results`
WHERE
step_name like 'webkit_layout_tests%'
AND buildbot_info.builder_name = 'linux_chromium_rel_ng'
AND _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 day)
GROUP BY
(ARRAY_LENGTH(run.actual) > 1);
Results:
+----------------------------------------+---------------------+--------------------------------------+
| | webkit_layout_tests | site_per_process_webkit_layout_tests |
+----------------------------------------+---------------------+--------------------------------------+
| FLAKES = ARRAY_LENGTH(run.actual) > 1 | 188086 | 290361 |
| NORMAL = ARRAY_LENGTH(run.actual) <= 1 | 206508484 | 221670504 |
| FLAKES / NORMAL as percentage | 0.09% | 0.13% |
+----------------------------------------+---------------------+--------------------------------------+
Based on that, there is indeed a slight increase in the flakiness rate in site_per_process_webkit_layout_tests (0.13% vs 0.09%), but the magnitude of flakiness is comparable across site_per_process_webkit_layout_tests and webkit_layout_tests. Therefore, I think there is no point in trying to evaluate the flakiness/stability of site-per-process runs as a whole; we should just focus on individual flaky tests (having tree sheriffs open bugs and disable the tests).
Thoughts?
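For anyone who wants to re-check the last row of the table, here is a minimal Python sketch that recomputes the percentages from the raw counts above. The names COUNTS and flake_percentage are made up for illustration, and the script only redoes the arithmetic; it does not query BigQuery.

```python
# Counts copied from the results table above
# (linux_chromium_rel_ng, 7-day window). Names here are illustrative only.
COUNTS = {
    "webkit_layout_tests": {"flakes": 188086, "normal": 206508484},
    "site_per_process_webkit_layout_tests": {"flakes": 290361, "normal": 221670504},
}


def flake_percentage(flakes, normal):
    """Return FLAKES / NORMAL as a percentage, as in the table's last row."""
    return 100.0 * flakes / normal


# Per-step flake rate, in percent.
rates = {
    step: flake_percentage(c["flakes"], c["normal"]) for step, c in COUNTS.items()
}

for step, rate in rates.items():
    print(f"{step}: {rate:.2f}%")

# How much flakier the site-per-process step is, relative to the default step.
ratio = rates["site_per_process_webkit_layout_tests"] / rates["webkit_layout_tests"]
print(f"relative increase: {ratio:.2f}x")
```

The relative increase comes out somewhere between 1.4x and 1.5x, which is what I mean by "slight": both rates stay in the same order of magnitude.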
,
Apr 5 2018
Based on #c5 this also doesn't seem like a P1.
,
Apr 11 2018
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Linux%20Tests looks like it's been pretty red today from site_per_process_webkit_layout_tests. These more recent failures seem to be timeouts, mostly in compositing-heavy tests (filters, blurs, webgl). I wonder if something has gotten slower somewhere in the graphics pipeline and pushed these tests over the timeout cliff? Issue 831378 is maybe a dupe of this, although it also lists webkit_layout_tests. I've seen those failing with similar timeouts.
,
Nov 1
I think there are no known, broad sources of flakiness in site_per_process_webkit_layout_tests anymore. The last known issue was fixed in r584841. Please open a new bug if there is any more trouble from site_per_process_webkit_layout_tests (note that in https://crrev.com/c/1302662 I plan to make site-per-process the default).
Comment 1 by creis@chromium.org, Apr 3 2018
Owner: lukasza@chromium.org