False rejects caused by mini_installer_tests on Windows. Test suite runner needs a built-in retry mechanism.
Issue description

CL: https://chromium-review.googlesource.com/c/chromium/src/+/1385566/2 [CL is unrelated]
Failing build: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win10_chromium_x64_rel_ng/163385
Succeeding build: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win10_chromium_x64_rel_ng/163441
Test name: __main__.InstallerTest.ChromeUserLevel

In the failing build, the test failed twice ['with patch', 'retry with patch'] and succeeded once ['without patch']. In the succeeding build, the test failed once ['with patch'] and succeeded twice ['without patch', 'retry with patch'].

I haven't investigated deeply, but it's possible that the problem is caused by state leaking from the previous task.

Example failing tasks:
https://chromium-swarm.appspot.com/task?id=41e1e7f9b5727a10&refresh=10&show_raw=1
https://chromium-swarm.appspot.com/task?id=41e212250724e510&refresh=10&show_raw=1

In each of the failing tasks, it seems like test_installer.py is finding a crash at the very beginning of its execution:

"""
Command: C:\b\s\w\ir\.swarming_module_cache\vpython\9ff274\Scripts\python.exe ..\..\testing\scripts\run_isolated_script_test.py --isolated-script-test-output=C:\b\s\w\iooptvf7\results.json ../../chrome/test/mini_installer/test_installer.py --output-dir=C:\b\s\w\iooptvf7 --isolated-script-test-filter=__main__.InstallerTest.ChromeUserLevel --isolated-script-test-output=C:\b\s\w\iooptvf7\output.json --isolated-script-test-perf-output=C:\b\s\w\iooptvf7\perftest-output.json

[1219/173543:test_installer.py(504)] Config found at C:\b\s\w\ir\chrome\test\mini_installer\config\config.config
[1219/173543:test_installer.py(542)] Setting --force-clean
ChromeUserLevel: no_pv -> install_chrome_user -> chrome_user_installed_not_inuse -> test_chrome_with_chromedriver_user -> chrome_user_installed_not_inuse -> uninstall_chrome_user -> clean
...
[1219/173546:test_installer.py(190)] Verifying state no_pv
[1219/173546:test_installer.py(157)] Beginning action install_chrome_user
[1219/173553:test_installer.py(160)] Finished action install_chrome_user
[1219/173553:test_installer.py(190)] Verifying state chrome_user_installed_not_inuse
[1219/173553:test_installer.py(157)] Beginning action test_chrome_with_chromedriver_user
[1219/173557:test_installer.py(231)] stdout: [1219/173557:test_chrome_with_chromedriver.py(85)] Saved Chrome crash dump to C:\b\s\w\iooptvf7\5f19a176-59f5-4802-9591-d62ddaf91a9a.dmp
"""

which then causes the first test to fail:

"""
[1219/165431:test_chrome_with_chromedriver.py(131)] Saved Chrome log to C:\b\s\w\io_rbw2i\tmpvdh7yq
Traceback (most recent call last):
  File "test_chrome_with_chromedriver.py", line 188, in <module>
    sys.exit(main())
  File "test_chrome_with_chromedriver.py", line 183, in main
    'The page body was not correct')
  File "C:\b\s\w\ir\.swarming_module\bin\Lib\contextlib.py", line 24, in __exit__
    self.gen.next()
  File "test_chrome_with_chromedriver.py", line 135, in CreateChromedriver
    report_count)
Exception: Failing test due to 1 crash reports found
ERROR
"""
Dec 21
Thanks for pointing this my way. This is a flaky crash in Chrome caught by this test. The DCHECK causing the crash is under discussion in issue 916461. I hope to have it removed very soon (https://chromium-review.googlesource.com/c/chromium/src/+/1384025) so that the flaking stops. I'll dup this into the general "the test is flaky" bug.
Dec 21
I think there are actually two separate problems here, so I'm unduping this.

1) There's a flaky test due to a bug in Chrome.
2) The test suite does not have retries built in.

Please see: https://docs.google.com/document/d/1O9nzVMA6rEe2-rhjsni_8wS9Lw4OnFpK2NE0wWB5VaY/edit#

We can never prevent flaky tests from landing on tip of trunk, so in order to lower the impact of flaky tests on false rejects, we *already* have many retry layers. We should try to keep these retries as cheap and efficient as possible. As such, test suite runners should have a retry layer of their own, so that we can skip the more expensive retry layers.
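To make the "cheap in-suite retry layer" idea concrete, here is a minimal sketch (hypothetical names and limits; this is not the actual test_installer.py or recipe implementation) of a runner that re-runs only the failing tests before reporting them, so the more expensive 'retry with patch' / 'without patch' steps can usually be skipped:

import unittest


def run_with_retries(tests, retry_limit=3):
    # Run each TestCase instance, retrying a failure up to retry_limit
    # extra times; only tests that never pass are reported as failed.
    failed = []
    for test in tests:
        for _attempt in range(retry_limit + 1):
            result = unittest.TestResult()
            test.run(result)
            if result.wasSuccessful():
                break  # Passed (possibly after a flaky failure).
        else:
            failed.append(test.id())  # Failed on every attempt.
    return failed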
Jan 3
Thanks for the pointer. That's an enlightening doc. dpranke: does typ have facilities for retrying failing tests, or does each Python test need to implement its own retry mechanism?
Jan 3
Ah, I see the word "retry" in the typ source (and a --retry-limit flag), so maybe now is the right time to move this test to using typ (issue 784301). I see that //tools/metrics:metrics_python_tests depends on typ in third_party/catapult/third_party/typ. Is that the right way to use it in other Python-based tests?
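A rough sketch of what that hand-off might look like, assuming typ can be used as a library (only --retry-limit is confirmed from the typ source above; typ.main() and the --top-level-dir flag are assumptions here, and the values are invented for illustration):

import sys

import typ  # Assumed importable from third_party/catapult/third_party/typ.


if __name__ == '__main__':
    # Let typ discover and run the tests, re-running each failure up to
    # three times via its built-in retry layer.
    sys.exit(typ.main(argv=[
        '--retry-limit', '3',
        '--top-level-dir', '.',  # Assumed flag: directory to discover tests in.
        '-v',
    ]))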
Jan 4
I believe that's correct. dpranke will know best.
Jan 4
typ does have a mechanism for handling failing tests, but it doesn't currently interact properly with mini_installer_tests due to bug 855728. I can fix that (which is actually near the top of my to-do list, finally), and then we should be all set.
Jan 5
Thanks, Dirk. I took a look at typ and metrics_python_tests, and it looks like test_installer.py will need to use a different strategy to provide the set of tests to typ compared to metrics_python_tests. test_installer.py builds the set of tests to run based on the contents of a configuration file. Does typ have a facility for being given a TestSuite or a list of TestCase instances?
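To make the question concrete, here is a hedged sketch (invented config format and names; not the real test_installer.py logic) of the workaround that would be needed if typ only supports discovery-style loading: synthesizing module-level TestCase classes from the config so a discovering loader can find them, rather than handing it a pre-built TestSuite.

import json
import unittest


def _make_test_method(traversal):
    def test(self):
        # Placeholder for walking the configured state machine
        # (verify state -> run action -> verify state -> ...).
        self.assertTrue(traversal)
    return test


def load_config_tests(config_path):
    # Turn each entry in the (hypothetical) JSON config into a TestCase
    # subclass exposed at module level, where discovery can see it.
    with open(config_path) as f:
        config = json.load(f)
    for name, traversal in config['tests'].items():
        cls = type(str(name), (unittest.TestCase,),
                   {'test_traversal': _make_test_method(traversal)})
        globals()[name] = cls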