disable CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator
Issue description:
This test has been found flaky on https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29. It is unclear which commit introduced the flakiness.

The first occurrence (within the last 200 builds) was on builder Win7 Tests (dbg)(1), build #47103, revision c84d04ac9fdf9752bd3ea79180df7ba5cca2a6cb, started Tue Mar 22 10:17:06 2016. It has been failing sporadically on this bot since then, most recently at https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29/builds/47278, revision 9011b1f0bf8dbf920a95bc32f9ba7c01f0d6c28f, Mon Mar 28 12:19:54 2016.

Test launcher log (the "Still waiting for the following processes to finish" line appears three times per retry in the original output; duplicates trimmed):

Retrying 1 test (retry #1)
Still waiting for the following processes to finish:
  "..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\124C.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_19541\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
[ RUN ] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator
[16062/16062] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator (TIMED OUT)

Retrying 1 test (retry #2)
Still waiting for the following processes to finish:
  "..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\C22C.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_13300\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
[ RUN ] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator
[16063/16063] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator (TIMED OUT)

Retrying 1 test (retry #3)
Still waiting for the following processes to finish:
  "..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\71FC.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_25417\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
[ RUN ] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator
[16064/16064] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator (TIMED OUT)

1 test timed out: CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator (c:\b\build\slave\win_builder__dbg_\build\src\net\cert\ct_log_verifier_unittest.cc:490)
Tests took 962 seconds.

Additional test environment:
  CHROME_DEVEL_SANDBOX=/opt/chromium/chrome_sandbox
  LANG=en_US.UTF-8
Command: ..\out\Debug\net_unittests.exe --brave-new-test-launcher --test-launcher-bot-mode --test-launcher-summary-output=e:\b\swarm_slave\work\isolated\out84yxvy\output.json
Mar 28 2016
Can the timeout be extended for this test? I am well aware that this test is slow: it is computationally intensive, as it builds Merkle trees of up to 256 nodes and tries all possible combinations of proofs. It does not time out because it is waiting for something to happen. I have a lot of confidence in this test and would rather not disable it.
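For a rough sense of scale, here is a small illustration (not part of the actual test) of why an exhaustive strategy is expensive: it verifies one consistency proof per (snapshot, tree size) pair, so the work grows quadratically with the maximum tree size. The tree sizes in the loop are illustrative.

// Illustrative only: the exhaustive strategy verifies one consistency proof
// for every (snapshot, tree_size) pair with 1 <= snapshot <= tree_size <= N,
// i.e. N * (N + 1) / 2 verifications, each of which hashes part of the tree.
#include <cstddef>
#include <cstdio>

int main() {
  const std::size_t kTreeSizes[] = {4, 64, 128, 256};
  for (std::size_t n : kTreeSizes) {
    std::size_t verifications = n * (n + 1) / 2;
    std::printf("max tree size %3zu -> %6zu proof verifications\n",
                n, verifications);
  }
  return 0;
}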
Mar 28 2016
I don't know how to extend the timeout for this test; that would be something to discuss with the troopers or chrome-infra people. The proposed change disables the test only on Windows, where we see the flakiness. Since the code appears to be platform-neutral, the other platforms can still give the needed coverage while the timeout issues are worked out.
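For reference, the usual Chromium pattern for disabling a test on a single platform looks roughly like the sketch below. This is a hedged illustration against the existing CTLogVerifierTest fixture in ct_log_verifier_unittest.cc, not necessarily the exact change that landed.

// Sketch of the common Chromium MAYBE_/DISABLED_ pattern for skipping a test
// on one platform only; the actual CL may have disabled it differently.
#include "build/build_config.h"

// https://crbug.com/598406: times out on Win7 Tests (dbg)(1).
#if defined(OS_WIN)
#define MAYBE_VerifiesValidConsistencyProofsFromReferenceGenerator \
    DISABLED_VerifiesValidConsistencyProofsFromReferenceGenerator
#else
#define MAYBE_VerifiesValidConsistencyProofsFromReferenceGenerator \
    VerifiesValidConsistencyProofsFromReferenceGenerator
#endif

// CTLogVerifierTest is the existing fixture in ct_log_verifier_unittest.cc.
TEST_F(CTLogVerifierTest,
       MAYBE_VerifiesValidConsistencyProofsFromReferenceGenerator) {
  // ... existing test body, unchanged ...
}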
Mar 28 2016
Eran: In general, we try not to have tests that run as long as yours. I see that in one pass it succeeds in 51979ms (perhaps the log at https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29/builds/47278/steps/net_unittests%20on%20Windows-7-SP1/logs/stdio is itself buggy: it indicates the test fails, while the command-line output indicates it succeeds), but the timeout was 5 minutes, and if a single test takes that long, that's already bad. Even having a test run for several seconds is generally considered 'poor form'. I'd like to rethink this test and its computation.
Mar 28 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/7fc311e2a286633106ee1ce82a7be9ed56e910d1

commit 7fc311e2a286633106ee1ce82a7be9ed56e910d1
Author: sorin <sorin@chromium.org>
Date: Mon Mar 28 21:32:48 2016

disable CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator

The test has been found flaky on this bot:
https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29

This started on Mar 22 10:17, revision c84d04ac9fdf9752bd3ea79180df7ba5cca2a6cb, failure #47103, which is the earliest occurrence within the most recent 200 builds to date.
Failed steps:
  browser_tests on Windows-7-SP1
  net_unittests on Windows-7-SP1

TBR=rsleevi@chromium.org
BUG=598406

Review URL: https://codereview.chromium.org/1837013002
Cr-Commit-Position: refs/heads/master@{#383568}

[modify] https://crrev.com/7fc311e2a286633106ee1ce82a7be9ed56e910d1/net/cert/ct_log_verifier_unittest.cc
Mar 28 2016
Eran: Questions to ask:
1) Is this test necessary? What functionality is it exercising?
2) Could we accomplish this test with fewer nodes, for example 5? Why or why not?
3) This test computes a tree using one method and then verifies it via another. Could we check in a reference tree to be verified? Why or why not?
4) Are there opportunities to optimize this code to improve testing?
Jul 8 2016
I'd like to follow up on this, since I'm currently adding inclusion/audit proof checking to CTLogVerifier. If I just adapt the tests from https://github.com/google/certificate-transparency/blob/c41b090ecc14ddd6b3531dc7e5ce36b21e253fdd/cpp/merkletree/merkle_tree_test.cc, as Eran did for the consistency proof tests, then I'll be introducing a second test that takes a similarly long time, which seems like a bad idea. I could combine the two tests to save on tree-building time, but that only avoids worsening the issue. If we can instead fix this issue for the existing tests, then I should be able to incorporate the new tests in the same way.

Eran, did you have any thoughts on Ryan's questions? Here's my two cents:

1) The test isn't absolutely necessary, as we could settle for just checking the edge cases (see https://github.com/google/certificate-transparency/blob/master/python/ct/crypto/merkle_test.py). However, it does make us more confident in the correctness of the implementation.
2) We're averse to reducing the number of nodes below 256: we've previously had tests that used fewer nodes and subsequently discovered a bug that only appeared with a larger number. The Python tests linked above do settle for testing only trees of up to 4 nodes, though. It's a question of how confident we want to be.
3) This test generates 256 distinct trees, so we can't check in just a single reference tree instead. Some of them have identical subtrees, though.
4) I think this code could be optimized quite a bit. For example, it unnecessarily re-hashes portions of the trees. I quickly tested caching the result of calling ReferenceMerkleTreeHash() and reduced the test duration by 50% (see the sketch below).
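To illustrate the kind of caching mentioned in point 4, here is a self-contained sketch of memoizing a recursive reference Merkle tree hash. The hash function is a stand-in (not SHA-256 and not the helper used in ct_log_verifier_unittest.cc); the point is only the (start, count)-keyed cache that avoids re-hashing identical subtrees.

// Sketch: memoize the recursive reference Merkle tree hash so that repeated
// proof generation/verification over overlapping subtrees does not re-hash
// the same ranges. Stand-in hash; real code would use SHA-256.
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace {

// Largest power of two strictly less than |n| (requires n >= 2), per RFC 6962.
size_t SplitPoint(size_t n) {
  size_t split = 1;
  while (split << 1 < n)
    split <<= 1;
  return split;
}

// Stand-in for a cryptographic hash of |data|.
std::string ToyHash(const std::string& data) {
  return std::to_string(std::hash<std::string>()(data));
}

class CachingReferenceHasher {
 public:
  explicit CachingReferenceHasher(std::vector<std::string> leaves)
      : leaves_(std::move(leaves)) {}

  // Reference Merkle tree hash of leaves_[start, start + count), with results
  // cached by (start, count) so identical subtrees are hashed only once.
  std::string TreeHash(size_t start, size_t count) {
    if (count == 0)
      return ToyHash("");
    if (count == 1)
      return ToyHash(std::string(1, '\x00') + leaves_[start]);
    std::pair<size_t, size_t> key(start, count);
    auto it = cache_.find(key);
    if (it != cache_.end())
      return it->second;
    size_t split = SplitPoint(count);
    std::string result =
        ToyHash(std::string(1, '\x01') + TreeHash(start, split) +
                TreeHash(start + split, count - split));
    cache_[key] = result;
    return result;
  }

 private:
  std::vector<std::string> leaves_;
  std::map<std::pair<size_t, size_t>, std::string> cache_;
};

}  // namespace

int main() {
  std::vector<std::string> leaves;
  for (int i = 0; i < 128; ++i)
    leaves.push_back("leaf" + std::to_string(i));
  CachingReferenceHasher hasher(leaves);
  // Hashing every prefix of the tree now reuses cached subtree hashes.
  for (size_t n = 1; n <= leaves.size(); ++n)
    hasher.TreeHash(0, n);
  return 0;
}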
Jul 12 2016
To answer Ryan's questions from comment 6:

(1) Yes, this test is necessary to give us confidence that the Chromium implementation of the consistency check algorithm is correct. You could argue that this is an implementation of an algorithm that's not unique to Chrome and so is tested elsewhere, but that would not provide safety against changes to the Chromium implementation. (My view is that tests are there as a constant safety net for all kinds of code changes, both major ones and minor "safe" refactorings.)
(2) There have been bugs that were not uncovered by tests using a smaller tree (to do with edge cases in the sizes of the trees we check consistency between).
(3) It generates 128 Merkle trees. We could check in all of them, but Rob may have a more elegant solution (see the next item).
(4) Yes, Rob has a few suggestions; he'll detail them in the bug before implementing.

What I'd like to understand better is why it is a problem that this test takes a long time.
- Is it the flakiness it introduces? In that case, can we increase the timeout for this particular test?
- Is it that it slows down net_unittests, slowing down the code/run/test development flow? In that case, how about breaking it out into a separate binary?
- Is it that processing time on the trybots is expensive? If so, running it on just a handful of bots should be enough, as this code is platform-independent.

I'd like to find a compromise that addresses as many as possible of the scenarios where long test runtime is a problem, while still having the test run in at least some configurations.
Jul 28 2016
Lacking responses to Eran's questions, I've gone ahead and tested a couple of options. The first was a quick-and-dirty implementation of hash caching (https://codereview.chromium.org/2178153003/), which reduced the test duration by 50%. The second was converting the long-running test into a value-parameterized test (https://codereview.chromium.org/2187043004/), which splits it into 8256 test cases each taking ~3ms; this increases the overall test duration by ~85%. It may be possible to combine the two approaches, to at least offset some of the loss in performance. There may also be performance gains to be had by optimising the implementation of the parameterized tests.
Jul 28 2016
Sorry, I'd missed the questions. For context, and future reference:

- We don't want to increase test durations. We don't have a way to do this per-test and, in general, the longer a single test runs, the more problematic it is for the overall infrastructure.
- Splitting tests up generally works well. On our production infrastructure, tests are distributed via Swarming, which provides content-addressable storage of build artifacts and the ability to shard tests out to many slaves. The more monolithic a single test is, the longer a given slave will work on it. Splitting one test into two often means the two can run in parallel, which is a net win even if running them sequentially would be a net loss.
- Adding new test binaries is significantly more expensive, because it requires wiring up all the build infrastructure to know about a new test harness, its build artifacts, its environmental needs, etc. With the exception of fuzzers, splitting into new binaries is a net loss. It also doesn't generally improve the edit/compile/test cycle, since you still want to ensure all components are appropriately tested; at best, it adds cognitive overhead for developers, who need to know to run that extra test binary.

From a test design philosophy standpoint, we shouldn't be trying to design tests that are Turing-complete or that exercise all possible permutations of inputs. We do want complete coverage, though: all edge cases should be reasonably exercised by tests, and state dependencies should be tracked. I'm still concerned that this test represents an overly broad set of tests, and that it would be sufficient to ensure we had cyclomatic coverage, but in the absence of public coverage bots, that's not something I'm going to argue too hard for.
Jul 29 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/844c3d233e4b9a47cb1d8a5b2f60811200a8310c

commit 844c3d233e4b9a47cb1d8a5b2f60811200a8310c
Author: robpercival <robpercival@chromium.org>
Date: Fri Jul 29 00:44:03 2016

Makes an expensive CTLogVerifier test into a value-parameterized test

As recommended by Ryan Sleevi in https://codereview.chromium.org/2178153003/, this turns one test case with nested loops into a value-parameterized test. This avoids having a single test case that runs for ~13s, in exchange for 128 test cases that take ~1-350ms each (proportional to the tree size). It reduces the chance of the test case timing out, but increases the overall test duration by ~3s.

BUG=598406

Review-Url: https://codereview.chromium.org/2187043004
Cr-Commit-Position: refs/heads/master@{#408534}

[modify] https://crrev.com/844c3d233e4b9a47cb1d8a5b2f60811200a8310c/net/cert/ct_log_verifier_unittest.cc
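For readers following along, here is a standalone sketch of the general shape of such a conversion, using plain gtest rather than the Chromium harness; the fixture, helper, and parameter range are illustrative, and the landed CL above is the authoritative version.

// Sketch: turn one long test with nested loops into a value-parameterized
// test, one test case per final tree size, so no single case runs long.
#include <cstddef>
#include <gtest/gtest.h>

namespace {

class ConsistencyProofTest : public ::testing::TestWithParam<size_t> {
 protected:
  // In the real test this would build (or share) the reference Merkle tree,
  // generate a reference consistency proof for every snapshot of the tree,
  // and EXPECT_TRUE that the CTLogVerifier under test accepts it.
  void VerifyAllProofsForTreeSize(size_t tree_size) {
    for (size_t snapshot = 1; snapshot <= tree_size; ++snapshot) {
      EXPECT_LE(snapshot, tree_size);  // Placeholder for the real checks.
    }
  }
};

TEST_P(ConsistencyProofTest, VerifiesValidConsistencyProofs) {
  VerifyAllProofsForTreeSize(GetParam());
}

// One test case per tree size from 1 to 128. (2016-era gtest spells this
// macro INSTANTIATE_TEST_CASE_P; newer releases use INSTANTIATE_TEST_SUITE_P.)
INSTANTIATE_TEST_CASE_P(AllTreeSizes,
                        ConsistencyProofTest,
                        ::testing::Range<size_t>(1, 129));

}  // namespace

Each parameter value shows up as a separately named test case (for example AllTreeSizes/ConsistencyProofTest.VerifiesValidConsistencyProofs/17), which the test launcher can retry or shard independently.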
Jul 29 2016
Comment 1 by rsleevi@chromium.org, Mar 28 2016
Owner: eranm@chromium.org