
Issue 598406 link

Starred by 3 users

Issue metadata

Status: Verified
Owner:
Last visit > 30 days ago
Closed: Jul 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocking:
issue 506227
issue 631087




disable CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator

Project Member Reported by sorin@chromium.org, Mar 28 2016

Issue description

This test has been found flaky on https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29

It is unclear which commit introduced this flakiness.
The first occurrence (last 200 builds from present) happened on
Builder Win7 Tests (dbg)(1) Build #47103
Revision	c84d04ac9fdf9752bd3ea79180df7ba5cca2a6cb
Start	Tue Mar 22 10:17:06 2016

It has been failing sporadically on this bot, last seen at:
https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29/builds/47278
Revision	9011b1f0bf8dbf920a95bc32f9ba7c01f0d6c28f
Mon Mar 28 12:19:54 2016

Retrying 1 test (retry #1)
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\124C.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_19541\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\124C.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_19541\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\124C.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_19541\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
[ RUN      ] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator

[16062/16062] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator (TIMED OUT)
Retrying 1 test (retry #2)
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\C22C.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_13300\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\C22C.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_13300\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\C22C.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_13300\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
[ RUN      ] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator

[16063/16063] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator (TIMED OUT)
Retrying 1 test (retry #3)
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\71FC.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_25417\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\71FC.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_25417\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
Still waiting for the following processes to finish:
	"..\out\Debug\net_unittests.exe" --brave-new-test-launcher --gtest_flagfile="C:\Users\chrome-bot\AppData\Local\Temp\scoped_dir3528_24217\71FC.tmp" --single-process-tests --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\3528_25417\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\work\isolated\out84yxvy\output.json"
[ RUN      ] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator

[16064/16064] CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator (TIMED OUT)
1 test timed out:
    CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator (c:\b\build\slave\win_builder__dbg_\build\src\net\cert\ct_log_verifier_unittest.cc:490)
Tests took 962 seconds.
Still waiting for the following processes to finish:
Additional test environment:
    CHROME_DEVEL_SANDBOX=/opt/chromium/chrome_sandbox
    LANG=en_US.UTF-8
Command: ..\out\Debug\net_unittests.exe --brave-new-test-launcher --test-launcher-bot-mode --test-launcher-summary-output=e:\b\swarm_slave\work\isolated\out84yxvy\output.json
Components: Internals>Network>CertTrans Internals>Network>SSL
Owner: eranm@chromium.org
Eh, I trust bots less than I trust this test, all things considered. For all we know, it could be fallout from VS2015, given how 'simple' this test is.

But setting to eran for triage.

Comment 2 by eranm@chromium.org, Mar 28 2016

Can the timeout be extended for this test?
I am well aware that this test is slow. It is computationally intensive, as it builds Merkle trees of up to 256 nodes and tries all possible combinations for proofs.
It does not time out because it's waiting for something to happen. I have a lot of confidence in this test and would rather not disable it.

Comment 3 by sorin@chromium.org, Mar 28 2016

I don't know how to extend the timeout for this test; that would be something to discuss with the troopers or chrome-infra people. The proposed change disables the test just on Windows, where we see the flakiness. Since the code appears to be platform neutral, this can give the needed coverage while working out the timeout issues.
Eran: In general, we try not to have tests that run as long as yours. I see that in one pass it succeeds in 51979 ms (perhaps https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29/builds/47278/steps/net_unittests%20on%20Windows-7-SP1/logs/stdio is a bug - it indicates the test fails, but the command-line output indicates it succeeds), but the timeout was 5 minutes, and if a single test takes that long, that's already bad.

Even having a test run in several seconds is generally considered 'poor form'. I'd like to rethink this test and its computation.

Comment 5 by bugdroid1@chromium.org, Mar 28 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7fc311e2a286633106ee1ce82a7be9ed56e910d1

commit 7fc311e2a286633106ee1ce82a7be9ed56e910d1
Author: sorin <sorin@chromium.org>
Date: Mon Mar 28 21:32:48 2016

disable CTLogVerifierTest.VerifiesValidConsistencyProofsFromReferenceGenerator

The test has been found flaky on this bot:
https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29

The flakiness started with build #47103 (revision c84d04ac9fdf9752bd3ea79180df7ba5cca2a6cb, Mar 22 10:17), which is the earliest occurrence within the most recent 200 builds to date.
Failed steps: browser_tests on Windows-7-SP1, net_unittests on Windows-7-SP1.

TBR=rsleevi@chromium.org
BUG= 598406 

Review URL: https://codereview.chromium.org/1837013002

Cr-Commit-Position: refs/heads/master@{#383568}

[modify] https://crrev.com/7fc311e2a286633106ee1ce82a7be9ed56e910d1/net/cert/ct_log_verifier_unittest.cc

Eran: Questions to ask:

1) Is this test necessary? What functionality is it exercising?
2) Could we accomplish this test with fewer nodes? For example, 5? Why or why not?
3) This test computes a tree using one method, and then verifies via another. Could we check in a reference tree to be verified? Why or why not?
4) Are there perhaps opportunities to optimize this code to better improve testing?
Blocking: 506227
Components: -Internals>Network>SSL
Status: Assigned (was: Untriaged)
I'd like to follow up on this, since I'm currently adding inclusion/audit proof checking to CTLogVerifier. If I just adapt the tests from https://github.com/google/certificate-transparency/blob/c41b090ecc14ddd6b3531dc7e5ce36b21e253fdd/cpp/merkletree/merkle_tree_test.cc, as Eran did for the consistency proof tests, then I'll be introducing a second test that takes a similarly long time, which seems like a bad idea. I could combine the two tests, to save on tree building time, but that only avoids worsening the issue. Instead, if we could fix this issue with the existing tests, then I should be able to incorporate the new tests in the same way.

Eran, did you have any thoughts on Ryan's questions? Here's my two cents:
1) The test isn't absolutely necessary, as we could settle for just checking the edge cases (see https://github.com/google/certificate-transparency/blob/master/python/ct/crypto/merkle_test.py). However, it does make us more confident in the correctness of the implementation.
2) We're averse to reducing the number of nodes below 256, as we've previously had tests that used fewer nodes and subsequently discovered a bug that only appeared with a larger number. The Python tests linked above do settle for testing only trees up to 4 nodes in size, though. It's a question of how confident we want to be.
3) This test generates 256 distinct trees, so we can't check in just a single reference tree instead. Some of them have identical subtrees though.
4) I think this code could be optimized quite a bit. For example, it unnecessarily re-hashes portions of the trees. I've quickly tested caching the result of calling ReferenceMerkleTreeHash() and reduced the test duration by 50%.

Comment 8 by eranm@chromium.org, Jul 12 2016

To answer Ryan's questions from comment 6:
(1) Yes, this test is necessary to give us confidence that the Chromium implementation of the consistency check algorithm is correct.
You could argue that this is an implementation of an algorithm that's not unique to Chrome and so is tested elsewhere, but that will not provide safety against changes to the Chromium implementation. (My view is that tests are there as a constant safety net for all kinds of code changes, both major ones and minor "safe" refactorings.)

(2) There have been bugs that were not uncovered by tests using a smaller tree (to do with edge cases in the sizes of the trees we check consistency between).

(3) It generates 128 Merkle trees - we could check in all of them, but Rob may have a more elegant solution (see next item).

(4) Yes, Rob has a few suggestions - he'll detail them in the bug before implementing.

What I'd like to understand better is why is it a problem that this test is taking a long time.
- Is it the flakiness it introduces? In that case, can we increase the timeout for this particular test?
- Is it that it slows down net_unittests, slowing down development flow of code/run/test? In that case, how about breaking it out to a separate binary?
- Is it processing time on the trybots that's expensive? If so, running it on just a handful of bots should be enough as this code is platform-independent. 

I'd like to find a compromise that addresses as many as possible of the scenarios where long test runtime is a problem, while still having the test run in at least some configurations.

Comment 9 by eranm@chromium.org, Jul 12 2016

Cc: robpercival@chromium.org
Blocking: 631087
Labels: -Pri-3 Pri-2
Owner: robpercival@chromium.org
Status: Started (was: Assigned)
Lacking responses to Eran's questions, I've gone ahead and tested a couple of options. The first was a quick and dirty implementation of hash caching (https://codereview.chromium.org/2178153003/), which reduced the test duration by 50%. The second was converting the long-running test into a value-parameterized test (https://codereview.chromium.org/2187043004/), which splits it into 8256 test cases each taking ~3ms. This increases the overall test duration by ~85%.

It may be possible to combine these two approaches, in order to at least offset some of the reduction in performance. There may be performance gains to be had by optimising the implementation of the parameterized tests as well.
Sorry, I'd missed the questions.

For context, and further reference:
- We don't want to increase test durations. We don't have a way to do this per-test, but in general, the longer a single test runs, the more problematic it is for the overall infrastructure.
- Splitting tests up generally works well. On our production infrastructure, tests are distributed via swarming, which allows content-addressable storage of build artifacts and the ability to shard out tests to many slaves. The more monolithic a single test is, the longer a given slave works. Splitting one test into two often means those tests can run in parallel, which is a net win even if running them serially would be a net loss.
- Adding new test binaries is significantly more expensive, because it requires wiring up all the build infrastructure to know about a new test harness, its build artifacts, its environmental needs, etc. With the exception of fuzzers, splitting into new binaries is a net loss. It also doesn't generally improve the edit/compile/test cycle, since you still want to ensure all components are appropriately tested - at best, it adds cognitive overhead for developers, who need to remember to run that extra test binary.


From a design philosophy of tests, we shouldn't be trying to design tests that are Turing-complete or that test all possible permutations of inputs. We do want complete coverage, though - all edge cases reasonably exercised by tests, and state dependencies tracked. I'm still concerned that this test represents an overly broad set of tests, and that it'd be sufficient to ensure cyclomatic coverage, but in the absence of public coverage bots, that's not something I'm going to argue too hard for.

Comment 13 by bugdroid1@chromium.org, Jul 29 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/844c3d233e4b9a47cb1d8a5b2f60811200a8310c

commit 844c3d233e4b9a47cb1d8a5b2f60811200a8310c
Author: robpercival <robpercival@chromium.org>
Date: Fri Jul 29 00:44:03 2016

Makes an expensive CTLogVerifier test into a value-parameterized test

As recommended by Ryan Sleevi in https://codereview.chromium.org/2178153003/,
this turns one test case with nested loops into a value-parameterized test.
This avoids having a single test case that runs for ~13s in exchange for
128 test cases that each take ~1-350ms (proportional to the tree size).
It reduces the chance of the test case timing out, but increases the
overall test duration by ~3s.

BUG= 598406 

Review-Url: https://codereview.chromium.org/2187043004
Cr-Commit-Position: refs/heads/master@{#408534}

[modify] https://crrev.com/844c3d233e4b9a47cb1d8a5b2f60811200a8310c/net/cert/ct_log_verifier_unittest.cc

Labels: M-54
Status: Verified (was: Started)
