URLRequestQuicTest.CancelPushIfCached & CancelPushIfCached2 are flakey on Fuchsia |
||||||||
Issue description
[ RUN ] URLRequestQuicTest.CancelPushIfCached2
../../net/url_request/url_request_quic_unittest.cc:393: Failure
Value of: entries[1].GetStringValue("push_url", &value)
Actual: false
Expected: true
../../net/url_request/url_request_quic_unittest.cc:394: Failure
Expected: value
Which is: "https://test.example.com/kitten-1.jpg"
To be equal to: push_url_2
Which is: "https://test.example.com/favicon.ico"
[ FAILED ] URLRequestQuicTest.CancelPushIfCached2 (1900 ms)
https://build.chromium.org/p/chromium.fyi/builders/Fuchsia/builds/10224
Doesn't seem to be frequent.
,
Dec 4 2017
,
Dec 4 2017
Note that issue 787285 was filed for one of these tests also failing on Windows.
,
Dec 4 2017
+zhongyi any idea why this is flaky? Is there some inherent racing, by any chance?
,
Dec 4 2017
,
Dec 4 2017
I don't have a quick answer, we used to have no racing but I wonder whether it could be related to Shivani's recent change to HttpCache as we now allow parallel cache access. +shivanisha, could you take a look?
,
Dec 4 2017
Re #6: Does the parallel cache access mean that we may now get re-ordering of the NetLog'd events? FWIW, I'm adding some more ASSERT/EXPECT clauses to the tests, to get more detail on what is happening - will send a CL shortly. While working on that, I've had several instances of the test binary crashing; I wonder if the failures and crashes may be related, and/or whether these crashes are the same as the ones in issue 787285 .
,
Dec 4 2017
The CancelPushIfCached test assumes a very specific sequence of events arising from the two requests: 1. BEGIN for push of 'kitten-1.jpg'. 2. BEGIN for push of 'favicon.ico'. 3. END for push of 'favicon.ico' (with -400 "not found" error, since it is not cached). 4. END for push of 'kitten-1.jpg' (with no parameters, since it should be found). Are the assumptions in this ordering valid? Is there scope for reordering of (3) and (4), or even for (4) to be logged before (2)? e.g. given that we use separate channels per pushed resource, each with their own flow-control, could a packet drop on one cause the receipt order of the pushed resources to change?
,
Dec 4 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/92be9ad2b47957631b1443b24b28b0f70cf82ef4 commit 92be9ad2b47957631b1443b24b28b0f70cf82ef4 Author: Wez <wez@chromium.org> Date: Mon Dec 04 22:30:33 2017 Add more EXPECTations to CancelPushIfCached tests. These tests flake under Fuchsia, suggesting that the ordering of NetLog events assumed by the tests may not always be valid, e.g. due to concurrency in resource delivery over QUIC. The additional EXPECTations will cause more detail to be reported about what was actually NetLog'd. The tests are also renamed to better express their purpose. Bug: 775122 Change-Id: Ia1c8aa898d27db2faafd2e79d705b724a7704843 Reviewed-on: https://chromium-review.googlesource.com/806760 Reviewed-by: Ryan Hamilton <rch@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#521489} [modify] https://crrev.com/92be9ad2b47957631b1443b24b28b0f70cf82ef4/net/url_request/url_request_quic_unittest.cc
,
Dec 5 2017
I wouldn't be surprised with a race in the logs ordering given that the HttpCache::Transaction state machine may return synchronously/asynchronously for some kind of requests and not for others. There have been major changes in the state machine for the parallel cache writing CLs which may have caused this but I am not sure which exact change. Is there a way to refactor the test such that it tests for correctness but does not depend on the ordering so much?
,
Dec 5 2017
Re #10: If you restructured the NetLog extraction to call-out to a mocked method then you could use gtest's in-built Expectation-ordering constraints to manage the expectations. Constructing the necessary matcher rules would be a little more verbose, but it would avoid rolling-your-own means to express the ordering dependencies.
,
Dec 5 2017
Re #11: There weren't any explicit changes in netlog extraction in the parallel writing CLs though. This looks like the test may be dependent on some ordering by the HttpCache::Transaction layer while that may change anytime and thus it should not be dependent.
,
Dec 5 2017
Re #12: The dependencies I had in mind were that (1) END must come after BEGIN, and match the Source type & id, and that (2) (presumably) the first-requested resource BEGINs before the second-requested one.
,
Dec 5 2017
Re #12: It's fine if the netlog entries comes in different order (we used to assume the HttpCache::Transaction will return results in order, which might no longer be true as HttpCache::Transaction allow parallel access) as long as the net log results are correct.
,
Dec 5 2017
Re #14: Yeah, basically we just want to (1) find the transaction Id from the BEGIN for each URL and (2) verify that the END for each transaction has the result that we expect. Should be straightforward to arrange.
,
Dec 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4a9f9e4add59985b998384634f1b010ea33ec0a2 commit 4a9f9e4add59985b998384634f1b010ea33ec0a2 Author: Wez <wez@chromium.org> Date: Wed Dec 06 21:50:39 2017 Refactor QUIC PUSH tests not to assume exact transaction event ordering. Bug: 775122 Change-Id: I3effce43273adbad46c0e085a01790c75427b489 Reviewed-on: https://chromium-review.googlesource.com/810057 Reviewed-by: Ryan Hamilton <rch@chromium.org> Reviewed-by: Shivani Sharma <shivanisha@chromium.org> Commit-Queue: Wez <wez@chromium.org> Cr-Commit-Position: refs/heads/master@{#522208} [modify] https://crrev.com/4a9f9e4add59985b998384634f1b010ea33ec0a2/net/url_request/url_request_quic_unittest.cc
,
Dec 6 2017
,
Jan 19 2018
Issue 787285 has been merged into this issue.
,
Jan 19 2018
|
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by w...@chromium.org
, Dec 4 2017Owner: w...@chromium.org
Status: Assigned (was: Available)
Summary: URLRequestQuicTest.CancelPushIfCached & CancelPushIfCached2 are flakey on Fuchsia (was: URLRequestQuicTest.CancelPushIfCached2 failed in net_unittests on Fuchsia)