hr-time tests are flaky |
||||||
Issue descriptionIn issue 739277 the test hr-time/timeOrigin.html was added to WPT. On #testing jgraham (Mozilla) says: > rbyers, foolip: Tests like https://github.com/w3c/web-platform-tests/pull/6613 suggest that Google's testing infrastructure has better timing consistency than AWS machines Mozilla can use. Asserting that two events happen within 500ms of each other is 100% sure to produce a flaky test. > Which this indeed is. Not falky enough to fail travis, but flaky now we landed it > AM In particular on debug I checked the flakiness dashboard and I see the occasional flake in our infrastructure as well (and I'd expect it to be worse in, eg. ASAN runs): https://test-results.appspot.com/dashboards/flakiness_dashboard.html#showAllRuns=true&testType=webkit_tests&tests=timeOrigin I think we'll need to relax the test not to assume any particular time duration. Eg. it should be possible to break into a debugger at any point and delay execution arbitrarily and still have tests pass. This will probably limit what we can reasonable test about the API, but hopefully we can at least test it's boundaries (greater than one timestamp, less than another) or something? /cc majidvp who has some experience with writing 100% reliable timestamp web-platform-tests.
,
Aug 4 2017
Assuming time durations is not something introduced by this test (example: https://github.com/w3c/web-platform-tests/blob/master/hr-time/window-worker-time-origin.html) and I think it's the best way to test timeOrigin. I think we have to evaluate the numbers and see if it's possible to make it more reliable even assuming slow computations.
,
Aug 4 2017
Is there a specific flaky assert Mozilla is complaining about or is it more than one? I'd think that assert_less_than(workerOrigin - windowOrigin, 500, 'Window and worker timeOrigins should be close.'); should be the flakiest, but there's more than one that would be flaky if we allowed arbitrarily long interruptions.
,
Aug 4 2017
Ah, thanks - I should have checked whether existing tests followed the same pattern. I don't see that test exhibiting flakes in practice on the flakiness dashboard. Obviously we could choose a time that is large enough that it's unlikely to hit in practice even in heavily instrumented builds on slow machines. That's probably the min-bar here. Though in principle, personally I don't think we should depend on any time at all since there's really no value that's 100% correct and the debugging scenario is real and a potential source of pain (eg. makes stepping through code while running the test virtually impossible). I don't know the details of the APIs here, but in general can't we get pretty good coverage just by relying on "must be between" invariants? Eg. the timeOrigin for a worker must be >= than the window's timeOrigin+now before 'new Worker' and <= that on receipt of the message from the worker, right? /cc James for his take. I don't see any guidelines around timing documented in the WPT docs. But this probably falls under the general advice of "test only what is required by the spec". It's perfectly spec conformant for worker creation to take 20 minutes <grin>.
,
Aug 4 2017
Hm yes I think it makes sense to replace some of the arbitrary numbers with performance.now calls. Thanks for the feedback!
,
Aug 4 2017
We have a couple of intermittents open on tests in this directory and a few more that got closed (mostly when we stopped running XP where the problem is worse, I think) [1]. In this specific case we seems to semi-consistently fail the test in debug presumably due to slow execution. [1] https://bugzilla.mozilla.org/buglist.cgi?o5=substring&o13=substring&list_id=13711468&o14=substring&o16=substring&f13=component&o2=substring&j11=OR&o15=substring&v5=intermittent&f12=product&o4=substring&o12=substring&short_desc_type=allwordssubstr&f14=alias&o17=substring&v2=intermittent&v13=hr-time&v16=hr-time&v4=intermittent&f10=OP&f19=CP&v15=hr-time&f1=OP&o3=substring&v6=intermittent&o7=substring&short_desc=hr-time%20intermittent&f0=OP&f8=CP&v3=intermittent&f18=CP&f15=short_desc&o6=substring&v7=intermittent&f9=CP&f4=alias&query_format=advanced&j1=OR&v17=hr-time&f3=component&f2=product&v12=hr-time&f11=OP&f5=short_desc&f17=cf_crash_signature&f6=status_whiteboard&v14=hr-time&f7=cf_crash_signature&f16=status_whiteboard
,
Aug 4 2017
,
Aug 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ee941915b28c1b2be7b1afb5f914eca919788e5f commit ee941915b28c1b2be7b1afb5f914eca919788e5f Author: Nicolas Pena <npm@chromium.org> Date: Tue Aug 08 14:47:21 2017 Fix flakiness in timeOrigin test This CL uses now() calls instead of arbitrary numbers to prevent flakiness in this test when ran on slower machines or when the test is interrupted by debugging. Bug: chromium:752574 Change-Id: I9b3610a44ca5199b8a906dbe9715d280bd3439c5 Reviewed-on: https://chromium-review.googlesource.com/602381 Reviewed-by: Timothy Dresser <tdresser@chromium.org> Reviewed-by: Kentaro Hara <haraken@chromium.org> Commit-Queue: Nicolás Peña <npm@chromium.org> Cr-Commit-Position: refs/heads/master@{#492618} [modify] https://crrev.com/ee941915b28c1b2be7b1afb5f914eca919788e5f/third_party/WebKit/LayoutTests/external/wpt/hr-time/timeOrigin.html
,
Aug 8 2017
This should fix the flakiness, james@ let me know if there are still issues with the test.
,
Aug 9 2017
Oops, failing on Firefox bots. Date.now() is always an integer, whereas the other values are decimal. It seems that on Firefox Date.now() can be rounded up, causing it to be greater than the sum of the two decimals (by a small amount).
,
Aug 11 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ed96173eafe86ff5832a30431e0d9853a7a2cd2e commit ed96173eafe86ff5832a30431e0d9853a7a2cd2e Author: Nicolas Pena <npm@chromium.org> Date: Fri Aug 11 02:15:43 2017 Allow Date.now() rounding errors in timeOrigin test Date.now() is an integer whereas other values are decimal values, so if they are too close, the rounding of Date.now() could cause asserts to fail. Thus, add a + 1 to account for these rounding issues. Bug: chromium:752574 Change-Id: Iba8d3c8c6889a13fddb7ae779c6029d7664cb039 Reviewed-on: https://chromium-review.googlesource.com/608629 Reviewed-by: Kentaro Hara <haraken@chromium.org> Commit-Queue: Nicolás Peña <npm@chromium.org> Cr-Commit-Position: refs/heads/master@{#493647} [modify] https://crrev.com/ed96173eafe86ff5832a30431e0d9853a7a2cd2e/third_party/WebKit/LayoutTests/external/wpt/hr-time/timeOrigin.html
,
Aug 11 2017
|
||||||
►
Sign in to add a comment |
||||||
Comment 1 by rbyers@chromium.org
, Aug 4 2017