Layout Test external/wpt/webrtc/RTCPeerConnection-track-stats.https.html is flaky |
||||||||||
Issue descriptionThe following layout test is flaky on Win7 Tests (dbg)(1). external/wpt/webrtc/RTCPeerConnection-track-stats.https.html https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=webkit_layout_tests&tests=external%2Fwpt%2Fwebrtc%2FRTCPeerConnection-track-stats.https.html It appears to have been flaky on and off for some time. Assigning hbos@ based on recent changes to the file. If flake continues, I will disable the test.
,
Apr 6 2018
Here's the fix: https://chromium-review.googlesource.com/c/chromium/src/+/999477
,
Apr 6 2018
The flake is very rare btw so no need to disable the test in the meantime.
,
Apr 6 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/15d774ed7528de0846c95efb2273b5e5b8744ab9 commit 15d774ed7528de0846c95efb2273b5e5b8744ab9 Author: Henrik Boström <hbos@chromium.org> Date: Fri Apr 06 09:14:10 2018 Fix RTCPeerConnection-track-stats.https.html flake. The assert_object_equals would fail due to timestamps being different in rare runs where the getStats() cache was invalidated between the two calls. external/wpt/webrtc should make no assumptions about any cache. Tests are updated only to compare stats IDs, not individual metrics. Bug: 829401 Change-Id: I6fcc781a1827af69f47a5c4416dd5e6bae0c98b3 Reviewed-on: https://chromium-review.googlesource.com/999477 Reviewed-by: Harald Alvestrand <hta@chromium.org> Commit-Queue: Henrik Boström <hbos@chromium.org> Cr-Commit-Position: refs/heads/master@{#548721} [modify] https://crrev.com/15d774ed7528de0846c95efb2273b5e5b8744ab9/third_party/WebKit/LayoutTests/external/wpt/webrtc/RTCPeerConnection-track-stats.https.html
,
Apr 6 2018
,
Apr 6 2018
Based on https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win7%20Tests%20(dbg)(1)/68313 and https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win7%20Tests%20%28dbg%29%281%29/68314 , which both happened after 15d774ed, this is still timing out on Win7. Please take a look hbos :)
,
Apr 9 2018
,
Apr 9 2018
,
Apr 9 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/492e71000ed76bf3423957e15475198fcd0cfab9 commit 492e71000ed76bf3423957e15475198fcd0cfab9 Author: Vasilii Sukhanov <vasilii@chromium.org> Date: Mon Apr 09 14:20:16 2018 Mark RTCPeerConnection-track-stats.https.html as timing out on Win. TBR=hbos@chromium.org Bug: 829401 Change-Id: I112346b87535c6f78303da8e7b7ece905f4d6b0f Reviewed-on: https://chromium-review.googlesource.com/1001900 Reviewed-by: Vasilii Sukhanov <vasilii@chromium.org> Commit-Queue: Vasilii Sukhanov <vasilii@chromium.org> Cr-Commit-Position: refs/heads/master@{#549169} [modify] https://crrev.com/492e71000ed76bf3423957e15475198fcd0cfab9/third_party/WebKit/LayoutTests/TestExpectations
,
Apr 9 2018
It timing out is a different, unknown, issue than the flake I fixed (where it would fail due to finishing the test with a different result than -expected.txt). The logs give no hint about what is timing out.
,
Apr 10 2018
It also times out on Mac https://uberchromegw.corp.google.com/i/chromium.mac/builders/Mac10.11%20Tests/builds/25172
,
Apr 10 2018
deadbeef@, hta@: I don't know why this would time out, I really don't think the getStats() with selector API is timing out since it is just filtering existing results in a well-tested manor, but I still suspect the getStats() with selector tests are the ones that cause it to time out since these were recently added. https://cs.chromium.org/chromium/src/third_party/WebKit/LayoutTests/external/wpt/webrtc/RTCPeerConnection-track-stats.https.html?l=425 Other than using the new APIs, the difference between these tests and other tests is the use of: await doSignalingHandshake(caller, callee); await onIceConnectionStateCompleted(caller); Could you think of any reason why this might hang? Is it possible for pc.iceConnectionState to never be completed? Maybe it goes from 'connected' back to 'checking' or something? // Returns a promise that is resolved when pc.iceConnectionState changes to // 'completed'. This is when transport stats can be expected to have its // selectedCandidatePairId defined. async function onIceConnectionStateCompleted(pc) { let resolver = new Resolver(); pc.oniceconnectionstatechange = e => { if (pc.iceConnectionState == 'completed') resolver.resolve(); }; return resolver.promise; }
,
Apr 10 2018
I could speculatively change this to
if (pc.iceConnectionState == 'connected' ||
pc.iceConnectionState == 'completed')
and see if the flake goes away.
,
Apr 10 2018
,
Apr 10 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/55f4beb9157e45fa1ffe3ebfa7176b6cafbc33b6 commit 55f4beb9157e45fa1ffe3ebfa7176b6cafbc33b6 Author: Vasilii Sukhanov <vasilii@chromium.org> Date: Tue Apr 10 11:27:21 2018 Mark RTCPeerConnection-track-stats.https.html as timing out on all the platforms TBR=hbos@chromium.org Bug: 829401 Change-Id: I98b7d3149599e876d673a84718215521ddf6cfac Reviewed-on: https://chromium-review.googlesource.com/1004995 Reviewed-by: Vasilii Sukhanov <vasilii@chromium.org> Commit-Queue: Vasilii Sukhanov <vasilii@chromium.org> Cr-Commit-Position: refs/heads/master@{#549485} [modify] https://crrev.com/55f4beb9157e45fa1ffe3ebfa7176b6cafbc33b6/third_party/WebKit/LayoutTests/TestExpectations
,
Apr 10 2018
That's not it. It's very difficult to reproduce, I haven't been able to in content_shell but I have sometimes been able to with run-webkit-tests --iterations=50. It either finishes in ~1 second or it times out after ~18s.
,
Apr 10 2018
there used to be a different bug where the PC would never get from connected to completed state. Perhaps it still happens some part of the time? Idea: console.log the state transitions - see if there's a pattern before going into timeout.
,
Apr 10 2018
I don't know how to console.log because as far as I know that only shows up in content_shell, and I can only repro this with the run-webkit-tests script that eats up all the output. I've tried to trial-and-error only run a subset of the tests in the file by commenting out the other tests but I haven't pinpointed what is timing out. Now the script is misbehaving, I'll try rebooting
,
Apr 10 2018
if (pc.iceConnectionState == 'connected' ||
pc.iceConnectionState == 'completed')
Did not help.
,
Apr 10 2018
The fact that it only reproduces on multiple iterations (run-webkit-tests) but not when repeating just as many iterations a content_shell run, I wonder if this is tied to resources consumed by peer connections that are not garbage collected between runs.
,
Apr 10 2018
I'd advise against relying on the "completed" state, generally. It's sometimes gated by connectivity checks timing out, and we have pretty generous timeouts. I wouldn't think this would happen with loopback connections but you never know. Something I did notice in the log: 05:56:56.400 7852 [8852:9416:0406/055558.194:WARNING:peerconnection.cc(5360)] audio is not ready to use the remote candidate because the local or remote description is not set. I'm not sure if this is actually an issue, since the candidate should still be used after the local description is set... But you could comment out this condition and see if it's reproducible: https://cs.chromium.org/chromium/src/third_party/webrtc/pc/jseptransport2.cc?l=256&rcl=4da18e89bdee78df4478b66cdd0e6f6a38d61b4d Maybe there's a race condition, and "Promise.all" allows a candidate to sneak in between "localPc.setLocalDescription(offer)" and "remotePc.setRemoteDescription(offer)"? Doesn't seem like that should be possible though.
,
Apr 11 2018
Promise.all deliberately doesn't promise anything about sequence of execution, so it's entirely reasonable that sometimes localPc.setLocalDescription(offer) will be executed (and possibly cause candidates to be emitted) before remotePc.setRemoteDescription(offer) is completed. As deadbeef says, this should be OK in principle, early candidates should be saved, but I don't know that we have tests that verify that connections get set up correctly if all the candidates are early.
,
Apr 12 2018
In this case it's not that it is rejected but that it is timing out. I wish I knew how to get logs while running with the script, or that I could repro when running it in content_shell directly... While Promise.all does not guarantee order, the functions that are producing the promises are called in a specific order, which makes this non-racy in Chrome which queues operations on the webrtc signaling thread. WPT should not make any assumptions about what is executed in what order though. But whether localPc or remotePc is executed first does not matter - it does wait for promises to resolve before creating the answer.
,
Apr 12 2018
onIceConnectionStateCompleted is called last which wires up and listens for oniceconnectionstatechange... https://cs.chromium.org/chromium/src/third_party/WebKit/LayoutTests/external/wpt/webrtc/RTCPeerConnection-track-stats.https.html?sq=package:chromium&dr&l=575 Maybe the race is that if we're "lucky" the state has ALREADY reached 'completed' and as such there are no more events that will fire? I'll try...
,
Apr 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/84f1aaa8e300f56b37d84827443954696f12f4b7 commit 84f1aaa8e300f56b37d84827443954696f12f4b7 Author: Henrik Boström <hbos@chromium.org> Date: Fri Apr 13 10:48:23 2018 Fix race in track-stats.https.html test. This resolves the onIceConnectionStateComplete probmise when iceConnectionState reaches either 'connected' or 'completed' and fixes the race if it had already reached these states before the function was called. This fix is speculative. I am no longer able to repro the TIMEOUT locally with or without the fix. Bug: 829401 Change-Id: I1dec90250d640d93498c59a932ab5e84a3b96f15 Reviewed-on: https://chromium-review.googlesource.com/1012029 Reviewed-by: Harald Alvestrand <hta@chromium.org> Commit-Queue: Henrik Boström <hbos@chromium.org> Cr-Commit-Position: refs/heads/master@{#550580} [modify] https://crrev.com/84f1aaa8e300f56b37d84827443954696f12f4b7/third_party/WebKit/LayoutTests/TestExpectations [modify] https://crrev.com/84f1aaa8e300f56b37d84827443954696f12f4b7/third_party/WebKit/LayoutTests/external/wpt/webrtc/RTCPeerConnection-track-stats.https.html
,
Apr 13 2018
This is not verified, but marking as Fixed. If it keeps flaking I'm sure this or another bug will be reopened.
,
Apr 13 2018
hbos@ - thank you so much for your attempt to get this flake squashed forever! It is much appreciated :).
,
Apr 17 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/84f1aaa8e300f56b37d84827443954696f12f4b7 commit 84f1aaa8e300f56b37d84827443954696f12f4b7 Author: Henrik Boström <hbos@chromium.org> Date: Fri Apr 13 10:48:23 2018 Fix race in track-stats.https.html test. This resolves the onIceConnectionStateComplete probmise when iceConnectionState reaches either 'connected' or 'completed' and fixes the race if it had already reached these states before the function was called. This fix is speculative. I am no longer able to repro the TIMEOUT locally with or without the fix. Bug: 829401 Change-Id: I1dec90250d640d93498c59a932ab5e84a3b96f15 Reviewed-on: https://chromium-review.googlesource.com/1012029 Reviewed-by: Harald Alvestrand <hta@chromium.org> Commit-Queue: Henrik Boström <hbos@chromium.org> Cr-Commit-Position: refs/heads/master@{#550580} [modify] https://crrev.com/84f1aaa8e300f56b37d84827443954696f12f4b7/third_party/WebKit/LayoutTests/TestExpectations [modify] https://crrev.com/84f1aaa8e300f56b37d84827443954696f12f4b7/third_party/WebKit/LayoutTests/external/wpt/webrtc/RTCPeerConnection-track-stats.https.html
,
Jun 7 2018
[bulk-edit: disregard if N/A] Can the owner please set milestone to this bug if applicable?
,
Jun 7 2018
|
||||||||||
►
Sign in to add a comment |
||||||||||
Comment 1 by hbos@chromium.org
, Apr 6 2018