Layout Test external/wpt/fetch/api/abort/serviceworker-intercepted.https.html is flaky |
|||||||||||
Issue descriptionThe following layout test is flaky on all platforms external/wpt/fetch/api/abort/serviceworker-intercepted.https.html virtual/outofblink-cors/external/wpt/fetch/api/abort/serviceworker-intercepted.https.html Probable cause: I have no idea.
,
Jun 27 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/215330d2d03c22d971a0caedfe4a1a9b4ecd8d15 commit 215330d2d03c22d971a0caedfe4a1a9b4ecd8d15 Author: Greg Thompson <grt@chromium.org> Date: Wed Jun 27 10:33:24 2018 Mark serviceworker-intercepted.https.html tests as flaky. BUG= 856969 TBR=grt@chromium.org NOTRY=true Change-Id: I1864afbdfa1a71f4d6acdcba9ca9873659cc5ddb Reviewed-on: https://chromium-review.googlesource.com/1116779 Reviewed-by: Greg Thompson <grt@chromium.org> Commit-Queue: Greg Thompson <grt@chromium.org> Cr-Commit-Position: refs/heads/master@{#570724} [modify] https://crrev.com/215330d2d03c22d971a0caedfe4a1a9b4ecd8d15/third_party/WebKit/LayoutTests/TestExpectations
,
Jun 28 2018
Started looking at this. It's only flaky on Debug. The flakes started around June 24 around r569915. The only service worker or fetch related change I spot around that time is Leon's https://chromium-review.googlesource.com/1112904 which seems unlikely here.
,
Jun 28 2018
A local run on a Debug build can sometimes hit the timeout, when the abort/ directory is run together (not sure if it's necessary). $ third_party/WebKit/Tools/Scripts/run-webkit-tests --target=Debag --iterations=100 external/wpt/fetch/api/abort/ [56/600] external/wpt/fetch/api/abort/serviceworker-intercepted.https.html failed unexpectedly (test timed out) [68/600] external/wpt/fetch/api/abort/serviceworker-intercepted.https.html failed unexpectedly (test timed out) [74/600] external/wpt/fetch/api/abort/serviceworker-intercepted.https.html failed unexpectedly (test timed out)
,
Jun 28 2018
OK I can get timeouts just running this one test: $ third_party/WebKit/Tools/Scripts/run-webkit-tests --target=Debag --iterations=100 external/wpt/fetch/api/abort/serviceworker-intercepted.https.html [4/100] external/wpt/fetch/api/abort/serviceworker-intercepted.https.html failed unexpectedly (test timed out) [14/100] external/wpt/fetch/api/abort/serviceworker-intercepted.https.html failed unexpectedly (test timed out)
,
Jun 28 2018
Hum, this does seem caused by https://chromium-review.googlesource.com/1112904 after all. When that patch is reverted there are no timeouts. Leon, can you take a look? I guess we're getting OnDetached() called in some cases where we still need to listen for OnScriptLoad() or else the job gets stucked?
,
Jun 28 2018
It's possibly not P1 since it only flakes on Debug, but it's a bit scary. If the job queue gets stuck only a browser reboot can fix that.
,
Jun 28 2018
,
Jun 28 2018
,
Jun 28 2018
c#6 is right, we got OnDetached() due to the connection error handler of "mojom::EmbeddedWorkerInstanceClientPtr client_;", seems after that SWRegisterJob still needs to wait for OnScriptLoad() to continue. One thing is confusing to me: as OnScriptLoaded()'s owner interface EmbeddedWorkerInstanceHost is associated on the message pipe of mojom::EmbeddedWorkerInstanceClient, if the message pipe got broken, EmbeddedWorkerInstanceHost::OnScriptLoaded() should also have no chance to be dispatched after that.
,
Jun 29 2018
Could the connection possibly be re-established later? not sure. Or client_ was destructed, but renderer still has a pointer to the host and can send messages to it?
,
Jun 29 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1c81791b45c09b52614f85a7016425b71da32c4d commit 1c81791b45c09b52614f85a7016425b71da32c4d Author: Han Leon <leon.han@intel.com> Date: Fri Jun 29 07:11:10 2018 [ServiceWorker] Stop observing EmbeddedWorkerInstance once it's gone away This CL adds a function OnDestroyed() into EmbeddedWorkerInstance::Listener interface, so that ServiceWorkerRegisterJob can stop observing EmbeddedWorkerInstance once it's gone away. BUG= 856969 , 856330 Change-Id: I3acc9143f9ea2f8f24ab0aac465f29f97fdf9906 Reviewed-on: https://chromium-review.googlesource.com/1118006 Commit-Queue: Han Leon <leon.han@intel.com> Reviewed-by: Matt Falkenhagen <falken@chromium.org> Cr-Commit-Position: refs/heads/master@{#571417} [modify] https://crrev.com/1c81791b45c09b52614f85a7016425b71da32c4d/content/browser/service_worker/embedded_worker_instance.cc [modify] https://crrev.com/1c81791b45c09b52614f85a7016425b71da32c4d/content/browser/service_worker/embedded_worker_instance.h [modify] https://crrev.com/1c81791b45c09b52614f85a7016425b71da32c4d/content/browser/service_worker/service_worker_register_job.cc [modify] https://crrev.com/1c81791b45c09b52614f85a7016425b71da32c4d/content/browser/service_worker/service_worker_register_job.h [modify] https://crrev.com/1c81791b45c09b52614f85a7016425b71da32c4d/third_party/WebKit/LayoutTests/TestExpectations
,
Jul 3
From the dashboard https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=webkit_tests&tests=external%2Fwpt%2Fservice-workers%2Fservice-worker%2Fabout-blank-replacement.https.html%2Cexternal%2Fwpt%2Ffetch%2Fapi%2Fabort%2Fserviceworker-intercepted.https.html, we can see the flake has disappeared. Although M68 has no such a flake, we know the flake will also appear there once we merge the fix for issue 854063 to M68, so, here I want to request to merge the commit at c#12 to M68, too, together with that CL from issue 854063, just as https://bugs.chromium.org/p/chromium/issues/detail?id=854063#c13 said. Thanks.
,
Jul 3
This bug requires manual review: M68 has already been promoted to the beta branch, so this requires manual review Please contact the milestone owner if you have questions. Owners: cmasso@(Android), kariahda@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 3
How safe is this merge overall? How common is this scenario where this crash can be triggered? Why should we merge this if we're not seeing in M68?
,
Jul 4
This commit has a small size and I'm confident it is safe enough. Actually this issue is a regression caused by the commit 26f26cd82a8159dff52fdef10b0a7ebb84c48040, which fixed issue 854063. Because that commit has not been merged into M68, we're not seeing this issue's problem in M68, but M68 does have the problem of issue 854063 now, so I think we should merge the 2 commits together into M68, then everything would be OK there. Thanks.
,
Jul 6
Ok thanks. Approving merge for M68, branch:3440. Can you please ensure that issue 854063 is also merged. It has been approved.
,
Jul 7
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/baf2459c1cc305b127d41424a86d948e731ee233 commit baf2459c1cc305b127d41424a86d948e731ee233 Author: Han Leon <leon.han@intel.com> Date: Sat Jul 07 06:20:28 2018 [M68] [ServiceWorker] Stop observing EmbeddedWorkerInstance once it's gone away This CL is a combination of the 2 previous commits on trunk, both of them aim to avoid use-after-free behavior on the raw pointer of an EmbeddedWorkerInstance. After the 1st CL landed on trunk, it turns out that it did not fix the problem completely and caused regressions like issue 856969 , then we have the 2nd CL give a complete solution. The 1st one: " [ServiceWorker] Avoid using stale pointer of EmbeddedWorkerInstance We're saving the raw pointer of an EmbeddedWorkerInstance in a base::ScopedObserver member of ServiceWorkerRegisterJob, but, the EmbeddedWorkerInstance is not guaranteed to outlive the ServiceWorkerRegisterJob, it may get destroyed and lead to a situation that ServiceWorkerRegisterJob holds a stale pointer, which could lead to crash. This CL lets ServiceWorkerRegisterJob listen OnDetached() from the EmbeddedWorkerInstance so it can be aware that the EmbeddedWorkerInstanc pointer is going to be stale and should be removed from the source list of base::ScopedObserver. This fix should be safe enough now, but in longer term we'll try to remove EmbeddedWorkerInstance::Listener interface and let EmbeddedWorkerInstance talk directly to its owner ServiceWorkerVersion, then let ServiceWorkerRegisterJob observe the ServiceWorkerVersion it's manipulating to get OnScriptLoaded() notification. BUG=854063,855394 Change-Id: I9c5a46beda1aafff32a86b9055be2b53d50fda97 Reviewed-on: https://chromium-review.googlesource.com/1112904 Commit-Queue: Han Leon <leon.han@intel.com> Reviewed-by: Matt Falkenhagen <falken@chromium.org> Cr-Commit-Position: refs/heads/master@{#569906} (cherry picked from commit 26f26cd82a8159dff52fdef10b0a7ebb84c48040) " The 2nd one: " [ServiceWorker] Stop observing EmbeddedWorkerInstance once it's gone away This CL adds a function OnDestroyed() into EmbeddedWorkerInstance::Listener interface, so that ServiceWorkerRegisterJob can stop observing EmbeddedWorkerInstance once it's gone away. BUG= 856969 , 856330 Change-Id: I3acc9143f9ea2f8f24ab0aac465f29f97fdf9906 Reviewed-on: https://chromium-review.googlesource.com/1118006 Commit-Queue: Han Leon <leon.han@intel.com> Reviewed-by: Matt Falkenhagen <falken@chromium.org> Cr-Commit-Position: refs/heads/master@{#571417} (cherry picked from commit 1c81791b45c09b52614f85a7016425b71da32c4d) " BUG=854063, 856969 Change-Id: I91c68d8edba41f6f7725903288fd12ec7574c538 Reviewed-on: https://chromium-review.googlesource.com/1128469 Reviewed-by: Matt Falkenhagen <falken@chromium.org> Cr-Commit-Position: refs/branch-heads/3440@{#608} Cr-Branched-From: 010ddcfda246975d194964ccf20038ebbdec6084-refs/heads/master@{#561733} [modify] https://crrev.com/baf2459c1cc305b127d41424a86d948e731ee233/content/browser/service_worker/embedded_worker_instance.cc [modify] https://crrev.com/baf2459c1cc305b127d41424a86d948e731ee233/content/browser/service_worker/embedded_worker_instance.h [modify] https://crrev.com/baf2459c1cc305b127d41424a86d948e731ee233/content/browser/service_worker/service_worker_register_job.cc [modify] https://crrev.com/baf2459c1cc305b127d41424a86d948e731ee233/content/browser/service_worker/service_worker_register_job.h
,
Jul 9
|
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by grt@chromium.org
, Jun 27 2018