New issue
Advanced search Search tips

Issue 848256 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 11
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

Terminate shared workers when the network service crashes

Project Member Reported by falken@chromium.org, May 31 2018

Issue description

Support auto-reconnect for the networkservice-backed factory sent by SharedWorkerHost.

SharedWorkerHost should observe disconnection from the the network service and auto-reconnect, like RenderFrameHostImpl does.
 

Comment 1 by mmenke@chromium.org, May 31 2018

Components: Blink>ServiceWorker Internals>Services>Network
Labels: Proj-Servicification

Comment 2 by falken@chromium.org, May 31 2018

Description: Show this description

Comment 3 by falken@chromium.org, May 31 2018

Components: -Blink>ServiceWorker Blink>Workers
Summary: Support auto-reconnect for the networkservice-backed factory sent by SharedWorkerHost (was: Support auto-reconnect for the networkservice-backed factory sent by ServiceWorkerHost)
Corrected my original description: ServiceWorker -> SharedWorker
Status: Started (was: Assigned)
Blocking: -839982
Status: Assigned (was: Started)
On second thought I suspect this doesn't block Canary and can be done later. I'll see if there are Canary blockers to take...

Comment 6 by dxie@google.com, Jun 7 2018

Labels: Hotlist-KnownIssue
Labels: WorkerBacklog
Labels: -Pri-3 M-71 Pri-2
I misunderstood the release plan and didn't realize out-of-process NetworkService is shipping already.  So this is higher priority.


See also  issue 884007  for service workers.
Labels: -Pri-2 Pri-1
Network service crashes are also rare; so in this case would it be simpler to just shut down the shared workers?
Hmm that could be a good idea and would work even better for service workers which are designed to be ephemeral.

Stopping the shared workers could be surprising and require reload to fix (so new SharedWorker() is called again), whereas stopping service workers already happens in normal operation and doesn't require reload.
Project Member

Comment 12 by bugdroid1@chromium.org, Sep 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2075b0d44c960dd5ed208e710fd79eb8ab35d771

commit 2075b0d44c960dd5ed208e710fd79eb8ab35d771
Author: Makoto Shimazu <shimazu@chromium.org>
Date: Fri Sep 21 09:04:22 2018

Add tests for fetch() in a service/shared worker after NetworkService crashes

Bug:  884007 ,  848256 
Change-Id: I3dbeab15a52fc296058b1962364edb19c95d74a0
Reviewed-on: https://chromium-review.googlesource.com/1235261
Commit-Queue: Makoto Shimazu <shimazu@chromium.org>
Reviewed-by: Matt Falkenhagen <falken@chromium.org>
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#593129}
[modify] https://crrev.com/2075b0d44c960dd5ed208e710fd79eb8ab35d771/content/browser/network_service_restart_browsertest.cc

The fix for this will be the same as  issue 884007 : when the network service crashes, just shut down the SharedWorker, which would be analogous to the renderer process crashing.
Project Member

Comment 14 by bugdroid1@chromium.org, Oct 5

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/24c77b16ea86fa02df8b8b0eb45a120cf02dc879

commit 24c77b16ea86fa02df8b8b0eb45a120cf02dc879
Author: Matt Falkenhagen <falken@chromium.org>
Date: Fri Oct 05 05:17:48 2018

shared worker: Self-terminate when network service crashes.

This treats the network service crashing like the worker's process
crashing, and prevents clients getting stuck with a broken worker.

Bug:  848256 
Change-Id: Iab8864d673da04cea062f65644e2ee10f74de4da
Reviewed-on: https://chromium-review.googlesource.com/c/1260903
Reviewed-by: Hiroki Nakagawa <nhiroki@chromium.org>
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Commit-Queue: Matt Falkenhagen <falken@chromium.org>
Cr-Commit-Position: refs/heads/master@{#596999}
[modify] https://crrev.com/24c77b16ea86fa02df8b8b0eb45a120cf02dc879/content/browser/network_service_restart_browsertest.cc
[modify] https://crrev.com/24c77b16ea86fa02df8b8b0eb45a120cf02dc879/content/browser/shared_worker/shared_worker_host.cc
[modify] https://crrev.com/24c77b16ea86fa02df8b8b0eb45a120cf02dc879/content/browser/shared_worker/shared_worker_service_impl.cc
[modify] https://crrev.com/24c77b16ea86fa02df8b8b0eb45a120cf02dc879/content/browser/shared_worker/shared_worker_service_impl.h
[modify] https://crrev.com/24c77b16ea86fa02df8b8b0eb45a120cf02dc879/content/renderer/shared_worker/embedded_shared_worker_stub.cc
[modify] https://crrev.com/24c77b16ea86fa02df8b8b0eb45a120cf02dc879/content/renderer/shared_worker/embedded_shared_worker_stub.h

Status: Fixed (was: Assigned)
Summary: Terminate shared workers when the network service crashes (was: Support auto-reconnect for the networkservice-backed factory sent by SharedWorkerHost)
Project Member

Comment 16 by bugdroid1@chromium.org, Oct 5

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c0f7c620bf6690d257bca4e094e81fe7096f1a83

commit c0f7c620bf6690d257bca4e094e81fe7096f1a83
Author: Findit <findit-for-me@appspot.gserviceaccount.com>
Date: Fri Oct 05 07:14:23 2018

Revert "shared worker: Self-terminate when network service crashes."

This reverts commit 24c77b16ea86fa02df8b8b0eb45a120cf02dc879.

Reason for revert:

Findit (https://goo.gl/kROfz5) identified CL at revision 596999 as the
culprit for flakes in the build cycles as shown on:
https://findit-for-me.appspot.com/waterfall/flake/flake-culprit?key=ag9zfmZpbmRpdC1mb3ItbWVyQwsSDEZsYWtlQ3VscHJpdCIxY2hyb21pdW0vMjRjNzdiMTZlYTg2ZmEwMmRmOGI4YjBlYjQ1YTEyMGNmMDJkYzg3OQw

Sample Failed Build: https://ci.chromium.org/buildbot/chromium.chromiumos/linux-chromeos-rel/14262

Sample Failed Step: single_process_mash_content_browsertests

Sample Flaky Test: NetworkServiceRestartBrowserTest.SharedWorker

Original change's description:
> shared worker: Self-terminate when network service crashes.
> 
> This treats the network service crashing like the worker's process
> crashing, and prevents clients getting stuck with a broken worker.
> 
> Bug:  848256 
> Change-Id: Iab8864d673da04cea062f65644e2ee10f74de4da
> Reviewed-on: https://chromium-review.googlesource.com/c/1260903
> Reviewed-by: Hiroki Nakagawa <nhiroki@chromium.org>
> Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
> Commit-Queue: Matt Falkenhagen <falken@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#596999}

Change-Id: I2bf5229056da67f539f4f89ed7fc92578edac567
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  848256 ,  892535 
Reviewed-on: https://chromium-review.googlesource.com/c/1263885
Cr-Commit-Position: refs/heads/master@{#597020}
[modify] https://crrev.com/c0f7c620bf6690d257bca4e094e81fe7096f1a83/content/browser/network_service_restart_browsertest.cc
[modify] https://crrev.com/c0f7c620bf6690d257bca4e094e81fe7096f1a83/content/browser/shared_worker/shared_worker_host.cc
[modify] https://crrev.com/c0f7c620bf6690d257bca4e094e81fe7096f1a83/content/browser/shared_worker/shared_worker_service_impl.cc
[modify] https://crrev.com/c0f7c620bf6690d257bca4e094e81fe7096f1a83/content/browser/shared_worker/shared_worker_service_impl.h
[modify] https://crrev.com/c0f7c620bf6690d257bca4e094e81fe7096f1a83/content/renderer/shared_worker/embedded_shared_worker_stub.cc
[modify] https://crrev.com/c0f7c620bf6690d257bca4e094e81fe7096f1a83/content/renderer/shared_worker/embedded_shared_worker_stub.h

Cause of the revert above: 

../../content/browser/network_service_restart_browsertest.cc:828: Failure
Expected equality of these values:
  1u
    Which is: 1
  service->worker_hosts_.size()
    Which is: 0
Stack trace:
#0 0x00000142343c testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop()
#1 0x000001422e09 testing::internal::AssertHelper::operator=()
#2 0x00000092d517 content::NetworkServiceRestartBrowserTest_SharedWorker_Test::RunTestOnMainThread()
#3 0x0000025a81cb content::BrowserTestBase::ProxyRunTestOnMainThreadLoop()
#4 0x0000026149d8 content::ShellBrowserMainParts::PreMainMessageLoopRun()
#5 0x000001fcd8e1 content::BrowserMainLoop::PreMainMessageLoopRun()
#6 0x000002357317 content::StartupTaskRunner::RunAllTasksNow()
#7 0x000001fcc25d content::BrowserMainLoop::CreateStartupTasks()
#8 0x000001fcfd73 content::BrowserMainRunnerImpl::Initialize()
#9 0x0000025fb606 ShellBrowserMain()
#10 0x0000025f1c92 content::ShellMainDelegate::RunProcess()
#11 0x000001efb48b content::ContentMainRunnerImpl::Run()
#12 0x000003c6c180 service_manager::Main()
#13 0x00000166bbd4 content::ContentMain()
#14 0x0000025a7df3 content::BrowserTestBase::SetUp()
Status: Assigned (was: Fixed)
Also a likely cause for the failures on Windows bots: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20Tests%20x64/28211
Status: Started (was: Assigned)
 Issue 892535  has been merged into this issue.
Thanks for reverting msramek. FYI I'm relanding with a fix.
Project Member

Comment 22 by bugdroid1@chromium.org, Oct 5

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e672196491ded5b1484af692576c9e26d21f4af9

commit e672196491ded5b1484af692576c9e26d21f4af9
Author: Matt Falkenhagen <falken@chromium.org>
Date: Fri Oct 05 10:17:05 2018

Reland: shared worker: Self-terminate when network service crashes.

This treats the network service crashing like the worker's process
crashing, and prevents clients getting stuck with a broken worker.

Originally landed at: https://chromium-review.googlesource.com/c/1260903
It was reverted because the test flaked because the host was sometimes
not yet created. The diff in the reland is to wait for the host
to be created.

Bug:  848256 
Change-Id: Ic656f0bf5a320526cf6f7715399f8f71db05619b
Reviewed-on: https://chromium-review.googlesource.com/c/1264139
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Commit-Queue: Matt Falkenhagen <falken@chromium.org>
Cr-Commit-Position: refs/heads/master@{#597063}
[modify] https://crrev.com/e672196491ded5b1484af692576c9e26d21f4af9/content/browser/network_service_restart_browsertest.cc
[modify] https://crrev.com/e672196491ded5b1484af692576c9e26d21f4af9/content/browser/shared_worker/shared_worker_host.cc
[modify] https://crrev.com/e672196491ded5b1484af692576c9e26d21f4af9/content/browser/shared_worker/shared_worker_service_impl.cc
[modify] https://crrev.com/e672196491ded5b1484af692576c9e26d21f4af9/content/browser/shared_worker/shared_worker_service_impl.h
[modify] https://crrev.com/e672196491ded5b1484af692576c9e26d21f4af9/content/renderer/shared_worker/embedded_shared_worker_stub.cc
[modify] https://crrev.com/e672196491ded5b1484af692576c9e26d21f4af9/content/renderer/shared_worker/embedded_shared_worker_stub.h

Can this be marked fixed now?
Status: Fixed (was: Started)
Yep

Sign in to add a comment