
Issue 837389

Starred by 4 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Mac
Pri: 1
Type: Bug




[Chromium Servicification] Need better handling for when a "core" service process fails to start/initialize.

Project Member Reported by penny...@chromium.org, Apr 26 2018

Issue description

This example focuses on the new Network Process (NP), but the issue will affect any core service that is new or carved out of browser process.

Whenever a core service child process is spawned, on first run or at any later time, how do we want to handle a failed service startup?  (And what is the user experience?)

In the case of the NP (--enable-features=NetworkService), if the child process fails to spawn or to initialize successfully:
1) The browser UI remains visible and open.
2) Because this service is "restartable", an infinite loop of attempted child respawns appears to run under the hood, constantly consuming system resources.
3) The visible browser does not shut down in response to the critical failure. It just sits there with no networking. (Not an ideal user experience, and possibly no appropriate UMA visibility for us in-house?)

Even if the NP weren't "restartable", I think the visible user experience would be the same, just without the hidden infinite loop of failing spawns.

When core functionality like networking was part of the browser process, any failure to initialize would result in a browser shutdown.  I would expect to at least see that here.
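To make the expected behavior concrete, here is a minimal sketch (not Chromium code) of a bounded restart policy: retry with exponential backoff, then escalate to a fatal error instead of respawning forever. The names `start_core_service`, `spawn`, and `CoreServiceStartupError`, and all default values, are hypothetical and chosen only for illustration.

```python
import time
from typing import Callable


class CoreServiceStartupError(RuntimeError):
    """Raised when a required service cannot be started after all retries."""


def start_core_service(
    spawn: Callable[[], bool],
    max_attempts: int = 5,
    initial_delay_s: float = 0.5,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Try to start a service and return the number of attempts used.

    `spawn` is a hypothetical callable that returns True once the child
    process has been spawned and initialized, and False otherwise.
    """
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        if spawn():
            return attempt
        if attempt < max_attempts:
            # Back off before the next attempt so a failing service
            # does not spin and consume system resources.
            sleep(delay)
            delay = min(delay * 2, max_delay_s)
    # All attempts failed: surface a fatal error the caller can turn
    # into a browser shutdown, rather than retrying indefinitely.
    raise CoreServiceStartupError(
        f"service failed to start after {max_attempts} attempts"
    )
```

A caller would catch `CoreServiceStartupError` at the top level and treat it the same way an in-process initialization failure was treated before, for example by shutting down.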

We should have a cohesive strategy and reaction for all "core" services (is that indicated in the manifests?) that Chrome needs to have running.  We definitely need a failure path for the NP before we start experimenting on early channels.
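One possible shape for such a strategy, sketched under the assumption that each service carries a "required" marker. Whether the manifests can express this is exactly the open question above, so `ServicePolicy`, `required`, `FailureAction`, and the service names below are hypothetical and not real manifest keys.

```python
from dataclasses import dataclass
from enum import Enum


class FailureAction(Enum):
    SHUT_DOWN_BROWSER = "shut_down_browser"
    CONTINUE_DEGRADED = "continue_degraded"


@dataclass(frozen=True)
class ServicePolicy:
    name: str
    required: bool  # hypothetical "core service" marker, not a real manifest key


def action_on_startup_failure(policy: ServicePolicy) -> FailureAction:
    """Pick one uniform reaction per service class when startup fails."""
    if policy.required:
        return FailureAction.SHUT_DOWN_BROWSER
    return FailureAction.CONTINUE_DEGRADED


# Illustrative entries only; the real set of core services is undecided here.
POLICIES = {
    p.name: p
    for p in (
        ServicePolicy("network", required=True),
        ServicePolicy("example_optional", required=False),
    )
}
```

The point of the sketch is that the reaction is decided in one place from declared data, so every "core" service fails the same way.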

Note: I think this is a cross-platform issue (though I'm only testing on Windows).
Feel free to adjust all triage labels!  It might be a P1 relative to the NetworkService progress.

Comment 1 Deleted

Comment 2 by dxie@chromium.org, May 15 2018

Owner: jam@chromium.org
Status: Assigned (was: Available)
Giving to jam@ for further investigation and reassignment.
Labels: -Pri-2 Proj-Servicification-Canary Proj-Servicification Pri-1
Updating for dxie@ bug triage request.

I believe this is a blocker for Windows early channels.

Comment 4 by jam@chromium.org, May 30 2018

Labels: -Proj-Servicification-Canary
Owner: ----
Status: Available (was: Assigned)
Since we haven't seen this without sandbox changes, I'm going to remove the canary blocking label for now.
Cc: wfh@chromium.org
Tom and I think this can be a Win Beta blocker.  Apparently there has been a change under a different bug to add a backoff to the infinite respawn loop.

Note: this is not specific to adding a sandbox; it applies to any initialization failure in a "core" or "required" component/service.  Cross-platform.

It would also be good to define how we would even see this happening via UMA/crash.  When testing on Canary, what stability indicators are we going to watch for to know a startup failure is happening?  It currently results in neither a crash or fatal shutdown nor a StartSandboxedProcess histogram (because the sandbox isn't enabled yet).  I'd recommend adding a plan to the Network Service Canary "things to watch for" doc.
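As a rough illustration of the kind of signal that could be watched, the sketch below counts startup outcomes per service in an enumerated bucket and derives a failure rate. It does not use or model the real UMA API; `StartupResult`, `StartupMetrics`, and the bucket names are hypothetical.

```python
from collections import Counter
from enum import Enum


class StartupResult(Enum):
    SUCCESS = 0
    SPAWN_FAILED = 1
    INIT_FAILED = 2


class StartupMetrics:
    """In-memory stand-in for an enumerated startup-result histogram."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, service: str, result: StartupResult) -> None:
        self._counts[(service, result)] += 1

    def failure_rate(self, service: str) -> float:
        """Fraction of recorded startups for `service` that did not succeed."""
        total = sum(n for (s, _), n in self._counts.items() if s == service)
        if total == 0:
            return 0.0
        failures = total - self._counts[(service, StartupResult.SUCCESS)]
        return failures / total
```

Recording every attempt, including successes, is what makes a rate computable; recording only failures would not show how common the problem is.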



Comment 6 by dxie@google.com, May 31 2018

Labels: Hotlist-KnownIssue
Components: -Internals>Services>Network
