New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 728030 link

Starred by 7 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , Mac
Pri: 1
Type: Bug



Sign in to add a comment

ServiceWorker.FetchEvent.MainResource.Status ERROR_START_WORKER_FAILED with PlzNavigate

Project Member Reported by horo@chromium.org, May 31 2017

Issue description

The ratio of ERROR_START_WORKER_FAILED has increased from Win/Mac Canary from 60.0.3110.0.

Win: https://uma.googleplex.com/timeline_v2?sid=7031075b44cb2059537a237b9ebdd90d
Mac: https://uma.googleplex.com/timeline_v2?sid=ba26407cc67bfea58b9c56d382cfe876

 

Comment 1 by horo@chromium.org, Jun 2 2017

Status: Started (was: Assigned)
Now we can see the same spike in Dev which are recently updated from 60.0.3107.4 to 60.0.3112.7.

Win Dev: https://uma.googleplex.com/timeline_v2?sid=529e8c6fb32cb314d4a87977a519a2e7
Mac Dev: https://uma.googleplex.com/timeline_v2?sid=71b02e978a8a7e8c8313810872806cea


But according to the result of Win Canary, the error rate looks improved from 61.0.3116.0.

Win Canary: https://uma.googleplex.com/timeline_v2?sid=ae0d1fcb86e2f9e867271b668590c0c8
Mac Canary: https://uma.googleplex.com/timeline_v2?sid=ddd3f9c608e5459ddc466f6ab6b4e194

Comment 2 by horo@chromium.org, Jun 2 2017

Summary: ServiceWorker.FetchEvent.MainResource.Status ERROR_START_WORKER_FAILED is increasing on Win/Mac (was: ServiceWorker.StartWorker.Status ERROR_START_WORKER_FAILED is increasing on Win/Mac)

Comment 3 by horo@chromium.org, Jun 2 2017

Labels: Proj-PlzNavigate
Owner: clamy@chromium.org
Status: Assigned (was: Started)
I think this issue is caused by the experiment of PlzNavigate.

The negative impact of PlzNavigate on the error rate of ServiceWorker.StartWorker.StatusByPurpose_FETCH_MAIN_FRAME has increased from 60.0.3110 to 61.0.3115. 

There are two related patches in  issue 705318 .
- "PlzNavigate: implement process reuse for ServiceWorkers" (5d947f5a0c60d33fa16b6609978bd712a497cb00) landed in 60.0.3110.0.
- "Ensure the NTP ServiceWorker has the proper site URL" (a4262ab9315b24197a00cc6c942474144c1e4a7f) landed in 61.0.3116.0.


Win Canary UMAs
PlzNavigate |  OFF   |   ON   |
60.0.3100.* | 99.92% | 99.74% | https://uma.googleplex.com/p/chrome/variations/?sid=97652f135c106ad99214478081694d53
60.0.3101.* | 99.92% | 99.70% | https://uma.googleplex.com/p/chrome/variations/?sid=2d0e73a9ea0a5c29b648397886f75161
60.0.3102.* | 99.93% | 99.75% | https://uma.googleplex.com/p/chrome/variations/?sid=1bf58fd28b73f2bf67679e57f6a00f79
60.0.3103.* | 99.91% | 99.72% | https://uma.googleplex.com/p/chrome/variations/?sid=11f5bd5cf322226618b9c7487fd5eb57
60.0.3104.* | 99.93% | 99.82% | https://uma.googleplex.com/p/chrome/variations/?sid=214d819b870da4dffb3c9c86a2c9ba89
60.0.3105.* | 99.90% | 99.64% | https://uma.googleplex.com/p/chrome/variations/?sid=9b64d16b42d5e64c5e0488d07e481d18
60.0.3106.* | 99.85% | 99.57% | https://uma.googleplex.com/p/chrome/variations/?sid=4bebeafb7cf4cd36bdc3bb004a8297c4
60.0.3107.* | 99.92% | 99.68% | https://uma.googleplex.com/p/chrome/variations/?sid=5b36959eec939590d8c16f7499d1c1e2
60.0.3108.* | 99.92% | 99.74% | https://uma.googleplex.com/p/chrome/variations/?sid=bc9b30f3e69a34f77a2d613ed0dc57e8
60.0.3109.* | 99.93% | 99.70% | https://uma.googleplex.com/p/chrome/variations/?sid=de9b3a64fd2607c48e6fcec931c1c51d
60.0.3110.* | 99.92% | 98.55% | https://uma.googleplex.com/p/chrome/variations/?sid=ad13f71e3226d5041334eb9cedb6e84e
60.0.3111.* | 99.93% | 98.78% | https://uma.googleplex.com/p/chrome/variations/?sid=8328bc35412502c4eb3a679a438abcb4
60.0.3112.* | 99.90% | 99.13% | https://uma.googleplex.com/p/chrome/variations/?sid=8f093400f48a0f54b8801d2fe0f78d00
61.0.3113.* | 99.92% | 99.08% | https://uma.googleplex.com/p/chrome/variations/?sid=8e3d31b1ba3d93419f9560f45df1480c
61.0.3114.* | 99.92% | 99.36% | https://uma.googleplex.com/p/chrome/variations/?sid=febefe2b5dd799a1c3544cba15e9cb72
61.0.3115.* | 99.92% | 98.97% | https://uma.googleplex.com/p/chrome/variations/?sid=ec79eee939a9fb48be1984791af968f9
61.0.3116.* | 99.90% | 99.71% | https://uma.googleplex.com/p/chrome/variations/?sid=a56351feee000942b3bff8f749075ca2

Comment 4 by horo@chromium.org, Jun 2 2017

The percentage in the comment 3 is the ratio of SERVICE_WORKER_OK of ServiceWorker.StartWorker.StatusByPurpose_FETCH_MAIN_FRAME.

Comment 5 by horo@chromium.org, Jun 2 2017

Labels: OS-Android OS-Linux
Summary: ServiceWorker.FetchEvent.MainResource.Status ERROR_START_WORKER_FAILED with PlzNavigate (was: ServiceWorker.FetchEvent.MainResource.Status ERROR_START_WORKER_FAILED is increasing on Win/Mac)
The negative impact of PlzNavigate on Android looks terrible.
We don't have enough data of Android (>=3116).
So I want to keep this issue open.

Android Canary UMAs of SERVICE_WORKER_OK of ServiceWorker.StartWorker.StatusByPurpose_FETCH_MAIN_FRAME

PlzNavigate |  OFF   |   ON   |
60.0.3100.* | 99.84% | 98.41% | https://uma.googleplex.com/p/chrome/variations/?sid=da2adb76c8fa3fb43e551da243242a8a
60.0.3101.* | 99.77% | 98.11% | https://uma.googleplex.com/p/chrome/variations/?sid=c364091900c67b9ce0c91dfff6d2dcfe
60.0.3102.* | 99.76% | 98.10% | https://uma.googleplex.com/p/chrome/variations/?sid=c4c5b2de06a15287387e37cd0de9d0dd
60.0.3103.* | 99.77% | 98.07% | https://uma.googleplex.com/p/chrome/variations/?sid=ffc3a82d9825d7c9ff4f000631314494
60.0.3104.* | 99.63% | 97.67% | https://uma.googleplex.com/p/chrome/variations/?sid=c59b462f88a622e9c56b3d145f44774b
60.0.3105.* | Not enough data | https://uma.googleplex.com/p/chrome/variations/?sid=1470caf1ac0ffc6105f1513a0abf0fa8
60.0.3106.* | Not enough data | https://uma.googleplex.com/p/chrome/variations/?sid=575704a2c311af7742b81d842a4a529f
60.0.3107.* | 99.80% | 98.62% | https://uma.googleplex.com/p/chrome/variations/?sid=529109f97fab0eeecd6989391d61a2a9
60.0.3108.* | 99.63% | 99.01% | https://uma.googleplex.com/p/chrome/variations/?sid=82c959d57450b1d2d8f9f76cf628c937
60.0.3109.* | 99.83% | 96.79% | https://uma.googleplex.com/p/chrome/variations/?sid=c58c1028682356b092c4292a0a1d9b64
60.0.3110.* | Not enough data | https://uma.googleplex.com/p/chrome/variations/?sid=1645229c8c9fa5a690e39c0cae25e21d
60.0.3111.* | 99.71% | 85.91% | https://uma.googleplex.com/p/chrome/variations/?sid=e58a3ada420665fbaa96f5e497a5dffb
60.0.3112.* | 99.79% | 89.86% | https://uma.googleplex.com/p/chrome/variations/?sid=58a1ec8391de4139fbecb02c4f6307e2
61.0.3113.* | Not enough data | https://uma.googleplex.com/p/chrome/variations/?sid=2eef204e49aa2617d97efd42686a347b
61.0.3114.* | 99.69% | 90.41% | https://uma.googleplex.com/p/chrome/variations/?sid=9387b122f9a47a5c36efa3b66a2cd5d8
61.0.3115.* | 99.67% | 90.51% | https://uma.googleplex.com/p/chrome/variations/?sid=6c10ed00d4386f6d7a09ab98ff816059
61.0.3116.* | Not enough data | https://uma.googleplex.com/p/chrome/variations/?sid=4b2e99f62651694de41d5113d241db9d

Comment 6 by clamy@chromium.org, Jun 2 2017

Cc: creis@chromium.org jam@chromium.org nasko@chromium.org
Yes that seems pretty bad. My best guess is that we might be trying to put the SW in a process that's not able to instantiate it (ie exiting, unresponsive?). I did not put additional checks on the process we return, so it might be in a wrong state.

Charlie, Nasko, what are we supposed to check on the RPH to ensure it is in a state where it can start a SW?

Comment 7 by creis@chromium.org, Jun 2 2017

Cc: falken@chromium.org
Other than checking that it's not shutting down, I don't know what else would prevent a RPH from starting a ServiceWorker.  falken@, do you know?
I don't think service worker really does anything fancy with the RPH.

Only thing I can think of is, once you decide to use the RPH for the service worker, you must call IncrementServiceWorkerRefCount to ensure the RPH stays alive. Otherwise the browser process may thinks it's fine to kill the renderer process.




Comment 9 by horo@chromium.org, Jun 6 2017

I think we should not set REUSE_PENDING_OR_COMMITTED_SITE if |can_use_existing_process| is false in ServiceWorkerProcessManager::AllocateWorkerProcess().

|can_use_existing_process| is false if the service worker version has failed to start more than once.
https://chromium.googlesource.com/chromium/src/+blame/d99a9664c873fc38e226e33bc92da974fb852da3/content/browser/service_worker/embedded_worker_instance.cc#300
This logic prevent attempting to start the service worker in a process that's not able to instantiate.
Project Member

Comment 10 by bugdroid1@chromium.org, Jun 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/47dc328bb08ac9c66848eeaf68fc8e96a39975cb

commit 47dc328bb08ac9c66848eeaf68fc8e96a39975cb
Author: clamy <clamy@chromium.org>
Date: Wed Jun 07 00:53:39 2017

Only use SiteInstance process reuse for ServiceWorker with no failures

This CL ensures we will only attempt to use the SiteInstance process
reuse mechanism when creating a SiteInstance for a ServiceWorker when
the ServiceWorker has not failed to start. Otherwise, we will try to
start it in a brand new process.

BUG= 728030 

Review-Url: https://codereview.chromium.org/2925623004
Cr-Commit-Position: refs/heads/master@{#477493}

[modify] https://crrev.com/47dc328bb08ac9c66848eeaf68fc8e96a39975cb/content/browser/service_worker/service_worker_process_manager.cc
[modify] https://crrev.com/47dc328bb08ac9c66848eeaf68fc8e96a39975cb/content/browser/service_worker/service_worker_process_manager.h
[modify] https://crrev.com/47dc328bb08ac9c66848eeaf68fc8e96a39975cb/content/browser/service_worker/service_worker_process_manager_unittest.cc

Cc: altimin@chromium.org clamy@chromium.org shimazu@chromium.org
 Issue 725337  has been merged into this issue.
Project Member

Comment 12 by bugdroid1@chromium.org, Jun 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ada6aaf7955bf98ca25247619eeebad0e1a06466

commit ada6aaf7955bf98ca25247619eeebad0e1a06466
Author: Matt Falkenhagen <falken@chromium.org>
Date: Wed Jun 07 07:25:47 2017

service worker: Address comment/DCHECK nits from previous code review.

Follow-up to https://codereview.chromium.org/2925623004/

Bug:  728030 
Change-Id: I09f8b0f53e5ba79a3074e74ba80035c63848b4db
Reviewed-on: https://chromium-review.googlesource.com/526893
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Commit-Queue: Matt Falkenhagen <falken@chromium.org>
Cr-Commit-Position: refs/heads/master@{#477574}
[modify] https://crrev.com/ada6aaf7955bf98ca25247619eeebad0e1a06466/content/browser/service_worker/service_worker_process_manager.cc
[modify] https://crrev.com/ada6aaf7955bf98ca25247619eeebad0e1a06466/content/browser/service_worker/service_worker_process_manager_unittest.cc

Comment 13 by jam@chromium.org, Jun 8 2017

@horo or anyone else: looking at the UMA links posted in comments 1 - 5, these graphs all recovered after May 30 (i.e. after https://codereview.chromium.org/2898313003 landed).

Some links:
https://uma.googleplex.com/timeline_v2?sid=e874c2f437609e6b81a45de65c230f70
https://uma.googleplex.com/p/chrome/variations/?sid=80ab9098b4619ec1794ae62874a471bb

Is this still actually a bug?

Comment 14 by horo@chromium.org, Jun 8 2017

Status: Fixed (was: Assigned)
UMA of Windows recovered at 60.0.3116.
UMA of Andoird recovered at 60.0.3117.
I think the https://codereview.chromium.org/2898313003 which landed in 60.0.3116 is not related to Android where NTP doesn't use service worker.

I don't understand why its recovered on Android at 60.0.3117.
But anyway I can mark this bug as FIXED.

Windows Canary UMAs of SERVICE_WORKER_OK of ServiceWorker.StartWorker.StatusByPurpose_FETCH_MAIN_FRAME
PlzNavigate |  OFF   |   ON   |
60.0.3116.* | 99.91% | 99.62% | https://uma.googleplex.com/p/chrome/variations/?sid=8b6e811353db9443bebe993bd5997af9
60.0.3117.* | 99.92% | 99.62% | https://uma.googleplex.com/p/chrome/variations/?sid=dbb509218de0c6cce111318a4e63a2b5
60.0.3118.* | 99.90% | 99.63% | https://uma.googleplex.com/p/chrome/variations/?sid=1a1884ec413b04a59eeca35526b64243
60.0.3119.* | 99.93% | 99.72% | https://uma.googleplex.com/p/chrome/variations/?sid=eb1b4761bdd6cb1aac25e765298ad6c1


Android Canary UMAs of SERVICE_WORKER_OK of ServiceWorker.StartWorker.StatusByPurpose_FETCH_MAIN_FRAME
PlzNavigate |  OFF   |   ON   |
60.0.3116.* | 99.87% | 90.94% | https://uma.googleplex.com/p/chrome/variations/?sid=dcd92393aa18902bd46294a36d1c05c6
60.0.3117.* | 99.85% | 99.85% | https://uma.googleplex.com/p/chrome/variations/?sid=d838447555e3fbb65bd83d7708916890
60.0.3118.* | 99.76% | 99.43% | https://uma.googleplex.com/p/chrome/variations/?sid=18e31a1964c07ae7ea67c28b2fa973d3
60.0.3119.* | 99.42% | 99.37% | https://uma.googleplex.com/p/chrome/variations/?sid=71bc1ecc2a30ac6f4880a0a00cb4a36a

Comment 15 by jam@chromium.org, Jun 8 2017

Labels: Merge-Request-60
Project Member

Comment 16 by sheriffbot@chromium.org, Jun 8 2017

Labels: -Merge-Request-60 Hotlist-Merge-Approved Merge-Approved-60
Your change meets the bar and is auto-approved for M60. Please go ahead and merge the CL to branch 3112 manually. Please contact milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), josafat@(ChromeOS), bustamante@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 17 by bugdroid1@chromium.org, Jun 8 2017

Labels: -merge-approved-60 merge-merged-3112
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/acaa6e0bf4e2df67012aa1e055d6233e6920d958

commit acaa6e0bf4e2df67012aa1e055d6233e6920d958
Author: John Abd-El-Malek <jam@chromium.org>
Date: Thu Jun 08 17:14:06 2017

Only use SiteInstance process reuse for ServiceWorker with no failures

This CL ensures we will only attempt to use the SiteInstance process
reuse mechanism when creating a SiteInstance for a ServiceWorker when
the ServiceWorker has not failed to start. Otherwise, we will try to
start it in a brand new process.

BUG= 728030 

Review-Url: https://codereview.chromium.org/2925623004
Cr-Original-Commit-Position: refs/heads/master@{#477493}
Review-Url: https://codereview.chromium.org/2927973002 .
Cr-Commit-Position: refs/branch-heads/3112@{#255}
Cr-Branched-From: b6460e24cf59f429d69de255538d0fc7a425ccf9-refs/heads/master@{#474897}

[modify] https://crrev.com/acaa6e0bf4e2df67012aa1e055d6233e6920d958/content/browser/service_worker/service_worker_process_manager.cc
[modify] https://crrev.com/acaa6e0bf4e2df67012aa1e055d6233e6920d958/content/browser/service_worker/service_worker_process_manager.h
[modify] https://crrev.com/acaa6e0bf4e2df67012aa1e055d6233e6920d958/content/browser/service_worker/service_worker_process_manager_unittest.cc

 Issue 733933  has been merged into this issue.

Sign in to add a comment