New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 608464 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: May 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows , Mac
Pri: 1
Type: Bug



Sign in to add a comment

UMA stats show a number of renderer kills with reason bad_message::SWDH_POST_MESSAGE

Project Member Reported by nick@chromium.org, May 2 2016

Issue description

Starting sometime around April or March 2016, we are seeing UMA report that renderers are being killed with the reason bad_message::SWDH_POST_MESSAGE.

The rate of this kill seems to be relatively low, but it is happening on all channels currently. The trend can be seen by looking at the following timeline (Googlers only) for the UMA histogram Stability.BadMessageTerminated.Content:
  Stable channel history: https://goto.google.com/njtdp
  Canary channel history: https://goto.google.com/oemar

https://chromium.googlesource.com/chromium/src/+/0412b9e628ac0d7d5853b8b8dc41adfe6aae68f5, which added a new instance of bad_message::SWDH_POST_MESSAGE, is a possible culprit.
 

Comment 1 by nick@chromium.org, May 4 2016

Owner: nhiroki@chromium.org
Status: Assigned (was: Available)

Comment 2 by nick@chromium.org, May 4 2016

Labels: -Pri-3 Pri-1

Comment 3 by nick@chromium.org, May 5 2016

Labels: Stability-Sheriff-Desktop
+Stability-Sheriff-Desktop

This BadMessage kill accounts for 11K renderer terminations per day across all channels (such kills don't appear on crash/ but they do show up in the overall stability numbers). It looks like nhiroki and the patch reviewers might currently be out of the office. Can we find an owner to make progress on this?
Owner: mkwst@chromium.org
mkwst: Can you take a look?
Cc: -nhiroki@chromium.org mkwst@chromium.org
Owner: nhiroki@chromium.org
Status: Started (was: Assigned)
Thank you for reporting this! I'll take a look.
WebServiceWorkerImpl::postMessage() post a task to the main thread from a worker thread before sending an IPC message. I suspect that another IPC message to destroy ServiceWorkerHandle of the target worker or ServiceWorkerProviderHost of the sender can be sent during the thread hop.
Just to verify - nhiroki@ - you are still looking at this, correct?
Yes, I'm still looking at this.

Comment 9 by nick@chromium.org, May 12 2016

Now that bad_message always triggers a DumpWithoutCrashing, there are crash reports that might be helpful in understanding or reproing.

Here's a Mac instance of this: go/crash/7796340a00000000
Issue 611561 has been merged into this issue.
Project Member

Comment 11 by sheriffbot@chromium.org, May 13 2016

Labels: OS-Windows Fracas OS-Mac M-52
Users experienced this crash on the following builds:

Win Canary 52.0.2734.0 -  0.92 CPM, 18 reports, 16 clients (signature [Dump without crash] content::ServiceWorkerDispatcherHost::OnPostMessageToWorker)
Mac Canary 52.0.2734.0 -  1.69 CPM, 5 reports, 5 clients (signature [Dump without crash] content::ServiceWorkerDispatcherHost::OnPostMessageToWorker)

If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates.

- Go/Fracas
Project Member

Comment 13 by bugdroid1@chromium.org, May 13 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/66525f41fc6509604c61a8c0304141662cbdbe97

commit 66525f41fc6509604c61a8c0304141662cbdbe97
Author: nhiroki <nhiroki@chromium.org>
Date: Fri May 13 10:28:16 2016

ServiceWorker: Remove a bad IPC message check on postMessage()

This CL removes a bad IPC message check to confirm existence of the sender
provder host on postMessage() because the provider host may be destroyed before
this point when destruction of the provider overtakes postMessage() during
thread hopping on WebServiceWorkerImpl::postMessage().

BUG= 608464 

Review-Url: https://codereview.chromium.org/1977473003
Cr-Commit-Position: refs/heads/master@{#393495}

[modify] https://crrev.com/66525f41fc6509604c61a8c0304141662cbdbe97/content/browser/service_worker/service_worker_dispatcher_host.cc

Labels: -Stability-Sheriff-Desktop
Status: Fixed (was: Started)
I don't see any crashes after 52.0.2735.1, so I believe that this specific crash is fixed by the patch in c#13.  The UMA stats in c#1 are trending downwards but it looks like we won't have firm numbers for a few more days.

Tentatively marking fixed.


Labels: Merge-Approved-51
Thank you for confirming the latest crash numbers.
Could I have a chance to merge the patch on c#13 into M51?
Labels: -Merge-Approved-51 Merge-Request-51
(Fixed the label, sorry!)

Comment 18 by tin...@google.com, May 16 2016

Labels: -Merge-Request-51 Merge-Approved-51 Hotlist-Merge-Approved
Your change meets the bar and is auto-approved for M51 (branch: 2704)
Project Member

Comment 19 by bugdroid1@chromium.org, May 17 2016

Labels: -merge-approved-51 merge-merged-2704
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/78c0da6b53aaf59d968f8eea8df5fdbff1cd0645

commit 78c0da6b53aaf59d968f8eea8df5fdbff1cd0645
Author: Hiroki Nakagawa <nhiroki@chromium.org>
Date: Tue May 17 01:32:09 2016

[Merge to M51] ServiceWorker: Remove a bad IPC message check on postMessage()

This CL removes a bad IPC message check to confirm existence of the sender
provder host on postMessage() because the provider host may be destroyed before
this point when destruction of the provider overtakes postMessage() during
thread hopping on WebServiceWorkerImpl::postMessage().

BUG= 608464 

Review-Url: https://codereview.chromium.org/1977473003
Cr-Commit-Position: refs/heads/master@{#393495}
(cherry picked from commit 66525f41fc6509604c61a8c0304141662cbdbe97)

Review URL: https://codereview.chromium.org/1980273002 .

Cr-Commit-Position: refs/branch-heads/2704@{#573}
Cr-Branched-From: 6e53600def8f60d8c632fadc70d7c1939ccea347-refs/heads/master@{#386251}

[modify] https://crrev.com/78c0da6b53aaf59d968f8eea8df5fdbff1cd0645/content/browser/service_worker/service_worker_dispatcher_host.cc

Sign in to add a comment