New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 661843 link

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug-Regression



Sign in to add a comment

Possible comparison to stale pointer in BrowserContext::NotifyWillBeDestroyed

Project Member Reported by marshall@chromium.org, Nov 3 2016

Issue description

Version: M55 and newer
OS: Windows 10 64-bit

What steps will reproduce the problem?
(1) Create and destroy multiple BrowserContext instances at the same time.

What is the expected output?
No assertions should be hit.

What do you see instead?
Hit DCHECK(!is_worker_ref_count_disabled_) in RenderProcessHostImpl::ForceReleaseWorkerRefCounts()

Please use labels and text to provide additional information.
The problem is that:

1. RenderProcessHostImpl keeps a raw BrowserContext pointer (browser_context_) and does not clear this pointer when the BrowserContext is deleted.

2. There is an asynchronous delay between deletion of the BrowserContext and deletion of the associated RenderProcessHostImpl. This means that RenderProcessHostImpl::GetBrowserContext() may return a stale (already deleted) pointer.

3. BrowserContext::NotifyWillBeDestroyed() may call RenderProcessHostImpl::ForceReleaseWorkerRefCounts() multiple times for the same RenderProcessHostImpl instance in cases where:

A. RenderProcessHostImpl::GetBrowserContext() returns a stale (already deleted) pointer, and;

B. The pointer address has already been reused for a new BrowserContext instance.


Detailed explanation:

ProfileImpl::~ProfileImpl() calls Profile::MaybeSendDestroyedNotification() which calls BrowserContext::NotifyWillBeDestroyed(). NotifyWillBeDestroyed() uses RenderProcessHost::AllHostsIterator() to iterate existing RenderProcessHostImpl instances and compares RenderProcessHostImpl::GetBrowserContext() to the BrowserContext instance being deleted (passed to NotifyWillBeDestroyed).

This bug can be triggered by creating and destroying multiple BrowserContext instances at the same time. When an old BrowserContext instance is deleted and a new BrowserContext instance is created the old instance's pointer address might be reused for the new BrowserContext instance. When the new BrowserContext is then deleted the RenderProcessHost::AllHostsIterator() list might still contain both the old RenderProcessHostImpl instance and the new RenderProcessHostImpl instance. This results in the following situation:

1. NotifyWillBeDestroyed() will call ForceReleaseWorkerRefCounts() for the old RenderProcessHostImpl when the old BrowserContext is deleted, and;

2. NotifyWillBeDestroyed() will call ForceReleaseWorkerRefCounts() again for the old RenderProcessHostImpl when the new BrowserContext is deleted.

#1 is correct and #2 is incorrect.

This problem can be avoided by clearing the |RenderProcessHostImpl::browser_context_| pointer when the BrowserContext calls ForceReleaseWorkerRefCounts() to signal its deletion. In that case when NotifyWillBeDestroyed() is called in #2 the old RenderProcessHostImpl instance will now return nullptr instead of returning the old (already deleted) BrowserContext pointer.

Here's the fix that works for me:

diff --git a/content/browser/renderer_host/render_process_host_impl.cc b/content/browser/renderer_host/render_process_host_impl.cc
index f952adc..e9998f3 100644
--- a/content/browser/renderer_host/render_process_host_impl.cc
+++ b/content/browser/renderer_host/render_process_host_impl.cc
@@ -1395,6 +1409,7 @@ void RenderProcessHostImpl::ForceReleaseWorkerRefCounts() {
   DCHECK_CURRENTLY_ON(BrowserThread::UI);
   DCHECK(!is_worker_ref_count_disabled_);
   is_worker_ref_count_disabled_ = true;
+  browser_context_ = nullptr;
   if (!worker_ref_count())
     return;
   service_worker_ref_count_ = 0;

 

Comment 1 by falken@chromium.org, Nov 28 2016

Cc: -falken@chromium.org
Components: -Content>Core Blink>Workers
Labels: -Pri-3 M-55 Pri-1
Owner: falken@chromium.org
Status: Assigned (was: Untriaged)
Sorry Marshall your email slipped past me.

Comment 2 by falken@chromium.org, Nov 29 2016

Labels: -Pri-1 Pri-2
I started https://codereview.chromium.org/2528233003/ but I'm no longer sure this is a problem with ForceReleaseWorkerRefCounts.

marshall@ can you expand on:
2. There is an asynchronous delay between deletion of the BrowserContext and deletion of the associated RenderProcessHostImpl.

The content layer assumes that BrowserContext outlives its RPHIs. Indeed the chrome embedder does some tricks to mostly ensure this assumption. As pointed out by horo@ on the codereview above, chrome's ProfileDestroyer does:

  // Generally, !hosts.empty() means that there is a leak in a render process
  // host that MUST BE FIXED!!!
  //
  // However, off-the-record profiles are destroyed before their
  // RenderProcessHosts in order to erase private data quickly, and
  // RenderProcessHostImpl::Release() avoids destroying RenderProcessHosts in
  // --single-process mode to avoid race conditions.
  DCHECK(hosts.empty() || profile->IsOffTheRecord() ||
      content::RenderProcessHost::run_renderer_in_process()) << \
      "Profile still has " << hosts.size() << " hosts";

And proceeds to delete the browser context if hosts is empty or else wait until hosts becomes empty.

So I think RPHI's browser_context_ should never be a stale pointer and setting it to null to avoid that case seems to just be working around a real underlying bug.
@#2: Thanks for pointing out the logic in ProfileDestroyer. Unfortunately that logic lives in the chrome/ layer, not the content/ layer. If indeed content is making the assumption that "BrowserContext outlives its RPHIs" then I think that should be guaranteed and/or enforced by the content layer. As content is currently designed there's a race condition between BrowserContext and RenderProcessHostImpl deletion. I think it's perfectly valid given the current design of content to fix the bug as described above, even if the chrome/ layer performs additional logic in an attempt to work around the existing race condition.

Sign in to add a comment