New issue
Advanced search Search tips

Issue 807013 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 25
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , iOS , Chrome , Mac , Fuchsia
Pri: 1
Type: Bug

Blocked on:
issue 829122



Sign in to add a comment

Memory leak when combining PostTaskAndReply with CancellableTaskTracker

Project Member Reported by erikc...@chromium.org, Jan 29 2018

Issue description

Out of Process Heap Profiling is a project that, for a very small set users, records heap allocations and sends back anonymized heap dumps. We've obtained a heap dump that suggests that there's a potential memory leak in favicon::ContentFaviconDriver::DidFinishNavigation. We do not have repro steps. 

I've attached the symbolized trace. For instructions on how to use it, see https://chromium.googlesource.com/chromium/src/+/lkcr/docs/memory-infra/heap_profiler.md.

In this heap dump, there are ~8.7 million calls to malloc/new from favicon::ContentFaviconDriver::DidFinishNavigation without corresponding calls to free/delete. These calls consume ~605MB of memory. 

I've also attached a screenshot of the callstack that results in these allocations. 

A brief glance at the code suggests that we perhaps store some data for every navigation which is never cleared. If a user encounters a site that pathologically performs navigations/or mucks with the history stack, perhaps this will cause a large number of never-destroyed objects?

Over to sky@, an OWNER of components/favicon.
 
Screen Shot 2018-01-29 at 4.19.21 PM.png
275 KB View Download
trace-b6bfbc97163289c3.gz
39.8 KB Download
Cc: -pkotw...@chromium.org sky@chromium.org
Owner: pkotw...@chromium.org
Assigning to myself. I am assigned to work on non-favicon stuff so this is 120% time work. I am however likely less swamped than sky@

Comment 2 by mastiz@chromium.org, Jan 30 2018

There's no dynamic memory allocation happening in ContentFaviconDriver::DidFinishNavigation() so I suspect the problem, if at all, exists somewhere else.

Is it possible that the heap dump is not detailed enough to break down the call to FetchFavicon? 
If you open the hump dump, you can dive in and see every call to new. In this case, I cut off the screenshot at the top-level function that was responsible for all the dynamic allocations. If you keep diving in [screenshot attached], you'll see that history::HistoryService::GetFaviconsForURL has 1.8 million calls to new with no corresponding delete. That function calls base::CancelableTaskTracker::PostTaskAndReply
, which has 2.4 million calls to new with no corresponding delete, etc.
  
Screen Shot 2018-01-30 at 9.20.31 AM.png
76.3 KB View Download
Screen Shot 2018-01-30 at 9.22.39 AM.png
120 KB View Download

Comment 4 by mastiz@chromium.org, Jan 30 2018

All allocated memory seems to be quickly bound (with ownership) to callbacks. Can this be the result of PostTaskAndReply() never actually producing a reply?

I came across this related comment: https://cs.chromium.org/chromium/src/base/threading/post_task_and_reply_impl.cc?l=85&rcl=b2308cc557305c41f78d76f8f182f2a4ba50cade
I think you're right. Here's what I think is happening:

The heap dump shows that a call to "new" from history::HistoryService::GetFaviconsForURL is being leaked. This is   

"""
std::vector<favicon_base::FaviconRawBitmapResult>* results =
      new std::vector<favicon_base::FaviconRawBitmapResult>();
"""

Ownership is passed to the callback, which makes a copy of the vector and passes it on (1): 

https://cs.chromium.org/chromium/src/components/history/core/browser/history_service.cc?type=cs&sq=package:chromium&l=69

This is posted onto a CancelableTaskTracker (2): https://cs.chromium.org/chromium/src/components/favicon/core/favicon_handler.cc?type=cs&q=FaviconHandler::FetchFavicon&sq=package:chromium&l=218

The lifetime of this queue is tied to FaviconHandler, https://cs.chromium.org/chromium/src/components/favicon/core/favicon_handler.h?type=cs&sq=package:chromium&l=293, whose lifetime in turn is scoped to a UserData key on the WebContents: https://cs.chromium.org/chromium/src/components/favicon/content/content_favicon_driver.cc?l=50

So if the FaviconHandler is destroyed before the callback is called...I believe there will be a lot of memory leakage. 

This theory is further backed up by the fact that the heap dump also shows 1.2 million leaked calls to new from base::internal::PostTaskAndReplyImpl::PostTaskAndReply. This corresponds pretty clearly to:
"""
  PostTaskAndReplyRelay* relay =
      new PostTaskAndReplyRelay(from_here, std::move(task), std::move(reply));
"""

Asides:
1) By changing to non-const references and using std::move, we can save a copy here.
2) FaviconHandler::FetchFavicon passes base::Unretained(this) as a parameter to a callback. This is unsafe and has the potential to cause UAF.
Cc: robliao@chromium.org gab@chromium.org fdoray@chromium.org
+ task scheduler folks. This looks like a really bad leak in the wild. Is there really no way to delete the reply and ensure that any objects owned by the callback are destroyed?
Owner: fdoray@chromium.org
fdoray says he'll take a look.

Comment 8 by sky@chromium.org, Jan 30 2018

mastiz, excellent find!
One easy solution is to change the code to use move semantics (as Erik mentions above), but other places may be impacted by CancelableTaskTracker, so that should be fixed too.

Comment 9 by gab@chromium.org, Feb 1 2018

Oh wow, excellent find indeed!

I remember vaguely trying to fix that a while ago, but since it only mattered for shutdown it was decided the complexity was not worth it (I forget if perhaps we didn't have cancellation back then or it wasn't widely used, but runtime leaks caused by it were not something that was considered/mentioned at the time). Attempts documented in  issue 371974 .

Happy to help brainstorm ideas to fix this and review.

Cheers,
Gab
Components: Internals>TaskScheduler

Comment 11 by ssid@chromium.org, Feb 15 2018

Cc: ssid@chromium.org
I see a very similar leak from CancelableTaskTracker::NewTrackedTaskId(). I have attached stack trace. Can someone tell me if this is the same bug?
I guess the tasks are never cleared as well. I also see the allocations from URLIndexPrivateData::ScheduleUpdateRecentVisits() of tasks that are not cleared.
urlIndex_cancellable_tasks.png
73.7 KB View Download
Labels: -Pri-3 M-66 Pri-1
Raising priority, given magnitude of frequency of issues.

@ssid - having trouble following the screenshot - is this a bottom-up view? If so, what is actually being leaked in NewTraskedTaskId? There are several calls to new, with different ownership patterns. ownership of "untrack_and_delete_flag_runner" is being passed to "is_canceled_cb", whereas ownership of "flag" is implicitly being passed to "untrack_and_delete_flag". 

If both are leaking, that suggests that "queued_history_db_tasks_" [or some other queue of tasks] is never being cleared.

If only the latter is leaking, that suggests an issue with how the deletion/cleanup tasks are being run. [or not run].

Regardless, it seems like a different leak, so we should probably file a different bug.

Comment 13 by gab@chromium.org, Feb 16 2018

Status: Started (was: Assigned)
There's a CL in progress to fix this @ https://chromium-review.googlesource.com/c/chromium/src/+/902191

Here's what's happening today (inspired from erikchen's post in #5):

 1) PostTaskAndReplyImpl creates a self-managed PostTaskAndReplyRelay.
 2) PostTaskAndReplyRelay owns the |task_| and |reply_| closures until they're run.
 3) PostTaskAndReplyRelay destroys itself when the reply is run.
 4) CancellableTaskTracker's handlers will always delete (per move semantics) but not always run assigned tasks/replies depending on the state of the CancellationFlag (flag set automatically on ~CancellableTaskTracker() which is tied to FaviconHandler's lifetime in this case).
 5) Not running the reply means RunReplyAndSelfDestruct() never runs which never explicitly invokes |delete this| and hence also doesn't free |PostTaskAndReplyRelay::reply_| which in this case owns the std::vector<favicon_base::FaviconRawBitmapResult> (deleting the |reply| bound in the CancellableTaskTracker doesn't help because it doesn't explicitly bind |this|).

In other words the problem is BindOnce(&PostTaskAndReplyRelay::RunReplyAndSelfDestruct, base::Unretained(this)) and requiring that RunReplyAndSelfDestruct() run in order to |delete this| (and free associated Closures' state).

A naive solution would have been to change

  void RunTaskAndPostReply() {
    std::move(task_).Run();
    origin_task_runner_->PostTask(
        from_here_, BindOnce(&PostTaskAndReplyRelay::RunReplyAndSelfDestruct,
                             base::Unretained(this)));
  }

to use base::Owned(this) instead so that |this| is always deleted, either after running the reply or if the reply is dropped.

The problem with that is that |reply_| can rightfully be thread-affine (it is constructed on the original sequence and expects to be executed/deleted there as well). It can hence rightfully assume thread-affinity (e.g. use TLS). Binding it into RunReplyAndSelfDestruct() when posting it means that if that origin_task_runner_->PostTask() fails (e.g. during shutdown), it will incorrectly be destroyed on the sequence that ran |task_| (per move semantics).

Francois' latest patch set @ https://chromium-review.googlesource.com/c/chromium/src/+/902191 solves this in a clever and clean way :), woohoo!
Very cool. Thank you Francois for tackling this hairy problem!

Comment 15 by gab@chromium.org, Feb 16 2018

Labels: ReleaseBlock-Stable
This is a big deal, it affects all combinations of PostTaskAndReply + CancellableTaskTracker (and has been live for years!). The FaviconHandler just happened to trip it the most.

There appears to be >20 callers of CancellableTaskTracker::PostTaskAndReply(WithResult) :

src/chrome/browser/themes/theme_service.cc   (1 occurrence)
src/chrome/browser/ui/webui/certificates_handler.cc  (2 occurrences)
src/components/favicon/core/large_icon_service.cc  (1 occurrence)
src/components/history/core/browser/history_service.cc   (13 occurrences)
src/components/history/core/browser/top_sites_backend.cc   (2 occurrences)
src/chrome/browser/media/router/discovery/dial/dial_service.cc   (1 occurrence)
src/chrome/browser/ui/webui/media_router/media_router_file_dialog.cc   (1 occurrence)

These are all affected by this (more or less depending on how often they trigger cancellation in practice).

Marking ReleaseBlock since we cannot go another release without fixing this now that we have a clear diagnosis.

Comment 16 by gab@chromium.org, Feb 16 2018

Summary: Memory leak when combining PostTaskAndReply with CancellableTaskTracker (was: Potential memory leak in favicon::ContentFaviconDriver::DidFinishNavigation)
Please add affected OSs.
Project Member

Comment 18 by sheriffbot@chromium.org, Feb 16 2018

This issue is marked as a release blocker with no OS labels associated. Please add an appropriate OS label.

All release blocking issues should have OS labels associated to it, so that the issue can tracked and promptly verified, once it gets fixed.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 19 by gab@chromium.org, Feb 20 2018

Labels: -ReleaseBlock-Stable OS-Android OS-Chrome OS-Fuchsia OS-iOS OS-Linux OS-Mac OS-Windows
Sigh.... it's so annoying to get these messages all the time when everything I do (in //base) affects ALL platforms and there isn't an ALL option anymore...

Dropping RBS for now as Francois made me realize I had my analysis backwards...

It's PostTaskAndReplyImpl that owns the CancellableTaskTracker's tasks, not the other way around. So even if the tasks are cancelled PostTaskAndReplyRelay::RunReplyAndSelfDestruct() should still run (running |reply_| will no-op in CancellableTaskTracker and |delete this| should occur right after and bring down the state with it)

In either case, Francois' cleanup will help think about this but we now don't expect it to resolve this issue...
Project Member

Comment 20 by bugdroid1@chromium.org, Feb 23 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5a6eb2075cbddbf0ce1467403020c345cb7a8d57

commit 5a6eb2075cbddbf0ce1467403020c345cb7a8d57
Author: Francois Doray <fdoray@chromium.org>
Date: Fri Feb 23 19:19:35 2018

Allow PaymentAppContextImpl destruction after ShutdownOnIO.

Today, PaymentAppContext::Shutdown() is called from
~StoragePartitionImpl() after the UI thread is done running tasks
and before the IO thread is done running tasks. ShutdownOnIO() is
successfully posted to the IO thread. After it runs, it tries to
send the DidShutdown() reply to the UI thread, which fails. Because
of how PostTaskAndReply is implemented, the PaymentAppContext is
leaked (cf.
https://cs.chromium.org/chromium/src/base/threading/post_task_and_reply_impl.cc?l=89&rcl=0fe5265959e1da13c1521fd7f5f6275a3b83c614).

We now want to change PostTaskAndReply() to avoid leaking the
reply callback and its arguments when the reply thread can't run
tasks. To make that possible, we need to change the DCHECK in
~PaymentAppContextImpl() so that it only fails if ShutdownOnIO()
didn't run, not if DidShutdown() didn't run.

Bug:  807013 
Change-Id: Ie893c70d423fed7c617cec31f01dfe1763501913
Reviewed-on: https://chromium-review.googlesource.com/929592
Reviewed-by: Rouslan Solomakhin <rouslan@chromium.org>
Commit-Queue: François Doray <fdoray@chromium.org>
Cr-Commit-Position: refs/heads/master@{#538853}
[modify] https://crrev.com/5a6eb2075cbddbf0ce1467403020c345cb7a8d57/content/browser/payments/payment_app_context_impl.cc
[modify] https://crrev.com/5a6eb2075cbddbf0ce1467403020c345cb7a8d57/content/browser/payments/payment_app_context_impl.h

Labels: Hotlist-HeapProfilingInField
What's the status of this? I see a CL landed Feb 23. Is it fixed or is there more work to do?

Comment 23 by gab@chromium.org, Mar 16 2018

fdoray has a pending CL to clean up the ownership model of PostTaskAndReply @ https://chromium-review.googlesource.com/c/chromium/src/+/902191

I think it's blocked on bad code being caught by this change, the above CL fixes one, I'm not sure if there are others. I just relaunched Dry Run on CL to see (looks like maybe HistoryModelWorkerTest.DoWorkAndWaitUntilDoneRequestStopBeforeUITaskRun is still a problem).

Overall though we don't think this is the culprit anymore (see #19), more digging is required.
Looks like this is still moving forwards, it's just a hard problem.
I see activity from yesterday on the CL from #23, so looks like it's still moving forwards.
Project Member

Comment 26 by bugdroid1@chromium.org, Apr 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/970f8498bd5860c7bb7e15aba3edf2a821b5ba23

commit 970f8498bd5860c7bb7e15aba3edf2a821b5ba23
Author: Francois Doray <fdoray@chromium.org>
Date: Wed Apr 04 13:25:54 2018

Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply.

Before, there was always a leak when the RunTaskAndPostReply callback
posted by PostTaskAndReplyImpl::PostTaskAndReply didn't run.

With this CL, the "task" is never leaked and the "reply" is only
leaked if the execution environment is shutdown before the deletion
happens (e.g. MessageLoop deleted, TaskScheduler shutdown).

Bug:  807013 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I05205d1b0250811abe61e2204ba32919d16c16c0
Reviewed-on: https://chromium-review.googlesource.com/902191
Reviewed-by: François Doray <fdoray@chromium.org>
Reviewed-by: Scott Violet <sky@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Commit-Queue: François Doray <fdoray@chromium.org>
Cr-Commit-Position: refs/heads/master@{#548034}
[modify] https://crrev.com/970f8498bd5860c7bb7e15aba3edf2a821b5ba23/base/threading/post_task_and_reply_impl.cc
[modify] https://crrev.com/970f8498bd5860c7bb7e15aba3edf2a821b5ba23/base/threading/post_task_and_reply_impl.h
[modify] https://crrev.com/970f8498bd5860c7bb7e15aba3edf2a821b5ba23/base/threading/post_task_and_reply_impl_unittest.cc
[modify] https://crrev.com/970f8498bd5860c7bb7e15aba3edf2a821b5ba23/chrome/browser/resource_coordinator/tab_manager_unittest.cc
[modify] https://crrev.com/970f8498bd5860c7bb7e15aba3edf2a821b5ba23/components/history/core/browser/history_model_worker_unittest.cc
[modify] https://crrev.com/970f8498bd5860c7bb7e15aba3edf2a821b5ba23/gpu/ipc/host/shader_disk_cache_unittest.cc

Project Member

Comment 27 by bugdroid1@chromium.org, Apr 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5fb329ce74b981b178ec111534cf45fbe53ce15c

commit 5fb329ce74b981b178ec111534cf45fbe53ce15c
Author: Darren Shen <shend@chromium.org>
Date: Wed Apr 04 23:25:33 2018

Revert "Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply."

This reverts commit 970f8498bd5860c7bb7e15aba3edf2a821b5ba23.

Reason for revert: Appears to be causing leaks in chromeos_unittests (PipeReaderTest.Cancel).

Bot: https://ci.chromium.org/buildbot/chromium.memory/Linux%20Chromium%20OS%20ASan%20LSan%20Tests%20%281%29/

Build failure: 
https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.memory%2FLinux_Chromium_OS_ASan_LSan_Tests__1_%2F26886%2F%2B%2Frecipes%2Fsteps%2Fchromeos_unittests%2F0%2Fstdout

Original change's description:
> Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply.
> 
> Before, there was always a leak when the RunTaskAndPostReply callback
> posted by PostTaskAndReplyImpl::PostTaskAndReply didn't run.
> 
> With this CL, the "task" is never leaked and the "reply" is only
> leaked if the execution environment is shutdown before the deletion
> happens (e.g. MessageLoop deleted, TaskScheduler shutdown).
> 
> Bug:  807013 
> Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
> Change-Id: I05205d1b0250811abe61e2204ba32919d16c16c0
> Reviewed-on: https://chromium-review.googlesource.com/902191
> Reviewed-by: François Doray <fdoray@chromium.org>
> Reviewed-by: Scott Violet <sky@chromium.org>
> Reviewed-by: Kenneth Russell <kbr@chromium.org>
> Reviewed-by: Gabriel Charette <gab@chromium.org>
> Commit-Queue: François Doray <fdoray@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#548034}

TBR=sky@chromium.org,gab@chromium.org,fdoray@chromium.org,kbr@chromium.org,tzik@chromium.org

Change-Id: Ib91c72333fabb4e33c1689c5ad39a5ed53ce3beb
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  807013 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Reviewed-on: https://chromium-review.googlesource.com/996732
Reviewed-by: Darren Shen <shend@chromium.org>
Commit-Queue: Darren Shen <shend@chromium.org>
Cr-Commit-Position: refs/heads/master@{#548243}
[modify] https://crrev.com/5fb329ce74b981b178ec111534cf45fbe53ce15c/base/threading/post_task_and_reply_impl.cc
[modify] https://crrev.com/5fb329ce74b981b178ec111534cf45fbe53ce15c/base/threading/post_task_and_reply_impl.h
[modify] https://crrev.com/5fb329ce74b981b178ec111534cf45fbe53ce15c/base/threading/post_task_and_reply_impl_unittest.cc
[modify] https://crrev.com/5fb329ce74b981b178ec111534cf45fbe53ce15c/chrome/browser/resource_coordinator/tab_manager_unittest.cc
[modify] https://crrev.com/5fb329ce74b981b178ec111534cf45fbe53ce15c/components/history/core/browser/history_model_worker_unittest.cc
[modify] https://crrev.com/5fb329ce74b981b178ec111534cf45fbe53ce15c/gpu/ipc/host/shader_disk_cache_unittest.cc

Project Member

Comment 28 by bugdroid1@chromium.org, Apr 6 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8e10a0af8da100dcb25c92afe395805ca01e577d

commit 8e10a0af8da100dcb25c92afe395805ca01e577d
Author: Francois Doray <fdoray@chromium.org>
Date: Fri Apr 06 20:36:45 2018

Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply (reland).

This CL first landed as https://chromium-review.googlesource.com/902191.
It was reverted because of a leak in PipeReaderTest.Cancel, which is
fixed by this CL by flushing the ScopedTaskEnvironment before the test
ends.

Before, there was always a leak when the RunTaskAndPostReply callback
posted by PostTaskAndReplyImpl::PostTaskAndReply didn't run.

With this CL, the "task" is never leaked and the "reply" is only
leaked if the execution environment is shutdown before the deletion
happens (e.g. MessageLoop deleted, TaskScheduler shutdown).

TBR=gab@chromium.org,sky@chromium.org,kbr@chromium.org

Bug:  807013 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: Ic28839da7aed25fc56a4abadc565b7f4a6e0b5c5
Reviewed-on: https://chromium-review.googlesource.com/997892
Reviewed-by: François Doray <fdoray@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Reviewed-by: Steven Bennetts <stevenjb@chromium.org>
Commit-Queue: François Doray <fdoray@chromium.org>
Cr-Commit-Position: refs/heads/master@{#548919}
[modify] https://crrev.com/8e10a0af8da100dcb25c92afe395805ca01e577d/base/threading/post_task_and_reply_impl.cc
[modify] https://crrev.com/8e10a0af8da100dcb25c92afe395805ca01e577d/base/threading/post_task_and_reply_impl.h
[modify] https://crrev.com/8e10a0af8da100dcb25c92afe395805ca01e577d/base/threading/post_task_and_reply_impl_unittest.cc
[modify] https://crrev.com/8e10a0af8da100dcb25c92afe395805ca01e577d/chrome/browser/resource_coordinator/tab_manager_unittest.cc
[modify] https://crrev.com/8e10a0af8da100dcb25c92afe395805ca01e577d/chromeos/dbus/pipe_reader_unittest.cc
[modify] https://crrev.com/8e10a0af8da100dcb25c92afe395805ca01e577d/components/history/core/browser/history_model_worker_unittest.cc
[modify] https://crrev.com/8e10a0af8da100dcb25c92afe395805ca01e577d/gpu/ipc/host/shader_disk_cache_unittest.cc

Comment 29 by gab@chromium.org, Apr 10 2018

Blockedon: 829122
Project Member

Comment 30 by bugdroid1@chromium.org, Apr 10 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0e2e01d06ec304d9a9a014d8a6054ce448c4583c

commit 0e2e01d06ec304d9a9a014d8a6054ce448c4583c
Author: Gabriel Charette <gab@chromium.org>
Date: Tue Apr 10 02:56:58 2018

Revert "Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply (reland)."

This reverts commit 8e10a0af8da100dcb25c92afe395805ca01e577d.

Reason for revert: surfaces flakes in browser_tests, see  crbug.com/829122 
(only reverting base/ changes, test fixes can stay in)

Original change's description:
> Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply (reland).
>
> This CL first landed as https://chromium-review.googlesource.com/902191.
> It was reverted because of a leak in PipeReaderTest.Cancel, which is
> fixed by this CL by flushing the ScopedTaskEnvironment before the test
> ends.
>
> Before, there was always a leak when the RunTaskAndPostReply callback
> posted by PostTaskAndReplyImpl::PostTaskAndReply didn't run.
>
> With this CL, the "task" is never leaked and the "reply" is only
> leaked if the execution environment is shutdown before the deletion
> happens (e.g. MessageLoop deleted, TaskScheduler shutdown).
>
> TBR=gab@chromium.org,sky@chromium.org,kbr@chromium.org
>
> Bug:  807013 
> Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
> Change-Id: Ic28839da7aed25fc56a4abadc565b7f4a6e0b5c5
> Reviewed-on: https://chromium-review.googlesource.com/997892
> Reviewed-by: François Doray <fdoray@chromium.org>
> Reviewed-by: Gabriel Charette <gab@chromium.org>
> Reviewed-by: Steven Bennetts <stevenjb@chromium.org>
> Commit-Queue: François Doray <fdoray@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#548919}

TBR=stevenjb@chromium.org,sky@chromium.org,gab@chromium.org,fdoray@chromium.org,kbr@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug:  807013 ,  829122 
Change-Id: Iff4ca23f3cf011b0587478f479a1676e618b741b
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Reviewed-on: https://chromium-review.googlesource.com/1003792
Commit-Queue: Gabriel Charette <gab@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Cr-Commit-Position: refs/heads/master@{#549398}
[modify] https://crrev.com/0e2e01d06ec304d9a9a014d8a6054ce448c4583c/base/threading/post_task_and_reply_impl.cc
[modify] https://crrev.com/0e2e01d06ec304d9a9a014d8a6054ce448c4583c/base/threading/post_task_and_reply_impl.h
[modify] https://crrev.com/0e2e01d06ec304d9a9a014d8a6054ce448c4583c/base/threading/post_task_and_reply_impl_unittest.cc

Project Member

Comment 31 by bugdroid1@chromium.org, Apr 12 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d3da450283253871dd1664290877990680bf2859

commit d3da450283253871dd1664290877990680bf2859
Author: Francois Doray <fdoray@chromium.org>
Date: Thu Apr 12 14:30:04 2018

Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply (reland #2).

This CL previously landed as
https://chromium-review.googlesource.com/997892. It was reverted because
of an access race in ChromeStorageImpl  https://crbug.com/829122 . This
CL fixes the issue by deleting both callbacks on the origin sequence
when they can't run.

Before, there was always leak when an internal task posted by
PostTaskAndReplyImpl didn't run. With this CL, PostTaskAndReplyImpl
notices when its internal callbacks aren't scheduled and schedules
deletion of all its internal state (including callbacks that didn't
run) on the origin sequence via a DeleteSoon(). Note that a deletion
scheduled via DeleteSoon() might not happen if Chrome is shutting
down.

R=gab@chromium.org

Bug:  807013 
Change-Id: I17abffd3e7afcc861c8c92dce30c14ce8629b125
Reviewed-on: https://chromium-review.googlesource.com/1006045
Commit-Queue: François Doray <fdoray@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Cr-Commit-Position: refs/heads/master@{#550189}
[modify] https://crrev.com/d3da450283253871dd1664290877990680bf2859/base/threading/post_task_and_reply_impl.cc
[modify] https://crrev.com/d3da450283253871dd1664290877990680bf2859/base/threading/post_task_and_reply_impl.h
[modify] https://crrev.com/d3da450283253871dd1664290877990680bf2859/base/threading/post_task_and_reply_impl_unittest.cc

I landed a CL that makes ownership clearer in PostTaskAndReply 4 days ago. Is it possible to know what's the status of the leak described in the original bug description now?
Project Member

Comment 33 by bugdroid1@chromium.org, Apr 17 2018

Labels: merge-merged-testbranch
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d3da450283253871dd1664290877990680bf2859

commit d3da450283253871dd1664290877990680bf2859
Author: Francois Doray <fdoray@chromium.org>
Date: Thu Apr 12 14:30:04 2018

Reduce leaks in PostTaskAndReplyImpl::PostTaskAndReply (reland #2).

This CL previously landed as
https://chromium-review.googlesource.com/997892. It was reverted because
of an access race in ChromeStorageImpl  https://crbug.com/829122 . This
CL fixes the issue by deleting both callbacks on the origin sequence
when they can't run.

Before, there was always leak when an internal task posted by
PostTaskAndReplyImpl didn't run. With this CL, PostTaskAndReplyImpl
notices when its internal callbacks aren't scheduled and schedules
deletion of all its internal state (including callbacks that didn't
run) on the origin sequence via a DeleteSoon(). Note that a deletion
scheduled via DeleteSoon() might not happen if Chrome is shutting
down.

R=gab@chromium.org

Bug:  807013 
Change-Id: I17abffd3e7afcc861c8c92dce30c14ce8629b125
Reviewed-on: https://chromium-review.googlesource.com/1006045
Commit-Queue: François Doray <fdoray@chromium.org>
Reviewed-by: Gabriel Charette <gab@chromium.org>
Cr-Commit-Position: refs/heads/master@{#550189}
[modify] https://crrev.com/d3da450283253871dd1664290877990680bf2859/base/threading/post_task_and_reply_impl.cc
[modify] https://crrev.com/d3da450283253871dd1664290877990680bf2859/base/threading/post_task_and_reply_impl.h
[modify] https://crrev.com/d3da450283253871dd1664290877990680bf2859/base/threading/post_task_and_reply_impl_unittest.cc

Owner: erikc...@chromium.org
erikchen@chromium.org: Is this memory leak still seen in the wild?
Cc: erikc...@chromium.org
Owner: etienneb@chromium.org
etienneb is working on a script that will automatically report bug frequency per signature.

Comment 36 by gab@chromium.org, May 9 2018

Components: -Internals>TaskScheduler Internals
This might be task related but it isn't a TaskScheduler bug per se (removing component relationship as part of triaging it).
Do we have data from the wild yet?
I'm not seeing any issues with this signature anymore, so I think this issue pretty much no longer occurs with noticeable frequency on Canary.
Status: Fixed (was: Started)

Sign in to add a comment