New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 780735 link

Starred by 6 users

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: 2017-11-13
OS: Windows
Pri: 1
Type: Bug
Hotlist-MemoryInfra



Sign in to add a comment

Task Manager makes a lot of expensive MallocDumpProvider::OnMemoryDump

Project Member Reported by sebmarchand@chromium.org, Nov 2 2017

Issue description

I realized that on Windows opening the Task Manager result in a lot of expensive calls to MallocDumpProvider::OnMemoryDump (see attached screenshot).

Here's the stack trace of what initiates the OnMemoryDump call:

base!base::trace_event::MemoryDumpManager::SetupNextMemoryDump+0x1f4 [base\trace_event\memory_dump_manager.cc @ 610] 
base!base::trace_event::MemoryDumpManager::CreateProcessDump+0x528 [base\trace_event\memory_dump_manager.cc @ 568] 
resource_coordinator_cpp!memory_instrumentation::ClientProcessImpl::RequestChromeMemoryDump+0x18a [services\resource_coordinator\public\cpp\memory_instrumentation\client_process_impl.cc @ 78] 
resource_coordinator_cpp!memory_instrumentation::mojom::ClientProcessStubDispatch::AcceptWithResponder+0x35c [out\debug\gen\services\resource_coordinator\public\interfaces\memory_instrumentation\memory_instrumentation.mojom.cc @ 986] 
resource_coordinator_cpp!memory_instrumentation::mojom::ClientProcessStub<mojo::RawPtrImplRefTraits<memory_instrumentation::mojom::ClientProcess> >::AcceptWithResponder+0xb3 [out\debug\gen\services\resource_coordinator\public\interfaces\memory_instrumentation\memory_instrumentation.mojom.h @ 319] 
bindings!mojo::InterfaceEndpointClient::HandleValidatedMessage+0x401 [mojo\public\cpp\bindings\lib\interface_endpoint_client.cc @ 393] 
bindings!mojo::InterfaceEndpointClient::HandleIncomingMessageThunk::Accept+0x20 [mojo\public\cpp\bindings\lib\interface_endpoint_client.cc @ 132] 
bindings!mojo::FilterChain::Accept+0x1e9 [mojo\public\cpp\bindings\lib\filter_chain.cc @ 40] 
bindings!mojo::InterfaceEndpointClient::HandleIncomingMessage+0x104 [mojo\public\cpp\bindings\lib\interface_endpoint_client.cc @ 305] 
bindings!mojo::internal::MultiplexRouter::ProcessIncomingMessage+0x775 [mojo\public\cpp\bindings\lib\multiplex_router.cc @ 879] 
bindings!mojo::internal::MultiplexRouter::Accept+0x2dc [mojo\public\cpp\bindings\lib\multiplex_router.cc @ 604] 
bindings!mojo::FilterChain::Accept+0x1e9 [mojo\public\cpp\bindings\lib\filter_chain.cc @ 40] 
bindings!mojo::Connector::ReadSingleMessage+0x3f7 [mojo\public\cpp\bindings\lib\connector.cc @ 439] 
bindings!mojo::Connector::ReadAllAvailableMessages+0x7d [mojo\public\cpp\bindings\lib\connector.cc @ 469] 
bindings!mojo::Connector::OnHandleReadyInternal+0x123 [mojo\public\cpp\bindings\lib\connector.cc @ 376] 
bindings!mojo::Connector::OnWatcherHandleReady+0x1d [mojo\public\cpp\bindings\lib\connector.cc @ 352] 
bindings!base::internal::FunctorTraits<void (mojo::Connector::*)(unsigned int) __attribute__((thiscall)),void>::Invoke<mojo::Connector *,unsigned int>+0x42 [base\bind_internal.h @ 194] 
bindings!base::internal::InvokeHelper<0,void>::MakeItSo<void (mojo::Connector::*const &)(unsigned int) __attribute__((thiscall)),mojo::Connector *,unsigned int>+0x69 [base\bind_internal.h @ 277] 
bindings!base::internal::Invoker<base::internal::BindState<void (mojo::Connector::*)(unsigned int) __attribute__((thiscall)),base::internal::UnretainedWrapper<mojo::Connector> >,void (unsigned int)>::RunImpl<void (mojo::Connector::*const &)(unsigned int) __attribute__((thiscall)),const std::tuple<base::internal::UnretainedWrapper<mojo::Connector> > &,0>+0x6f [base\bind_internal.h @ 349] 
bindings!base::internal::Invoker<base::internal::BindState<void (mojo::Connector::*)(unsigned int) __attribute__((thiscall)),base::internal::UnretainedWrapper<mojo::Connector> >,void (unsigned int)>::Run+0x53 [base\bind_internal.h @ 331] 
mojo_public_system_cpp!base::RepeatingCallback<void (unsigned int)>::Run+0x56 [base\callback.h @ 92] 
mojo_public_system_cpp!mojo::SimpleWatcher::DiscardReadyState+0x2a [mojo\public\cpp\system\simple_watcher.h @ 194] 
mojo_public_system_cpp!base::internal::FunctorTraits<void (*)(const base::RepeatingCallback<void (unsigned int)> &, unsigned int, const mojo::HandleSignalsState &),void>::Invoke<const base::RepeatingCallback<void (unsigned int)> &,unsigned int,const mojo::HandleSignalsState &>+0x66 [base\bind_internal.h @ 149] 
mojo_public_system_cpp!base::internal::InvokeHelper<0,void>::MakeItSo<void (*const &)(const base::RepeatingCallback<void (unsigned int)> &, unsigned int, const mojo::HandleSignalsState &),const base::RepeatingCallback<void (unsigned int)> &,unsigned int,const mojo::HandleSignalsState &>+0x74 [base\bind_internal.h @ 277] 
mojo_public_system_cpp!base::internal::Invoker<base::internal::BindState<void (*)(const base::RepeatingCallback<void (unsigned int)> &, unsigned int, const mojo::HandleSignalsState &),base::RepeatingCallback<void (unsigned int)> >,void (unsigned int, const mojo::HandleSignalsState &)>::RunImpl<void (*const &)(const base::RepeatingCallback<void (unsigned int)> &, unsigned int, const mojo::HandleSignalsState &),const std::tuple<base::RepeatingCallback<void (unsigned int)> > &,0>+0x82 [base\bind_internal.h @ 349] 
mojo_public_system_cpp!base::internal::Invoker<base::internal::BindState<void (*)(const base::RepeatingCallback<void (unsigned int)> &, unsigned int, const mojo::HandleSignalsState &),base::RepeatingCallback<void (unsigned int)> >,void (unsigned int, const mojo::HandleSignalsState &)>::Run+0x6e [base\bind_internal.h @ 331] 
mojo_public_system_cpp!base::RepeatingCallback<void (unsigned int, const mojo::HandleSignalsState &)>::Run+0x72 [base\callback.h @ 92] 
mojo_public_system_cpp!mojo::SimpleWatcher::OnHandleReady+0x1da [mojo\public\cpp\system\simple_watcher.cc @ 276] 
mojo_public_system_cpp!base::internal::FunctorTraits<void (mojo::SimpleWatcher::*)(int, unsigned int, const mojo::HandleSignalsState &) __attribute__((thiscall)),void>::Invoke<const base::WeakPtr<mojo::SimpleWatcher> &,const int &,const unsigned int &,const mojo::HandleSignalsState &>+0x7f [base\bind_internal.h @ 194] 
mojo_public_system_cpp!base::internal::InvokeHelper<1,void>::MakeItSo<void (mojo::SimpleWatcher::*const &)(int, unsigned int, const mojo::HandleSignalsState &) __attribute__((thiscall)),const base::WeakPtr<mojo::SimpleWatcher> &,const int &,const unsigned int &,const mojo::HandleSignalsState &>+0xb8 [base\bind_internal.h @ 300] 
mojo_public_system_cpp!base::internal::Invoker<base::internal::BindState<void (mojo::SimpleWatcher::*)(int, unsigned int, const mojo::HandleSignalsState &) __attribute__((thiscall)),base::WeakPtr<mojo::SimpleWatcher>,int,unsigned int,mojo::HandleSignalsState>,void ()>::RunImpl<void (mojo::SimpleWatcher::*const &)(int, unsigned int, const mojo::HandleSignalsState &) __attribute__((thiscall)),const std::tuple<base::WeakPtr<mojo::SimpleWatcher>,int,unsigned int,mojo::HandleSignalsState> &,0,1,2,3>+0xb9 [base\bind_internal.h @ 349] 
mojo_public_system_cpp!base::internal::Invoker<base::internal::BindState<void (mojo::SimpleWatcher::*)(int, unsigned int, const mojo::HandleSignalsState &) __attribute__((thiscall)),base::WeakPtr<mojo::SimpleWatcher>,int,unsigned int,mojo::HandleSignalsState>,void ()>::Run+0x3e [base\bind_internal.h @ 331] 
base!base::OnceCallback<void ()>::Run+0x4e [base\callback.h @ 64] 
base!base::debug::TaskAnnotator::RunTask+0x3f5 [base\debug\task_annotator.cc @ 57] 
base!base::internal::IncomingTaskQueue::RunTask+0xfd [base\message_loop\incoming_task_queue.cc @ 130] 
base!base::MessageLoop::RunTask+0x3ed [base\message_loop\message_loop.cc @ 394] 
base!base::MessageLoop::DeferOrRunPendingTask+0x51 [base\message_loop\message_loop.cc @ 409] 
base!base::MessageLoop::DoWork+0x1c6 [base\message_loop\message_loop.cc @ 450] 
base!base::MessagePumpForUI::DoRunLoop+0x3e [base\message_loop\message_pump_win.cc @ 173] 
base!base::MessagePumpWin::Run+0xc1 [base\message_loop\message_pump_win.cc @ 58] 
base!base::MessageLoop::Run+0x22f [base\message_loop\message_loop.cc @ 345] 
base!base::RunLoop::Run+0x219 [base\run_loop.cc @ 114] 
chrome!ChromeBrowserMainParts::MainMessageLoopRun+0x18e [chrome\browser\chrome_browser_main.cc @ 1917] 
content!content::BrowserMainLoop::RunMainMessageLoopParts+0x1df [content\browser\browser_main_loop.cc @ 1205] 
content!content::BrowserMainRunnerImpl::Run+0x19a [content\browser\browser_main_runner.cc @ 141] 
content!content::BrowserMain+0xca [content\browser\browser_main.cc @ 46] 
content!content::RunNamedProcessTypeMain+0xd1 [content\app\content_main_runner.cc @ 427] 
content!content::ContentMainRunnerImpl::Run+0x292 [content\app\content_main_runner.cc @ 705] 
content!content::ContentServiceManagerMainDelegate::RunEmbedderProcess+0x21 [content\app\content_service_manager_main_delegate.cc @ 51] 
embedder!service_manager::Main+0x5fe [services\service_manager\embedder\main.cc @ 456] 
content!content::ContentMain+0x4a [content\app\content_main.cc @ 19] 
chrome!ChromeMain+0x1f5 [chrome\app\chrome_main.cc @ 123] 
chrome_exe!MainDllLoader::Launch+0x49a [chrome\app\main_dll_loader_win.cc @ 199] 
chrome_exe!wWinMain+0x553 [chrome\app\chrome_exe_main_win.cc @ 230] 
chrome_exe!invoke_main+0x1e [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 123] 
chrome_exe!__scrt_common_main_seh+0x150 [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 283] 
chrome_exe!__scrt_common_main+0xd [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 326] 
chrome_exe!wWinMainCRTStartup+0x8 [f:\dd\vctools\crt\vcstartup\src\startup\exe_wwinmain.cpp @ 17] 
KERNEL32!BaseThreadInitThunk+0x24
ntdll_776f0000!__RtlUserThreadStart+0x2f
ntdll_776f0000!_RtlUserThreadStart+0x1b


I didn't had time to look at this more closely. 

 
image.png
110 KB View Download

Comment 1 by w...@chromium.org, Nov 2 2017

That looks most likely to be Task Manager requesting details of JavaScript
(and maybe GPU?) memory from the renderer itself. The stack doesn't really
show much; it's just a Mojo request being processed, I think?

Have you tried disabling most of the columns in the manager, so it won't
ask for these details?

Comment 2 by w...@chromium.org, Nov 2 2017

As per primiano@'s comment on IRC, calling the MemoryDumpManager is working-as-intended, but it shouldn't be doing expensive work, just gathering some high-level stats from each process, by the sounds of it.
Cc: ajwong@chromium.org brucedaw...@chromium.org
Components: Internals>Instrumentation>Memory
So I looked at the code and it seems that in MallocDumpProvider::OnMemoryDump we assume that walking WinHeap is as cheap as mallinfo() on Linux/Mac.
My understanding from this bug is that this is not true, so WinHeapMemoryDumpImpl() is expensive when we assume it shouldn't be.

I'm not a windows expert at all. Is there a way to get an equivalent of mallinfo() on windows, like totals for the winheap allocations, without having to walk them one by one?
Today the code that we use is this:
https://cs.chromium.org/chromium/src/base/trace_event/malloc_dump_provider.cc?rcl=137f37a0077cd922da37338d79e51d65a428ff3a&l=156

Ccing a bunch of people who might know the answer.
Labels: -Pri-3 Pri-2
Issue 728552 has been merged into this issue.
Labels: -Pri-2 Pri-1
Issue 752717 has been merged into this issue.
The lesson i'm learning from the other bugs is that other than being slow, this heap walk function seems also unreliable in terms of stability.
We should really figure out some alternative.
Unfortunately I am not aware of any equivalent to mallinfo on Windows. I don't think the Windows heap tracks this information and therefore it can't provide it.

And, unfortunately, traversing the heap can be expensive. If the heap is entirely in memory then traversing it is probably not *too* bad, but if just a couple of metadata pages have been paged out then it can take tens or hundreds of ms, or potentially arbitrarily long.

I still don't understand why walking the heap causes stability problems, but that has been our experience.

It is faster and more reliable to iterate over the entire process address space, finding all committed memory, than to traverse just the heap. Unfortunately this gives us quite different information.

Sorry, no good news here.

Comment 10 by lfg@chromium.org, Nov 6 2017

Labels: ReleaseBlock-Dev
Owner: erikc...@chromium.org
Status: Assigned (was: Untriaged)
I'm having the same issue -- the browser is unusable while the task manager is opened (see attached screenshot). There's almost 1 second of work going around the heap, and the data is refreshed every second.

I'm marking this ReleaseBlock-Dev, we should revert the change that introduced this (d5b5df6a535bd73a34cb2149ea090579e744493c).

erikchen@, can you revert the change above?

taskmanager.png
182 KB View Download

Comment 11 by lfg@chromium.org, Nov 6 2017

Cc: nick@chromium.org

Comment 12 by nick@chromium.org, Nov 6 2017

Agree with lfg's revert conclusion. That this is spending seconds with the heap locked is not acceptable.
FYI, chatted with Primiano offline. This stat isn't required in task manager and we can disable.  Give us a a day?
Cc: ssid@chromium.org
Okay this seems to invasive. I think we have no other choice than losing malloc data in UMA/UKM.
TaskManager doesn't worry me too much, what worries me more at this point is the UMA ping (which does the same)

Can somebody put the code in #elif defined(OS_WIN) in  here [1] behind a:
-----
// This is too expensive on Windows,  crbug.com/780735  .
if (args.level_of_detail == DETAILED) {
  ..
}
-----

I don't have a corp device right now.
We should also merge it back.

[1] 
https://cs.chromium.org/chromium/src/base/trace_event/malloc_dump_provider.cc?rcl=9ddecf04a93bdfdc24464e460b61515a3e349052&l=233
I separately just noticed this on my Z840 - browser process CPU usage was high with task-manager open (per-processor scaling instead of per-machine scaling of CPU usage made it more obvoius, thanks wez@). I grabbed an ETW trace just to double-check what I was seeing. It shows ~58 ms per call. Since this is on a fast machine on a just-started browser this is presumably the best case - long enough to cause lots of hitches, especially due to the heap being locked. Pity.

I'm assuming ajwong@ or someone will disable this? If not then I can.

Sorry, I have  I idea with all this tech stuff :D the issue of crashing
chrome when up or downloading remains (except when upload via drag&drop) ,
I did all the updates etc. Is there a solution yet? ^^

Am 07.11.2017 02:07 schrieb "brucedaw… via monorail" <
monorail+v2.1410982004@chromium.org>:

Comment 17 by lfg@chromium.org, Nov 7 2017

Labels: -ReleaseBlock-Dev ReleaseBlock-Beta
Yes, I think it's fine if we need a few days to get a better workaround.

To add some more context: In my capture where I was seeing almost 1 sec of work walking the heap, the browser process was using just over 900MBs of memory (this was about a 1 week old instance), it's not as bad in a fresh new instance, where the browser is under 30MBs of memory.

However, we should still try to address this relatively quickly, as it can be quite painful in some cases. I'll change it to ReleaseBlock-Beta so we don't block the rollout of this week's dev channel over this.

Project Member

Comment 18 by sheriffbot@chromium.org, Nov 7 2017

Cc: lfg@chromium.org
This issue is marked as a release blocker with no milestone associated. Please add an appropriate milestone.

All release blocking issues should have milestones associated to it, so that the issue can tracked and the fixes can be pushed promptly.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 19 by lfg@chromium.org, Nov 7 2017

Labels: M-64
Status: Started (was: Assigned)
Primiano said: "We should also merge it back.", presumably to M-63, so I think the ReleaseBlock should be tagged as M-63, to make sure we don't accidentally ship this to stable in a few weeks.
Labels: -M-64 M-63
Hm..we can merge back to M-63, but the code in question has been around for quite some time, so any code that uses memory_instrumentation for UMAs [basically, ProcessMemoryMetricsEmitter, which goes back several releases] will have this issue.
We already cut M63 Beta RC for release tomorrow. Is this M63 Beta blocker or Stable blocker?
I believe it should be M63 stable blocker, not M63 beta blocker.
Labels: -ReleaseBlock-Beta ReleaseBlock-Stable
Thank you  brucedawson@. Moving to "RBS".
Project Member

Comment 26 by bugdroid1@chromium.org, Nov 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/99e816790db36cf7b225bf726f1ecb2d7e41a177

commit 99e816790db36cf7b225bf726f1ecb2d7e41a177
Author: Erik Chen <erikchen@chromium.org>
Date: Wed Nov 08 23:15:07 2017

Don't grab malloc stats for memory_instrumentation service on Windows.

Only do so for DETAILED dumps. The call is expensive [~70ms per process].

Bug:  780735 
Change-Id: I757c7e4359afe379561788420c9e257c2275174d
Reviewed-on: https://chromium-review.googlesource.com/757296
Commit-Queue: Erik Chen <erikchen@chromium.org>
Reviewed-by: Alexei Svitkine <asvitkine@chromium.org>
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Cr-Commit-Position: refs/heads/master@{#514993}
[modify] https://crrev.com/99e816790db36cf7b225bf726f1ecb2d7e41a177/base/trace_event/malloc_dump_provider.cc
[modify] https://crrev.com/99e816790db36cf7b225bf726f1ecb2d7e41a177/chrome/browser/metrics/process_memory_metrics_emitter.cc
[modify] https://crrev.com/99e816790db36cf7b225bf726f1ecb2d7e41a177/chrome/browser/metrics/process_memory_metrics_emitter_browsertest.cc
[modify] https://crrev.com/99e816790db36cf7b225bf726f1ecb2d7e41a177/chrome/browser/metrics/process_memory_metrics_emitter_unittest.cc

Labels: Merge-Request-63
Status: Fixed (was: Started)
Before we approve merge to M63, could you pls confirm well baked/verified in Canary, having enough automation tests coverage and safe to merge? Thank you.
We'll let this bake 1-2 more days in Canary. It should be safe [just removing some code], and the corresponding tests have been updated in the same CL.
NextAction: 2017-11-13
Thank you Erik. Please update the bug with canary result on Monday morning.
Project Member

Comment 31 by sheriffbot@chromium.org, Nov 9 2017

Labels: -Merge-Request-63 Merge-Review-63 Hotlist-Merge-Review
This bug requires manual review: M63 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), cmasso@(iOS), gkihumba@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
The NextAction date has arrived: 2017-11-13
Haven't seen any issues, let's merge it.
Labels: -Merge-Review-63 Merge-Approved-63
Approving merge to M63 branch 3239 based on comment #33. Please merge today before 4:00 PM PT so we can take it in for this week Beta release. Thank you.
Project Member

Comment 35 by bugdroid1@chromium.org, Nov 13 2017

Labels: -merge-approved-63 merge-merged-3239
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e1703d44558465d8cb2610fbd39c4ccd64c0720d

commit e1703d44558465d8cb2610fbd39c4ccd64c0720d
Author: erikchen <erikchen@chromium.org>
Date: Mon Nov 13 18:41:45 2017

[merge to 3239] Don't grab malloc stats for memory_instrumentation service on Windows.

Only do so for DETAILED dumps. The call is expensive [~70ms per process].

TBR=erikchen@chromium.org

(cherry picked from commit 99e816790db36cf7b225bf726f1ecb2d7e41a177)

Bug:  780735 
Change-Id: I757c7e4359afe379561788420c9e257c2275174d
Reviewed-on: https://chromium-review.googlesource.com/757296
Commit-Queue: Erik Chen <erikchen@chromium.org>
Reviewed-by: Alexei Svitkine <asvitkine@chromium.org>
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#514993}
Reviewed-on: https://chromium-review.googlesource.com/766478
Reviewed-by: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/branch-heads/3239@{#463}
Cr-Branched-From: adb61db19020ed8ecee5e91b1a0ea4c924ae2988-refs/heads/master@{#508578}
[modify] https://crrev.com/e1703d44558465d8cb2610fbd39c4ccd64c0720d/base/trace_event/malloc_dump_provider.cc
[modify] https://crrev.com/e1703d44558465d8cb2610fbd39c4ccd64c0720d/chrome/browser/metrics/process_memory_metrics_emitter.cc
[modify] https://crrev.com/e1703d44558465d8cb2610fbd39c4ccd64c0720d/chrome/browser/metrics/process_memory_metrics_emitter_browsertest.cc
[modify] https://crrev.com/e1703d44558465d8cb2610fbd39c4ccd64c0720d/chrome/browser/metrics/process_memory_metrics_emitter_unittest.cc

Sign in to add a comment