chrome_child.dll!content::RenderFrameImpl::UpdatePeakMemoryStats is taking a lot of CPU time |
||||||||
Issue description
Chrome Version : 61.0.3163.14
OS Version: 10.0
URLs (if applicable) :
Other browsers tested:
Add OK or FAIL after other browsers where you have tested this issue:
Safari 5:
Firefox 4.x:
IE 7/8/9:
What steps will reproduce the problem?
1. Open Inbox. A corp account will probably help exacerbate the problem.
2. Try typing.
What is the expected result?
Typing should not be janky.
What happens instead of that?
Typing is janky.
Please provide any additional information below. Attach a screenshot if
possible.
brucedawson@ helped debug this. It looks like Inbox results in many calls to CreateURLLoader: ~3500 calls in ~8.5 seconds. As of https://chromium-review.googlesource.com/c/569327, CreateURLLoader() calls ReportPeakMemoryStats(), and it looks like base::ProcessMetrics::GetMallocUsage is accounting for the majority of time in this function, since it's walking the heap.
UserAgentString: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.14 Safari/537.36
,
Aug 3 2017
To clarify the meaning of the screenshot for non-ETW-experts, it says that 3472 samples (representing ~3,468 ms) were recorded in UpdatePeakMemoryStats and its descendants, over an 8.48 s time period. So that's a lot.
,
Aug 3 2017
Thanks for identifying this!! I think we should probably call this only once for some reasonable interval?
,
Aug 3 2017
Given the power and performance implications of this it would seem appropriate to measure the CPU cost of this function before deciding how frequently we can run it without regressing. The cost is likely to vary depending on how much data is in the CRT heap, so run tests in a range of scenarios, with the heap well loaded up (I'm not sure how to do that). Note that if this code runs when the system is under memory pressure then walking the heap will require paging in memory - paging in all of the heap. If this code runs on multiple tabs simultaneously then it could convert a benign low-memory paged-out situation and turn it into a page-storm.
,
Aug 3 2017
FYI, this is a 5% performance regression on the speedometer VanillaJS bechmark. I'm wondering if the perf trybots didn't catch this, and if it is because the feature is controlled via a finch experiment.
,
Aug 3 2017
keishi@: Any thoughts on this?
,
Aug 3 2017
I looked at the trace to see how wide (in ms) the blocks of samples associated with UpdatePeakMemoryStats are. It looks like the are 70-77 ms, followed by a gap with no samples. This is not quite proof but is a very strong indicator that each call to UpdatePeakMemoryStats is taking 70-77 ms. The actual time will vary depending on the state of the heap (number of allocations, whether it has been paged out) and the system load. Note that this trace was taken on a very beefy machine.
,
Aug 3 2017
Barring any strong arguments against I think we need to revert crrev.com/c/569327. The cost (~10 calls per second to UpdatePeakMemoryStats when typing in Inbox chat window, each costing ~70 ms on a fast machine) is far too great and I don't think we want to experiment with mitigation changes in the release branch.
,
Aug 3 2017
Re #8: This particular case that gets called on UpdatePeakMemoryStats is behind a finch experiment though, and the experiment is only enabled on Canary/Dev channels. I'm not sure if we need to rush reverting it, but it certainly shouldn't be enabled long-term. There's another place where we do full heap walks in RenderFrameImpl::DidFinishLoad(), and that's already in the stable channel. Just removing the call to GetRendererMemoryMetrics yields ~2% improvement on the speedometer benchmark.
,
Aug 4 2017
,
Aug 4 2017
Yes this is controlled with a finch flag (and it can be enabled/disabled independently of the ResourceLoadScheduler experiment itself). Probably we should disable the flag on Windows.
,
Aug 4 2017
Issue 749405 has been merged into this issue.
,
Aug 4 2017
We have three users for GetRendererMemoryMetrics - keishi@ using it for general memory usage UMA - tasak@ using it for PurgeAndSuspend UMA - ksakamoto@ using it for RequestThrottling finch The three of use talked and decided to disable malloc usage reporting on Windows. The heap walk is also causing crashes and we think it is best to remove it.
,
Aug 4 2017
,
Aug 4 2017
That's odd that the heap walking is crashing. The heap is locked during the HeapWalk (which will, BTW, block other threads during GetMallocUsage) which should make heap walking a safe and robust operation. The crashes suggest that something bad is going on. Either the heap is corrupt or else some threads are doing allocations despite the heap being locked. Either possibility is worrisome.
,
Aug 4 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8cc2c3abcda7a8e3e33c1927ac07afbd218e01e7 commit 8cc2c3abcda7a8e3e33c1927ac07afbd218e01e7 Author: Keishi Hattori <keishi@chromium.org> Date: Fri Aug 04 20:55:30 2017 Remove support for GetMallocUsage on Windows Getting malloc usage on Windows requires a heap walk which is very slow and causes random crashes. This CL disables it. Bug: 751896 Change-Id: I71d9edd4d18deef1259f0bd3d78f351968bf650c Reviewed-on: https://chromium-review.googlesource.com/601810 Reviewed-by: Daniel Cheng <dcheng@chromium.org> Commit-Queue: Keishi Hattori <keishi@chromium.org> Cr-Commit-Position: refs/heads/master@{#492118} [modify] https://crrev.com/8cc2c3abcda7a8e3e33c1927ac07afbd218e01e7/base/process/process_metrics_win.cc
,
Aug 9 2017
[Bulk Edit] URGENT - PTAL. Your bug is labelled as M61 Stable ReleaseBlock, pls make sure to land the fix and get it merged into the release branch ASAP. Know that this issue shouldn't block the release? Remove the ReleaseBlock-Stable label. Thank you.
,
Aug 10 2017
merge should not be needed because the study is not enabled and won't be enabled on Beta and Stable channels. I think we can just close this bug with keishi's last CL, IIUC. keishi: is my understanding correct? Also, we could not answer brucedawson's question at #c15. If something seems wrong on the heap walk, would we file another dedicated bug for that? Since this bug is filed as ReleaseBlock-Stable, we should not keep it open for a long time.
,
Aug 10 2017
Yes I am not a Windows person but according to a Google search other people seem to have crashing issues when using heap walk too.
,
Aug 10 2017
https://cs.chromium.org/chromium/src/base/trace_event/malloc_dump_provider.cc?q=HeapUnlock&dr=C&l=164 This existing code does not check HeapLock result, and calls HeapUnlock later. If this code run on a different thread while GetMallocUsage runs, this could unlock the heap handle while the heap walk is running inside GetMallocUsage, and could cause a crash?
,
Aug 14 2017
Anyway, let me close this now. Feel free to reopen of file another bug for unresolved things.
,
Aug 14 2017
[Auto-generated comment by a script] We noticed that this issue is targeted for M-61; it appears the fix may have landed after branch point, meaning a merge might be required. Please confirm if a merge is required here - if so add Merge-Request-61 label, otherwise remove Merge-TBD label. Thanks.
,
Aug 14 2017
merge should not be needed (see comment 18)
,
Aug 14 2017
> This existing code does not check HeapLock result, and calls HeapUnlock later. Putting a CHECK on HeapLock seems reasonable to see if it is failing for some reason. However I can't think of any reason why it would fail. Rather, the only reason I can think of it for failing are if the heap was create with the HEAP_NO_SERIALIZATION flag, but the HeapLock documentation says that that gives undefined results, not a HeapLock failure. That said, I have a theory about why it is causing crashes. The heap's main data structures are protected by the lock and are safe to access while holding the heap lock. However the heap also has a number of look-aside lists that are designed to be locklessly accessed. If HeapWalk is walking all memory that is allocated from the heap and if HeapAlloc and HeapFree are able to continue running on some paths on other threads due to these lockless lists then it seems unavoidable that HeapWalk would be either unreliable or incomplete or both. I have confirmed that alloc/free continue when the heap is locked (as long as the LFH is used) but I have not proved that this leads to HeapWalk crashes. https://twitter.com/BruceDawson0xB/status/897154347800289281 |
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by dcheng@chromium.org
, Aug 3 2017