URLIndexPrivateData::RestorePrivateData consuming ~200 MB during Chrome startup
Issue description
On one Chrome user's machine, URLIndexPrivateData::RestorePrivateData allocates ~200 MB of heap memory during Chrome startup. This seems excessive and needs investigation to determine whether there is a bug or whether the retention and processing of history data needs caps.
The user reported that when opening Chrome the browser process is "eating all the RAM". An ETW trace shows ~200 MB of memory allocated by URLIndexPrivateData::RestorePrivateData in the first twenty seconds or so. The peak total heap memory for the browser process is over 500 MB, but ~220 MB is freed after startup completes.
Since this bug is presumed to be triggered by the user's profile, the user has zipped up their user data directory to ensure that we don't lose the repro.
The user shared an ETW heap trace. It was generated by starting ETW heap tracing with UIforETW, launching Chrome, and then ignoring the "Restore pages" option. One might expect this to result in a single renderer process, but there were actually six renderer processes and 22 extension processes:
Chrome PIDs by process type:
C:\Program Files (x86)\Google\Chrome\Application\chrome.exe (27036)
browser : 27036
crashpad-handler : 27480
extension : 4784 4952 11384 11788 11792 12168 12376 12848 13220 19624 20900 23228 23800 23956 24224 26752 27988 28724 29152 29328 29372 29472
gpu-process : 22980
renderer : 15328 17404 22108 25792 28084 29232
watcher : 21528
The ETW trace is huge and cumbersome to work with, so not all details are available, but it clearly shows 1.5 million outstanding allocations totaling 198 MB, all from URLIndexPrivateData::RestorePrivateData and its descendants.
I'm assigning a component and CCing a developer based on crbug.com/715852 which seems possibly relevant. In particular, do we
It's not clear what data would be useful for analyzing this, but this output from chrome://histograms seems relevant. On my machine the same histogram shows 1774, which is much smaller:
Histogram: History.InMemoryURLHistoryItems recorded 1 samples, mean = 43673.0 (flags = 0x41)
0 ...
36699 ------------------------------------------------------------------------O (1 = 100.0%) {0.0%}
48336 .
,
Jul 31 2017
mpearson: I can dig into this more tomorrow, but off the top of your head is that value for History.InMemoryURLHistoryItems (43673) surprising? Is the history cache allowed to grow without bounds or is it supposed to be limited to some size? For another data point, in two actively-used profiles on the machine I'm currently on, I see 777 and 1347 for History.InMemoryURLHistoryItems.
,
Aug 1 2017
Yes, having 43k items eligible for being in the HistoryQuick provider in-memory data structure is surprising. The cache currently is allowed to grow without bounds; this hasn't generally caused problems in the past. BTW, we're experimenting with limiting how much it can grow on low-memory devices. That is more a question of whether it should take up 400k or 100k, not a question of whether it should take up 200m!

I've seen this reproduced three (or so) times by users in the past. In those situations, it was always the case that a poorly-designed web site was auto-navigating, pushing new state into the history system, and then the history system was indexing it, in effect spamming the history system.

I suggest looking through the user's History database (if you have permission) to try to find a common host / URL prefix. I predict there will be one. That'll help. Once one is found, I've usually reported these things to the navigation/history folks. Sometimes it turns out they're things that shouldn't be recorded in history but inadvertently were.
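For anyone doing that search, here is a minimal sketch of counting unique URLs per host. It assumes the profile's History file is a SQLite database containing a `urls` table with a `url` column, and that you open a copy of the file rather than the live one. The function name `top_hosts` is made up for illustration.

```python
import sqlite3
from collections import Counter
from urllib.parse import urlsplit


def top_hosts(conn, limit=10):
    """Return the `limit` hosts with the most unique URLs in the `urls` table.

    Assumes a table named `urls` with a text column `url` (an assumption
    about the History schema, not something stated in this bug).
    """
    counts = Counter()
    for (url,) in conn.execute("SELECT url FROM urls"):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            # Malformed URL; bucket it rather than abort the scan.
            host = None
        counts[host or "(no host)"] += 1
    return counts.most_common(limit)
```

Usage would be something like `top_hosts(sqlite3.connect("History-copy"))`, where `History-copy` is a hypothetical path to the copied file. A host that dwarfs all the others is the likely spammer.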
,
Aug 1 2017
Shared history file in zip format via gDrive.
,
Aug 4 2017
gfm@, can you please remember to share the history file again? As I mentioned over chat, the zip file appears to be empty.
,
Aug 4 2017
Just uploaded it manually to Drive. I think it failed the first time because I dragged and dropped it into the local folder that should have synced via gdrive.
,
Aug 4 2017
Thanks for the history file! Yeah, the history file is spammed. There are two particular related sites, each of which has 100+k unique URLs in the user's history. Aside from those two sites, the user's most visited site has only 4k unique entries in history. These two sites account for 92% of the user's unique URLs visited. [more coming in a future update after I get permission from the reporter]
,
Aug 4 2017
The two related sites that seem to be spammed are
* appengine.google.com
* one I'll simply call project.com

If I look at the 100+k unique URLs for appengine.google.com, 99.9% are of the form:
https://appengine.google.com/_ah/conflogin?continue=https%3A%2F%2Fproject.com%2F&pli=1&auth=[omitted]&authuser=0

If I look at the 100+k unique URLs for project.com, 99.9% are of the form:
https://project.com/_ah/conflogin?state=[omitted]

This makes me think there's some sort of infinite redirection happening. Indeed, if I try to go to project.com (while I'm logged into a Google account), the bottom of my omnibox flashes infinitely between
* waiting for accounts.google.com
* waiting for project.com
* waiting for appengine.google.com

Also, if I look at the referrers (technically the field |from_visit|) in the history file, I see a series of redirections like this. Judging from the notifications at the bottom of my screen when I visit project.com, I think this history file is legitimately reflecting the actual history. It's just that a lot of these visits are incredibly transient. They're getting into the in-memory URL data structure simply because they were recent.

First conclusion:
* the owner of project.com needs to look into its configuration. (gfm@, contact me for the real name of the site)

Second conclusion:
* I need to think about whether redirects or transient visits can and should be dropped from the in-memory structures.
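For reference, following the |from_visit| links can be sketched roughly as below. This is only an illustration: it assumes the visits have already been loaded into a dict mapping visit id to a `(url, from_visit)` pair, with `from_visit` set to 0 when there is no referrer. The function name `redirect_chain` and that in-memory layout are made up for the example.

```python
def redirect_chain(visits, visit_id, max_hops=1000):
    """Walk from_visit links backwards from `visit_id`.

    `visits` maps visit id -> (url, from_visit). Returns the URLs in the
    chain, oldest first. Stops at a missing id, a cycle, or `max_hops`.
    """
    chain = []
    seen = set()
    while visit_id in visits and visit_id not in seen and len(chain) < max_hops:
        seen.add(visit_id)
        url, from_visit = visits[visit_id]
        chain.append(url)
        visit_id = from_visit
    chain.reverse()
    return chain
```

A long chain that alternates between the same two or three hosts is the signature of the redirect loop described above.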
,
Aug 4 2017
Issue 750717 has been merged into this issue.
,
Aug 4 2017
Could we make a cap on the amount of history per domain name to avoid this kind of spamming (intended or unintended)?
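A rough sketch of what such a per-domain cap could look like, purely to make the idea concrete. It assumes history rows are available as `(url, last_visit_time)` pairs and keeps only the most recent entries for each host; the name `cap_per_host` and the choice of "most recent wins" are assumptions, not anything Chrome implements.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def cap_per_host(rows, max_per_host):
    """Keep at most `max_per_host` URLs per host, preferring recent visits.

    `rows` is an iterable of (url, last_visit_time) pairs.
    """
    by_host = defaultdict(list)
    for url, last_visit_time in rows:
        by_host[urlsplit(url).hostname].append((last_visit_time, url))
    kept = []
    for entries in by_host.values():
        entries.sort(reverse=True)  # newest first
        kept.extend(url for _, url in entries[:max_per_host])
    return kept
```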
,
Aug 4 2017
Issue reported to our internal team that develops that site. b/64399457
,
Aug 4 2017
In addition to the history effects (visible in more places than just the omnibox), there are also network bandwidth use + power draw problems associated with infinite redirects. So ideally we solve this not just by putting in safety clamps (which we probably do need, both globally and per-domain), but by stopping the cycle reasonably quickly. Normally we have redirect loop detection that prevents this kind of thing. I wonder if the right answer is to expand that protection to catch whatever is going on here.
,
Aug 5 2017
Another thing that will help with the history side of this is issue 638198, which will prevent most redirects from being entered into the history in the first place. That said, it makes sense to have a cap on the max size in addition. Regarding redirect loop detection, it sounds like a reasonable intervention to do something simple like limit the allowed number of consecutive page loads that have no user interaction and last <5 seconds to something in the 5-100 range (number TBD based off UMA). This should be pretty easy to implement on top of the logic in issue 638198 as well.
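To make the proposed intervention concrete, here is a minimal sketch of the counting logic. The class name, the callback name, and the default limit of 20 are placeholders; the comment above only says the limit would be somewhere in the 5-100 range, to be chosen from UMA data.

```python
class TransientLoadLimiter:
    """Counts consecutive short page loads that had no user interaction."""

    def __init__(self, max_consecutive=20, min_duration_s=5.0):
        self.max_consecutive = max_consecutive  # placeholder, TBD from UMA
        self.min_duration_s = min_duration_s
        self.consecutive = 0

    def on_page_load_finished(self, duration_s, had_user_interaction):
        """Record one page load.

        Returns True once the run of short, interaction-free loads has
        reached the limit, meaning further automatic navigations should
        be stopped.
        """
        if had_user_interaction or duration_s >= self.min_duration_s:
            self.consecutive = 0
        else:
            self.consecutive += 1
        return self.consecutive >= self.max_consecutive
```

Any user interaction or any page that stays up for at least five seconds resets the count, so ordinary browsing should never trip the limit.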
,
Aug 9 2017
re nednguyen comment #11: we're experimenting with a cap on the amount of history to load into memory. See bug 715852. It's not a per-domain cap, instead it's based on how important the page appears to be to the user based on number of times previously selected from the omnibox, last visit time, and number of visits.
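As a simplified illustration of an importance-based cap (not the actual logic from bug 715852, which isn't described here), one could rank items by those three signals and keep the top N. The tuple ordering below, and using a `typed_count` field as a stand-in for "times selected from the omnibox", are assumptions made for the sketch.

```python
import heapq
from typing import NamedTuple


class HistoryItem(NamedTuple):
    url: str
    typed_count: int      # stand-in for omnibox selections (assumption)
    visit_count: int
    last_visit_time: int  # larger means more recent


def select_for_index(items, cap):
    """Return up to `cap` items, most important first.

    Importance is a plain lexicographic ordering on
    (typed_count, visit_count, last_visit_time); a real scoring
    function would likely blend the signals instead.
    """
    return heapq.nlargest(
        cap,
        items,
        key=lambda i: (i.typed_count, i.visit_count, i.last_visit_time),
    )
```

With a ranking like this, 100+k transient redirect URLs that were each visited once and never typed would fall to the bottom and be dropped first.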
,
Aug 9 2017
re ojan@ / japhet@ comment #14: do you want to file a new bug for the comment about redirect loop detection?
,
Dec 19 2017
We now have a cap on how much memory the omnibox system can use on Android (per bug 715852). But the root cause of this issue still remains: the history system can be spammed. I am not the right person to own this part of this bug. Assigning to ojan@, as he's clearly aware of this area (per his comment #14).
,
Dec 19 2017
japhet's work to be more intelligent about not putting redirects into history is still on-going, but isn't targeted specifically at memory savings. Issue 638198 tracks that work and I think we can otherwise ignore it here. The question is whether there should be other caps, e.g. total number of history entries per tab. That's more of a browser product question than a platform one, so passing the ownership hot potato to erikchen.
,
Dec 20 2017
I was just chatting with mpearson and pkasting about omnibox performance. This seems related.
1) We need to prevent history from being spammed. Is there a tracking bug for this? This should be high priority.
2) Given that some users have spammed histories, or otherwise pathological history DBs, we probably need some type of workaround [at least temporarily] for the HistoryQuickProvider and co. to deal with excessively large data sets.
mpearson - Any objections to extending the limitation from issue 715852 to desktop?
,
Dec 20 2017
Heh erikchen@, I guess you didn't overhear all of my conversation with pkasting@. For part of it he tried to convince me to raise the caps on desktop so that more items get into the HistoryQuick provider's index and are therefore searchable. :-) You're asking for fewer.

On Android, we set a cap of 1,000 URLs on non-low-memory devices.
https://chromium-review.googlesource.com/c/chromium/src/+/762343

On Windows such a cap would affect ~15% of installs, certainly high enough that I'd want to run an eval for it.
https://uma.googleplex.com/p/chrome/histograms/?endDate=20171218&dayCount=7&histograms=History.InMemoryURLHistoryItems&fixupData=true&showMax=true&filters=platform%2Ceq%2CW%2Cchannel%2Ceq%2C4%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial

I'd be comfortable with a cap of 20,000 without experimentation.

Can you check the value of the histogram History.InMemoryURLHistoryItems that you, sky@, and brettw@ have? If it's too low, a cap like this won't help them.
,
Dec 20 2017
Both sky and brettw are currently OOO, and I don't have access to their history DBs; I just ping them with specific questions.
,
Jan 8 2018
Any updates here?
,
Jan 8 2018
+sky, brettw: could you get these stats from your respective history DBs [the largest one, since I assume you have multiple profiles]?
,
Jan 9 2018
Brett is a better person to answer these questions.
,
May 8 2018
Comment 1 by brucedaw...@chromium.org, Jul 31 2017
Attachments: 84.8 KB, 72.6 KB