
Issue 750845 link

Starred by 7 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug


Participants' hotlists:
Omnibox-Bugs-on-mpearson-Radar

Other hotlists containing this issue:
Hotlist-2


URLIndexPrivateData::RestorePrivateData consuming ~200 MB during Chrome startup

Project Member Reported by brucedaw...@chromium.org, Jul 31 2017

Issue description

On one Chrome user's machine URLIndexPrivateData::RestorePrivateData allocates ~200 MB of heap memory during Chrome startup. This seems like an excessive amount and needs investigation to see if there is a bug or if there needs to be caps on the retention and processing of history data.

The user who reported this reported that when opening up Chrome the browser process is "eating all the RAM". An ETW trace shows ~200 MB of memory allocated by URLIndexPrivateData::RestorePrivateData in the first twenty seconds or so. The peak total heap memory for the browser process is over 500 MB but ~220 MB is freed after startup is completed.

Since this bug is presumed to be triggered by the user's profile the user has zipped up their user data directory to ensure that we don't lose the repro.

The user shared an ETW heap trace. This was generated by starting ETW heap tracing with UIforETW, launching Chrome, and then ignoring the "Restore pages" option. While one might expect that this would result in there being one renderer process there were actually six, and 22 extension processes:

Chrome PIDs by process type:
C:\Program Files (x86)\Google\Chrome\Application\chrome.exe (27036)
    browser     : 27036 
    crashpad-handler : 27480 
    extension   : 4784 4952 11384 11788 11792 12168 12376 12848 13220 19624 20900 23228 23800 23956 24224 26752 27988 28724 29152 29328 29372 29472 
    gpu-process : 22980 
    renderer    : 15328 17404 22108 25792 28084 29232 
    watcher     : 21528 

The ETW trace is huge and is cumbersome to work with so not all details are available but it clearly shows 1.5 million outstanding allocations totaling 198 MB, all from URLIndexPrivateData::RestorePrivateData and its descendants.

I'm assigning a component and CCing a developer based on crbug.com/715852 which seems possibly relevant. In particular, do we 


It's not clear what data would be useful for analyzing this but this from chrome://histograms seems probably relevant. On my machine the same histogram shows 1774 - much smaller:

   Histogram: History.InMemoryURLHistoryItems recorded 1 samples, mean = 43673.0 (flags = 0x41)
0 ... 
36699 ------------------------------------------------------------------------O (1 = 100.0%) {0.0%}
48336 .
 
Labels: Performance-Memory
I've attached a few screenshots of WPA. One of them shows outstanding bytes after Chrome has finished loading - about 198 MB on the URLIndexPrivateData::RestorePrivateData path, with some of the child functions expanded to show more details.

Note also that URLIndexPrivateData::RestoreFromFile (the parent function of URLIndexPrivateData::RestorePrivateData) temporarily uses 453 MB of heap data during Chrome startup on this machine. There's an extra 170 MB of (temporary) memory from google::protobuf::`anonymous namespace'::InlineParseFromArray and an extra 86 MB of (temporary) memory (in a single allocation) from base::ReadFileToStringWithMaxSize. The second screenshot shows this.
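
As a rough sanity check, the three contributions quoted in this report add up to approximately the reported peak. The sketch below uses only figures from this issue; treating 1 MB as 10**6 bytes and pairing the trace with the History.InMemoryURLHistoryItems sample from the same machine are assumptions.

```python
# Figures quoted in this report (1 MB taken as 10**6 bytes, an assumption).
retained_mb = 198         # outstanding on the RestorePrivateData path
protobuf_temp_mb = 170    # temporary, InlineParseFromArray
file_buffer_mb = 86       # temporary, ReadFileToStringWithMaxSize
outstanding_allocs = 1_500_000   # "1.5 million", approximate
in_memory_items = 43_673  # History.InMemoryURLHistoryItems on the user's machine

peak_mb = retained_mb + protobuf_temp_mb + file_buffer_mb
bytes_per_alloc = retained_mb * 10**6 / outstanding_allocs
kb_per_item = retained_mb * 10**3 / in_memory_items

print(f"estimated peak: {peak_mb} MB (reported: 453 MB)")
print(f"average outstanding allocation: {bytes_per_alloc:.0f} bytes")
print(f"retained memory per indexed item: {kb_per_item:.1f} KB")
```

The retained data is therefore made of many small allocations (on the order of a hundred bytes each), and each indexed history item costs a few kilobytes of heap.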

So, clearly all of this data is being read from a single large file.

That file appears to be C:\Users\gfm\AppData\Local\Google\Chrome\User Data\Default\History Provider Cache. The trace shows just 67 MB of data being read from that file, which means that the 86 MB of memory allocated is over-allocation by ReadFileToStringWithMaxSize - the typical exponential growth pattern used to avoid O(n^2) CPU cost.
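
To illustrate why geometric buffer growth leaves unused capacity, here is a minimal simulation. It is not Chromium's implementation: the doubling factor and the initial capacity are assumptions, and the function name is hypothetical.

```python
def read_with_doubling(total_bytes, initial_capacity=64 * 1024):
    """Simulate reading total_bytes into a buffer that doubles when full.

    Returns (final_capacity, bytes_copied_during_regrowth).
    The growth factor of 2 is an assumption made for illustration only.
    """
    capacity = initial_capacity
    copied = 0
    used = 0
    while used < total_bytes:
        if used == capacity:
            copied += used   # existing contents are copied when the buffer grows
            capacity *= 2
        used = min(capacity, total_bytes)
    return capacity, copied
```

For example, `read_with_doubling(100, initial_capacity=16)` returns `(128, 112)`: 28 bytes of capacity go unused, but the total copying work stays proportional to the data size, whereas growing by a fixed increment would make it quadratic.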

I looked for "%localappdata%\Google\Chrome\User Data\Default\History Provider Cache" on my machine and it does not exist.

OutstandingStartupBytes_URLIndexPrivateData.PNG (84.8 KB)
PeakStartupBytes_URLIndexPrivateData.PNG (72.6 KB)
Cc: mpear...@chromium.org jdonnelly@chromium.org
mpearson: I can dig into this more tomorrow, but off the top of your head is that value for History.InMemoryURLHistoryItems (43673) surprising? Is the history cache allowed to grow without bounds or is it supposed to be limited to some size?

For another data point, in two actively-used profiles on the machine I'm currently on, I see 777 and 1347 for History.InMemoryURLHistoryItems.
Yes, having 43k items eligible for being in the HistoryQuick provider in-memory data structure is surprising.

The cache currently is allowed to grow without bounds; this hasn't generally caused problems in the past.  BTW, we're experimenting with limiting how much it can grow on low-memory devices.  This is more a question of whether it should take up 400k or 100k, not a question of whether it should take up 200m!

I've seen this reproduced three (or so) times by users in the past.  In those situations, it was always the case that a poorly-designed web site was auto-navigating, pushing new state into the history system, and then the history system was indexing it, in effect spamming the history system.  I suggest looking through the user's History database (if you have permission) to try to find a common host / URL prefix.  I predict there will be one.  That'll help.
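
A minimal sketch of that search, assuming the History file is a SQLite database with a `urls` table that has a `url` column, and that it is run against a copy of the file (Chrome keeps the live file locked). The function name is hypothetical.

```python
import sqlite3
from collections import Counter
from urllib.parse import urlsplit

def top_hosts(conn, limit=10):
    """Count unique history URLs per host, most common first."""
    counts = Counter()
    for (url,) in conn.execute("SELECT url FROM urls"):
        try:
            host = urlsplit(url).hostname or ""
        except ValueError:   # malformed URL; bucket it separately
            host = "<unparseable>"
        counts[host] += 1
    return counts.most_common(limit)
```

Usage would look like `top_hosts(sqlite3.connect("History-copy"))`, where `History-copy` is a placeholder path. A spammed profile should show one or two hosts far ahead of the rest.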

Once found, usually I've reported these things to the navigation/history folks.  Sometimes it turns out they're things that shouldn't be recorded in history but inadvertently were.

Comment 4 by g...@google.com, Aug 1 2017

Cc: g...@google.com
Shared history file in zip format via gDrive.
gfm@, can you please remember to share the history file again?  As I mentioned over chat, the zip file appears to be empty.

Comment 6 by g...@google.com, Aug 4 2017

Just uploaded it manually to drive, I think the first time it failed since I drag and dropped to the local folder that should have synced via gdrive.
Thanks for the history file!

Yeah, the history file is spammed.

Two particular related sites each have 100+k unique URLs in the user's history.  Aside from those two sites, the user's most visited site has only 4k unique entries in history.  These two sites account for 92% of the user's unique URLs visited.

[more coming in a future update after I get permission from the reporter]
Owner: mpear...@chromium.org
Status: Assigned (was: Untriaged)
The two related sites that seem to be spammed are
* appengine.google.com
* one I'll simply call project.com

If I look at the 100+k unique URLs for appengine.google.com, 99.9% are of the form:
https://appengine.google.com/_ah/conflogin?continue=https%3A%2F%2Fproject.com%2F&pli=1&auth=[omitted]&authuser=0

If I look at the 100+k unique URLs for project.com, 99.9% are of the form
https://project.com/_ah/conflogin?state=[omitted]

This makes me think there's some sort of infinite redirection happening.

Indeed, if I try to go to project.com (while I'm logged into a google account), the bottom of my omnibox flashes infinitely between
* waiting for accounts.google.com
* waiting for project.com
* waiting for appengine.google.com

Also, if I look at the referrers (technically the field |from_visit|) in the history file, I see a series of redirections like this.
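
A sketch of how such a chain can be followed, assuming a `visits` table with `id`, `url` (a reference to `urls.id`) and `from_visit` columns, where a `from_visit` of 0 means no referrer. The function name and the hop limit are hypothetical.

```python
import sqlite3

def referrer_chain(conn, visit_id, max_hops=50):
    """Walk from_visit links backwards from visit_id; return URLs oldest first."""
    chain = []
    seen = set()
    while visit_id and visit_id not in seen and len(chain) < max_hops:
        seen.add(visit_id)
        row = conn.execute(
            "SELECT urls.url, visits.from_visit FROM visits "
            "JOIN urls ON urls.id = visits.url WHERE visits.id = ?",
            (visit_id,),
        ).fetchone()
        if row is None:      # dangling reference; stop here
            break
        chain.append(row[0])
        visit_id = row[1]
    return chain[::-1]
```

In a redirect loop, the returned list repeats the same few URL patterns over and over.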

Judging from the notifications at the bottom of my screen when I visit project.com, I think this history file is legitimately reflecting the actual history.  It's just that a lot of these visits are so incredibly transient.  They're getting in the in-memory URL data structure simply because they were recent.

First conclusion:
* the owner of project.com needs to look into its configuration.  (gfm@, contact me for the real name of the site)

Second conclusion:
* I need to think about whether redirects or transient visits can and should be dropped from the in-memory structures.
Issue 750717 has been merged into this issue.
Cc: nedngu...@google.com
Could we make a cap on the amount of history per domain name to avoid this kind of spamming (intended or unintended)?
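
One way such a per-domain cap could look, as a hypothetical sketch only: the function name, the input shape, and the choice to keep the most recent entries per host are all assumptions, not anything Chromium implements.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def cap_per_host(rows, max_per_host):
    """Keep at most max_per_host URLs per host, preferring the most recent.

    rows: iterable of (url, last_visit_time) tuples.
    """
    by_host = defaultdict(list)
    for url, last_visit in rows:
        by_host[urlsplit(url).hostname or ""].append((last_visit, url))
    kept = []
    for entries in by_host.values():
        entries.sort(reverse=True)   # newest first
        kept.extend(url for _, url in entries[:max_per_host])
    return kept
```

With a cap like this, a single host generating 100+k unique URLs could occupy at most `max_per_host` slots in the in-memory index.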

Comment 12 by g...@google.com, Aug 4 2017

Issue reported to our internal team that develops that site. b/64399457
In addition to the history effects (visible in more places than just the omnibox), there are also network bandwidth use + power draw problems associated with infinite redirects.

So ideally we solve not just by putting in safety clamps (which we probably do need, both globally and per-domain), but by stopping the cycle reasonably quickly.

Normally we have redirect loop detection that prevents this kind of thing.  I wonder if the right answer is to expand that protection to catch whatever is going on here.

Comment 14 by ojan@chromium.org, Aug 5 2017

Cc: japhet@chromium.org
Another thing that will help with the history side of this is issue 638198, which will prevent most redirects from being entered into the history in the first place.

That said, it makes sense to have a cap on the max size in addition.

Regarding redirect loop detection, it sounds like a reasonable intervention to do something simple like limit the allowed number of consecutive page loads that have no user interaction and last <5 seconds to something in the 5-100 range (number TBD based off UMA). This should be pretty easy to implement on top of the logic in issue 638198 as well.
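
A minimal sketch of that counting rule. The class and method names are hypothetical, and the default limit of 20 is only a placeholder inside the 5-100 range mentioned above; the real value would come from UMA data.

```python
class TransientLoadLimiter:
    """Counts consecutive short page loads that had no user interaction."""

    def __init__(self, max_consecutive=20, min_duration_s=5.0):
        self.max_consecutive = max_consecutive
        self.min_duration_s = min_duration_s
        self.streak = 0

    def record_load(self, duration_s, had_user_interaction):
        """Record one finished page load.

        Returns True once the streak of transient loads reaches the limit,
        i.e. when the next automatic navigation should be blocked.
        """
        if had_user_interaction or duration_s >= self.min_duration_s:
            self.streak = 0   # a real visit breaks the streak
        else:
            self.streak += 1
        return self.streak >= self.max_consecutive
```

Any load that the user interacts with, or that lasts at least five seconds, resets the counter, so ordinary browsing never approaches the limit.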
re nednguyen comment #11: we're experimenting with a cap on the amount of history to load into memory.  See bug 715852.  It's not a per-domain cap, instead it's based on how important the page appears to be to the user based on number of times previously selected from the omnibox, last visit time, and number of visits.
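
As an illustration of an importance-based cap, here is a hypothetical sketch. The field names and the ranking (omnibox selections first, then visit count, then recency) are assumptions; the actual scoring used in bug 715852 is not described in this thread.

```python
import heapq
from typing import NamedTuple

class HistoryItem(NamedTuple):
    url: str
    typed_count: int        # times selected from the omnibox (assumed field)
    visit_count: int
    last_visit_time: float

def select_for_index(items, cap):
    """Return the cap most important items under an assumed ranking."""
    return heapq.nlargest(
        cap,
        items,
        key=lambda it: (it.typed_count, it.visit_count, it.last_visit_time),
    )
```

Under a ranking like this, transient redirect URLs that were never selected and visited once would be the first to fall outside the cap.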

re ojan@ / japhet@ comment #14: do you want to file a new bug for the comment about redirect loop detection?
Owner: ojan@chromium.org
We now have a cap on how much memory the omnibox system can use on Android (per bug 715852).  But the root cause of this issue still remains: the history system can be spammed.

I am not the right person to own this part of this bug.  Assigning to ojan@, as he's clearly aware of this area (per his comment #14).

Comment 18 by ojan@chromium.org, Dec 19 2017

Cc: ojan@chromium.org
Owner: erikc...@chromium.org
japhet's work to be more intelligent about not putting redirects into history is still on-going, but isn't targeted specifically at memory savings. Issue 638198 tracks that work and I think we can otherwise ignore it here.

The question is whether there should be other caps, e.g. total number of history entries per tab. That's more of a browser product question than a platform one, so passing the ownership hot potato to erikchen.
Cc: pkasting@chromium.org
I was just chatting with mpearson and pkasting about omnibox performance. This seems related.

1) We need to prevent history from being spammed. Is there a tracking bug for this? This should be high priority. 

2) Given that some users have spammed histories, or otherwise pathological history DBs, we probably need some type of workaround [at least temporarily] for the HistoryQuickProvider and co. to deal with excessively large data sets. 

mpearson - Any objections to extending the limitation from issue 715852 to desktop? 
Heh erikchen@, I guess you didn't overhear all my conversation with pkasting@.  In part of it, he tried to convince me to raise the desktop caps on which items get into the HistoryQuick provider's index, so that more items get in and are therefore searchable. :-)  You're asking for fewer.

On Android, we set a cap of 1,000 URLs on non-low-memory devices.
https://chromium-review.googlesource.com/c/chromium/src/+/762343

On Windows such a cap would affect ~15% of installs, certainly high enough that I'd want to run an eval for it.
https://uma.googleplex.com/p/chrome/histograms/?endDate=20171218&dayCount=7&histograms=History.InMemoryURLHistoryItems&fixupData=true&showMax=true&filters=platform%2Ceq%2CW%2Cchannel%2Ceq%2C4%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial

I'd be comfortable with a cap of 20,000 without experimentation.

Can you check the value of the histogram History.InMemoryURLHistoryItems that you, sky@, and brettw@ have?  If it's too low, a cap like this won't help them.
Both sky and brettw are currently OOO, and I don't have access to their history dbs, I just ping them with specific questions.

Comment 22 by ssid@chromium.org, Jan 8 2018

Any updates here?
Cc: brettw@chromium.org sky@chromium.org
+sky, brettw: could you get these stats from your respective history DBs [the largest one, since I assume you have multiple profiles]?

Comment 24 by sky@chromium.org, Jan 9 2018

Brett is a better person to answer these questions.

Comment 25 by ojan@chromium.org, May 8 2018

Cc: -ojan@chromium.org