Reduce the NTP snippets and images database size for low end mode
Issue description

IIUC, the limit for the write buffer and read buffer is set to 512 KB for each of the databases. Slow reports show the median leveldb usage as 700 KB. Investigate whether we can cut down further on the database buffer limits for low-end mode. Do we have data about which links users click, and do we really need so many snippets in memory?
May 12 2017
So, the 700 KB number is all databases together. Most of the highest-usage cases are due to IndexedDB, but more than 50% of the cases have only the ntp snippets and snippetImages databases, so the main cause of this usage is the ntp database. Could you explain a bit more why we need such a big database for snippets? How many snippets do we currently store on the user's device (in-memory + disk)? Can we limit this number itself? How about we only keep N snippets in memory, where N = the number of snippets shown in the NTP? Do we have any data about how many people actually click "More" in the NTP? You can detect low-end mode using a base::SysInfo::IsLowEndDevice() call. Low-end mode is a switch used in Chrome to use less memory on devices with less than 1 GB RAM, to avoid OOMs and a lot of swapping to disk.
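For illustration, a minimal sketch of gating on that call; the header path is the one used in current Chromium trees, and the function name here is hypothetical:

#include "base/system/sys_info.h"  // "base/sys_info.h" in older trees.

// Sketch: check low-end mode once and pick smaller budgets for the
// snippets databases. IsLowEndDevice() returns true on devices where
// Chrome runs in its reduced-memory configuration (roughly < 1 GB RAM).
bool ShouldUseReducedSnippetBudget() {
  return base::SysInfo::IsLowEndDevice();
}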
May 15 2017
This buffer size is most likely the cause of the improvement observed in crbug.com/706941 when trybots stopped fetching snippets. Siddhartha, where do you get the numbers from? Can you point us directly to the metrics?
May 15 2017
Re #2, we don't have any particular reason for choosing these sizes. We could reduce them further; hitting the disk whenever suggestions or their images are updated probably isn't that bad. But I don't know if LevelDB will behave poorly if you make these sizes too small - the current values are already much smaller than the defaults. Do you have experience with that?
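For context, these sizes map onto stock LevelDB options; a minimal sketch, assuming plain leveldb (the function name and path argument are illustrative). LevelDB's stock defaults are a 4 MB write buffer and an 8 MB block cache, so 512 KB is indeed well below them; note that a smaller write_buffer_size also means the memtable is flushed to disk more often, which is the "flush more often" knob mentioned later in this thread.

#include <string>
#include "leveldb/cache.h"
#include "leveldb/db.h"
#include "leveldb/options.h"

// Sketch: open a database with the 512 KB limits under discussion.
// Note: the cache must outlive the DB and be deleted by the caller.
leveldb::DB* OpenSnippetsDb(const std::string& path) {
  leveldb::Options options;
  options.create_if_missing = true;
  options.write_buffer_size = 512 * 1024;                  // Memtable budget.
  options.block_cache = leveldb::NewLRUCache(512 * 1024);  // Read cache.
  leveldb::DB* db = nullptr;
  leveldb::Status status = leveldb::DB::Open(options, path, &db);
  return status.ok() ? db : nullptr;
}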
May 15 2017
+ Dmitry, who has been looking into LevelDB memory/performance characteristics.
May 15 2017
The numbers are coming from the go/slow-memory-reports dashboard. This data is collected from Dev and Canary users on memory pressure, to see what causes OOMs. The graph titled "Memory Usage of Component" (the first of the 2), which is a histogram, shows the malloc value of the browser process by default. This can be changed to "leveldb" in the selection box on the left to see the leveldb usage. The first and second graphs on the page show average and 95th-percentile values, and the "leveldb" line is around 500 KB. I'd like to hope that we do not have the ntp databases in memory when the NTP is not open; I haven't seen them open. The regression on the benchmark in issue 706941 does not have the NTP open. Re #4: I was suggesting we could reduce the number of entries we store in the database, as opposed to committing more aggressively to disk. That is, why do we keep these entries on disk if we do not use them?
May 16 2017
Thanks for the dashboard link! That's super helpful. We do not explicitly page out those DBs, so they could be resident even when no NTP is open. That *could* be changed with some effort, but with Chrome Home it won't really make sense anymore, since the bottom sheet always exists. Re storing fewer entries: I'm assuming that most of those 700 KB are images rather than the suggestions themselves. We'll usually have 10 images in the DB (assuming the user looked at them - they're loaded on demand); 700 KB sounds about right if they're all in memory. Of course, there's no reason to keep them all in memory, especially under memory pressure. Maybe we can configure our LevelDB to flush to disk more often?
May 18 2017
Um, so on a clean profile I opened the NTP on my Nexus 5X, scrolled to see all 10 snippets, and took a trace. The database uses 110 KB in total: snippets use 16 KB and images use 100 KB. Then I pressed "More" multiple times and viewed 100 snippets (the end), and the memory usage is 180 KB for snippets and 530 KB for images. When I opened Chrome again with the NTP, the usage stays up at 500 KB, without even looking at snippets. Random samples of user data seem to have up to 500 KB of images and 150 KB of snippets. This smells like we are storing a lot more snippets in memory than we actually need.
May 19 2017
Thanks for raising this. We might do a couple of not-so-smart things with the image database. For example, after we have loaded the snippet data (after startup), we load all keys of the image database to garbage-collect the images that are not referenced anymore. I guess that this already pulls the other images into lower-level read buffers and occupies that memory. Maybe we can be a bit smarter about that? If we move data onto disk, we should also inspect our code carefully; I can imagine a few places where we optimistically look for thumbnail data in the database.
May 19 2017
Thinking a bit more about my last comment -- missed lookups should be handled well by leveldb; I hope it keeps the necessary index in RAM anyway. For garbage-collecting images, we already set fill_cache to false (https://codesearch.chromium.org/chromium/src/components/leveldb_proto/leveldb_database.cc?rcl=dac8e43d7961d3bd5d4db36b191aecd64e0979c9&l=154), so that sounds sane...
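For reference, fill_cache is a flag on leveldb::ReadOptions; a minimal sketch of a key scan that avoids polluting the block cache, roughly the shape of the leveldb_proto code linked above (the function name here is illustrative):

#include <memory>
#include <string>
#include <vector>
#include "leveldb/db.h"
#include "leveldb/options.h"

// Sketch: enumerate all keys without populating the block cache,
// so a garbage-collection scan does not evict hot entries.
std::vector<std::string> LoadAllKeys(leveldb::DB* db) {
  leveldb::ReadOptions read_options;
  read_options.fill_cache = false;  // Don't cache blocks read by this scan.
  std::vector<std::string> keys;
  std::unique_ptr<leveldb::Iterator> it(db->NewIterator(read_options));
  for (it->SeekToFirst(); it->Valid(); it->Next())
    keys.push_back(it->key().ToString());
  return keys;
}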
Jul 7 2017
Marking available since I'm not actively working on this.
Jul 31 2017
Vitalii, could you please take a look at this? (I think this is okay to fix after FF.)
Jul 31 2017
Possibly after FF; before that, I expect to be busy with BreakingNews and Prefetching (both are pre-FF).
Jul 31 2017
Sure, I thought you were busy now, thanks! (I would handle it myself in the FF-BP period, but I am on vacation.) Feel free to re-assign later.
Sep 25 2017
Assigning back to jkrcal@. I still haven't gotten to this bug. Also, I am not familiar with our database, so I am not sure what to do. I am happy to take it back if you are terribly busy but willing to share your thoughts on the fix.
Oct 25 2017
I have not gotten to this bug so far. I am sorry about that.
Nov 22 2017
ssid@, can you please point me to some docs on how to measure how much memory the database is using? (the method you used in #10)
Nov 27 2017
Docs on measuring memory use: https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/README.md. Once you get a trace, you can click on leveldb usage to see it broken down by database.
Nov 30 2017
Thanks! I've spent some time experimenting with tracing the memory consumption of the database and do not see anything suspicious in our code.

Ad #8: I can reconfirm this observation. What is weird is that the memory use stays high after startup even if I switch off garbage collection. The only call we then make into the database at startup is leveldb_proto::ProtoDatabase<SnippetImageProto>::Init(). I am not sure if the database somehow prefills its cache; I see no option to influence the startup behavior in https://cs.chromium.org/chromium/src/third_party/leveldatabase/src/include/leveldb/options.h

Ad #10 and #11: Indeed, loading all keys on startup has no clear impact on memory use. The only change that consistently has an effect on memory use is the buffer size. Therefore, I've prepared a CL that decreases the buffer size to 128 KB for low-memory devices.

The only other relevant change I see is to load the images database lazily when the suggestions surface gets opened. This does not seem worth the effort because:
- with Chrome Home, the surface gets opened very often on startup (always after >=3 hrs of inactivity);
- with the Feed integration, this database should go away.

Let's keep this discussion around for posterity for the Feed integration (which will need a different database - maybe we can keep it open only while the bottom sheet is open?).
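The CL below takes that shape; a hedged sketch of the change (the function name is illustrative - see the actual diff for the real code; the SysInfo header lives at "base/system/sys_info.h" in current trees):

#include "base/system/sys_info.h"
#include "leveldb/options.h"

// Sketch of the landed change: shrink the leveldb write buffer from
// 512 KB to 128 KB on low-end devices. Both values were chosen
// empirically (the largest size that still shows savings in traces).
leveldb::Options CreateSuggestionsDbOptions() {
  const size_t kDefaultBufferSize = 512 * 1024;
  const size_t kLowEndBufferSize = 128 * 1024;
  leveldb::Options options;
  options.write_buffer_size = base::SysInfo::IsLowEndDevice()
                                  ? kLowEndBufferSize
                                  : kDefaultBufferSize;
  return options;
}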
Nov 30 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/1fa06605790c5c914f474346417f7eaa8ee23884

commit 1fa06605790c5c914f474346417f7eaa8ee23884
Author: Jan Krcal <jkrcal@chromium.org>
Date: Thu Nov 30 16:29:52 2017

[Remote suggestions] Reduce database memory use for low end devices

This CL changes the buffer size for leveldb database for remote
suggestions. This smaller buffer size is used only for devices with
memory pressure. The change saves up to 500 KB of memory.

Similarly as the previous buffer size of 512 KB, the new buffer size of
128 KB was chosen arbitrarily (the largest size that still shows
considerable memory savings when tracing the process).

The CL also makes small changes to avoid empty calls to the database.

Bug: 716170
Change-Id: Iccb398d50b95d8d1441ceb4bb731cb1d26b0fe20
Reviewed-on: https://chromium-review.googlesource.com/800271
Commit-Queue: Jan Krcal <jkrcal@chromium.org>
Reviewed-by: Marc Treib <treib@chromium.org>
Cr-Commit-Position: refs/heads/master@{#520567}

[modify] https://crrev.com/1fa06605790c5c914f474346417f7eaa8ee23884/components/ntp_snippets/remote/remote_suggestions_database.cc
[modify] https://crrev.com/1fa06605790c5c914f474346417f7eaa8ee23884/components/ntp_snippets/remote/remote_suggestions_provider_impl.cc
Feb 26 2018
Dropping the bug, adding zea@ from SEA (new owners of components/ntp_snippets/).