The top level Prefs file contains blobs from UMA |
|||
Issue description
Extract from an investigation going in Issue 697122:
The prefs file is rewritten on many occasions in the pref_service [1], the entire contents of the json is written to a temporary file and then renamed to "Local State". It is ~4K in my test app (not logged in).
On my personal phone the file is 123K. The ~50K of that is the thing called "variations_compressed_seed", another ~64K is in "ongoing_logs_list" (some base64-ed blobs).
During my half-minute session of using Chrome for Android, this file was rewritten 5 times, which looks like >500K worth of unnecessary rewrites.
Ways to mitigate:
1. separate to a different file that UMA logs/stats would update on a different schedule than the global prefs
2. mmap-ed region for this data - written by the system only when necessary, guaranteed to be persistent even if the process dies, not guaranteed to be consistent :)
3. compress the whatever choice of the file - a binary format would be 2x smaller
4. reduce the dataset?
[1] codesearch for callsites into PrefService::CommitPendingWrite():
https://codesearch.chromium.org/chromium/src/components/prefs/pref_service.cc?type=cs&q=PrefService::CommitPendingWrite&l=106
,
Feb 28 2017
+holte who's touching this code as part of UKM +gab who's looked at local state stuff before I agree that things can be improved here. variations_compressed_seed is the list of experiment configurations. It doesn't change very often - but of course would be re-written if the whole file is. I do think it would be good to move it out to its own file, but among the things on our plate, it's never been particularly high priority. For example, the time to load the local state file entirely is just a tiny part of Chrome startup time - about 10ms mean time during startup on Windows: https://uma.googleplex.com/callstacks?sid=5f34c4bfd1399f077612af3eee0ff50d ongoing_logs_list is the pending UMA, Rappor and UKM logs that haven't been sent up yet. These will change more often than variations_compressed_seed, but also not super often. Again, it's something that could be moved out, but hasn't been high priority to fix. The thing is - the frequency of the file writes depends on other prefs changing - and yeah if we move the big prefs out, the total size written will be reduced. But it's not clear what the impact of this would be on Chrome performance - I've not seen it showing up anywhere as a big performance impact. The downsides of moving things out is it introduces more failure modes to Chrome. New files may be intercepted or blocked by A/Vs and there are more edge cases to worry about (vs. just the existing failure cases of prefs being corrupt). Additionally, if they're in separate files, it might slow things down on startup (i.e. more I/O since the data is not all together). Although it might not be significant. To summarize my thoughts - I think in the end it would be better if things were moved out, but I'm not convinced it's worth the effort given other more impactful work - but if someone wanted to take a stab at it, I'd be happy to review the changes.
,
Feb 28 2017
Of course, all of the above is based on my current understanding of the impact of this stuff. If we have more stats about this that demonstrate this is a bigger problem than I think it is, happy to hear them!
,
Feb 28 2017
Added a note on why we care about this in: https://bugs.chromium.org/p/chromium/issues/detail?id=697122#c7.
,
Mar 1 2017
I agree with the concerns about startup cost and increased failure modes. If we move to separate files, the startup cost of it on Android would be less than on Windows, also on Android we can probably avoid flushing the file with fsync because .. I guess we can assume that users don't pull the batteries out so often as to bias UMA/UKM with stale versions of the file. Since it is important to us on Android, I can take a stab at it. I think holte@ would do it faster, and arguably the burning disk situation is more important than UKM, but again, I am OK with both ways.
,
Mar 1 2017
Separately, I've been wondering whether it wouldn't make sense for Clank to use Java prefs for everything? (i.e. have the prefsstore C++ interfaces call into Java to save prefs) Now, I don't know how Java prefs work - in terms of what their story for IO is - but it might help with this if the strategy is to not re-write the whole set of them whenever one of them changes. But the other big (longer term) benefit is it would allow a lot of Clank code that synchronizes prefs from Java to C++ and back be simplified - since all prefs would be available to Java even before native is initialized. Thoughts? Do we know how Java prefs are handled w.r.t. IO on Android?
,
Mar 1 2017
java prefs is a valid disk-based storage that we use in quite a few cases. It is backed by xml files, and you can choose how much data fits in a single file. I *think* they are updated in roughly the same way as base::ImportantFileWriter does it, but I'd need to doublecheck. There are some threading subtleties to avoid strictmode violations, but they are manageable. One concern with using them is that this would introduce an android-only codepath, while the same mechanisms should be available on all platforms. I don't see any other obvious blockers for using android shared prefs for a specific portion of the UMA data. The fact that UMA stuff would be available early in the java code looks nice.
,
Mar 1 2017
The implementation is here: https://github.com/android/platform_frameworks_base/blob/master/core/java/android/app/SharedPreferencesImpl.java From what I can tell this is not the storage backend you're looking for. At every update, it rewrites the whole file. And all the data is held into memory at all times.
,
Mar 2 2017
re #8: (we discussed it offline with lizeb@) shared prefs still sound like an OK backend (to me) to write the state of experiments (in memory any way, written rarely, lots of random bits). Less sure about unsent logs - may be OK since histograms are updated by mutating a lot of bits in the blob at hard-to-predict offsets, but afair the logs may need to be appended instead of rewritten, and I don't know how often this happens and how large the datasets there are. asvitkine: how about a VC for and overview of requirements and AIs to assign for design/impl?
,
Mar 2 2017
Sure, happy to VC - feel free to put something on my calendar.
,
Mar 2 2017
OK, I sent an invite for next Tuesday, since Monday seems packed for us as a union.
,
Aug 30 2017
The newest Android support library (26) adds an API to implement our own prefs storage[1]. That's probably a good opportunity to look into making it binary or database backed rather than XML? [1]: https://developer.android.com/reference/android/preference/PreferenceDataStore.html |
|||
►
Sign in to add a comment |
|||
Comment 1 by pasko@chromium.org
, Feb 28 2017