Issue metadata
Sign in to add a comment
|
128kb regression in resource_sizes (MonochromePublic.apk) at 525174:525175 |
||||||||||||||||||||
Issue descriptionCaused by “Add GeoLanguageProvider in chrome browser main" Commit: 2dce5eb6bcd1e42835172bcde45251ea9f3fe265 Link to size graph: https://chromeperf.appspot.com/report?sid=a097e74b1aa288511afb4cb616efe0f95ba4d347ad61d5e835072f23450938ba&num_points=10&rev=480214 Debugging size regressions is documented at: https://chromium.googlesource.com/chromium/src/+/master/docs/speed/apk_size_regressions.md#Debugging-Apk-Size-Increase It looks like this increase was probably unexpected or might be avoidable. Please have a look and either: Close as “Won't Fix” with a short justification, or Land a revert / fix-up. Mainly caused by a 123kb symbol: + 0) 123270 (95.0%) t@0x15321dc 123270 (0->123268) components/language/content/browser/language_code_locator_helper.cc language::internal::GenerateDistrictLanguageMapping
,
Dec 20 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8959673499450660432
,
Dec 20 2017
=== BISECT JOB RESULTS ===
NO Perf regression found, tests failed to produce values
Bisect Details
Configuration: android_nexus7_perf_bisect
Benchmark : resource_sizes
Metric : MonochromePublic.apk_Specifics/normalized apk size
To Run This Test
src/build/android/resource_sizes.py --chromium-output-directory {CHROMIUM_OUTPUT_DIR} --chartjson {CHROMIUM_OUTPUT_DIR}/apks/MonochromePublic.apk
More information on addressing performance regressions:
http://g.co/ChromePerformanceRegressions
Debug information about this bisect:
https://chromeperf.appspot.com/buildbucket_job_status/8959673499450660432
For feedback, file a bug with component Speed>Bisection
,
Dec 21 2017
Hey Renjie, Is there anything actionable to do here? GenerateDistrictLanguageMapping() is generated code; it looks like maybe it's a whole lot o' code?
,
Jan 10 2018
Ping renjieliu - please take a look.
,
Jan 10 2018
Sorry, I missed this thread :( I think the 128kb is expected, cause we're generate s2cellid-> language code pair based on this file: https://cs.chromium.org/chromium/src/components/language/content/browser/adminregionsdata.csv?sq=package:chromium&dr du -h components/language/content/browser/adminregionsdata.csv gave me 76kB, also includes linking the s2cellId library(https://cs.chromium.org/chromium/src/third_party/s2cellid/?q=third_party/s2cellid&sq=package:chromium&dr) do we worry about the size?
,
Jan 10 2018
,
Jan 11 2018
128kb is quite large (we alert on 16kb jumps, and only get ~3 alerts per day). From the linked design doc, we already send a network request to a google service to map public IP -> geolocation. Could we also use a network request to fetch the language codes rather than bundling them in?
,
Jan 11 2018
We had a discussion with the privacy team, they are fine to send public IP to get geolocation, but they're strongly against sending geolocation to the server, that's why we decided to bundle the language code in the binary. +msramek
,
Jan 11 2018
128k does sound pretty large. Is the s2cellid-> language code data efficiently stored? Could it be quantized more?
,
Jan 11 2018
Ah, so it uses finer-grained location when resolving the locales? That's certainly a fair argument then. Looking at the generated code: https://cs.chromium.org/chromium/src/out/Debug/gen/components/language/content/browser/language_code_locator_helper.cc It looks like it should be possible to shrink it significantly since it's highly repetitive. E.g. * Store the locales separately as a list of (locales, entry count) * Delta-encode the int64s, encoded as LEB128. Alternatively, store the data as a PAK entry and gzip it. Sorry to be asking for extra work, but for binary size reductions, we'd generally spend 1-2 weeks for a 100kb reduction.
,
Jan 11 2018
the generated code can be retrieved:
ninja -C out/Default -j 1000 components/language/content/browser:language_code_locator
and it looks like below (head -n 50 out/Default/gen/components/language/content/browser/language_code_locator_helper.cc ):
// Copyright 2017 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
#include <string>
#include <vector>
namespace language {
namespace internal {
std::vector<std::pair<uint64_t, std::string>>
GenerateDistrictLanguageMapping() {
return {
{ 0x3062a90000000000ull, "hi;bn;ta;te;ml"},
{ 0x3062ab0000000000ull, "hi;bn;ta;te;ml"},
{ 0x3062ad0000000000ull, "hi;bn;ta;te;ml"},
{ 0x30635f0000000000ull, "hi;bn;ta;te;ml"},
{ 0x3063610000000000ull, "hi;bn;ta;te;ml"},
{ 0x30642d0000000000ull, "hi;bn;ta;te;ml"},
{ 0x30642e7000000000ull, "hi;bn;ta;te;ml"},
{ 0x3064935000000000ull, "hi;bn;ta;te;ml"},
{ 0x306494b000000000ull, "hi;bn;ta;te;ml"},
{ 0x30649f0000000000ull, "hi;bn;ta;te;ml"},
{ 0x3064a10000000000ull, "hi;bn;ta;te;ml"},
{ 0x3064eb5400000000ull, "hi;bn;ta;te;ml"},
{ 0x3064ecaab0000000ull, "hi;bn;ta;te;ml"},
{ 0x306501c000000000ull, "hi;bn;ta;te;ml"},
{ 0x306507f000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065084000000000ull, "hi;bn;ta;te;ml"},
{ 0x30658d0000000000ull, "hi;bn;ta;te;ml"},
{ 0x30658e4000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065910000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065930000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065a14000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065a3c000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065a50000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065a70000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065a90000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065c10000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065c30000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065c50000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065c6b000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065d30000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065dd5554000000ull, "hi;bn;ta;te;ml"},
{ 0x3065e50000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065e70000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065ec0000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065f04000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065f30000000000ull, "hi;bn;ta;te;ml"},
{ 0x3065fab000000000ull, "hi;bn;ta;te;ml"},
...
and du gave me 164k of that generated file, since we have around 4k lines of data.
I can try to store the locales using enums, I'm not sure encoding uint64s differently can help us with saving the data storage since the value itself will be used as s2cellid.
,
Jan 11 2018
Hi Andrew, I wonder if there's any guidance about run Android resource size tests?
,
Jan 11 2018
Pls apply appropriate OSs label.
,
Jan 11 2018
You'll want to consider the suggestions in c#11, try some, and then run " tools/binary_size/diagnose_bloat.py" to see the difference. Here is some more docs for diagnose_bloat.py: https://chromium.googlesource.com/chromium/src/+/master/tools/binary_size/README.md#diagnose_bloat_py If you want to really dig into it, the doc also continues with the "supersize" tool.
,
Jan 11 2018
Again to reiterate the importance of this work, consider that taking as long as a week or more to reduce this regression from 128kb to 28kb means about 100kb of bandwidth that our billion users won't need to download. That's about 100 terabytes of data that our users won't have to pay for. It's a very significant amount of savings especially for NBU where every byte of data has significant cost. Thank you for working to reduce this regression!
,
Jan 11 2018
You can still have the function return a vector<std::pair<uint64_t, std::string>>. I'm just suggesting that you initialize it differently. E.g. Have the build rule produces a gzip stream of (uint64, enum) pairs, then in the getter, unzip and convert to a vector<pair>.
,
Jan 11 2018
Thanks Peter for providing the guidance, and thanks Andrew for the suggestion! I will try the following approach to reduce the since: 1) store enum instead of strings 2) since most unint64s are ended with 0000... I will store them using uint32s and padding them back to uint64s in the lookup. (although lose accuracies a little bit...)
,
Jan 17 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a commit 4a79a59e0ea307e93d94b25a35bc7b708c86dc5a Author: Renjie Liu <renjieliu@chromium.org> Date: Wed Jan 17 02:34:40 2018 Refactor language_code_locator.cc to reduce binary size. 1) store enum instead of strings 2) store uint64s instead of uint32s (lose a little bit accuracy) Add an additional test in India. Bug: 796578 Change-Id: I5f8e858353769aa8840642dbc1059f5f63b13ef0 Reviewed-on: https://chromium-review.googlesource.com/863262 Commit-Queue: Renjie Liu <renjieliu@chromium.org> Reviewed-by: agrieve <agrieve@chromium.org> Cr-Commit-Position: refs/heads/master@{#529550} [modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/BUILD.gn [modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/convert_s2_cell.py [modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/language_code_locator.cc [modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/language_code_locator.h [modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/language_code_locator_unittest.cc [delete] https://crrev.com/b542a405daf4fcee68a1799ff0dc99b056f3f645/components/language/content/browser/template/language_code_locator_helper.cc.tmpl [add] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/template/language_code_locator_helper.h.tmpl
,
Jan 17 2018
The fix reduced size by 110kb! Overall cost of this feature is now just 18kb (about the size of the uint32s for the locations). https://chromeperf.appspot.com/report?sid=a097e74b1aa288511afb4cb616efe0f95ba4d347ad61d5e835072f23450938ba&num_points=10&rev=529550
,
Jan 17 2018
Awesome! thanks for the help!
,
Jan 17 2018
Good job, that's 110 terabytes of bandwidth saved, Renjie!
,
Jan 17 2018
Andrew and Renjie, this was great work :). |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Dec 20 2017