New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 796578 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit 29 days ago
Closed: Jan 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug-Regression



Sign in to add a comment

128kb regression in resource_sizes (MonochromePublic.apk) at 525174:525175

Project Member Reported by wnwen@google.com, Dec 20 2017

Issue description

Caused by “Add GeoLanguageProvider in chrome browser main"

Commit: 2dce5eb6bcd1e42835172bcde45251ea9f3fe265

Link to size graph: https://chromeperf.appspot.com/report?sid=a097e74b1aa288511afb4cb616efe0f95ba4d347ad61d5e835072f23450938ba&num_points=10&rev=480214

Debugging size regressions is documented at: https://chromium.googlesource.com/chromium/src/+/master/docs/speed/apk_size_regressions.md#Debugging-Apk-Size-Increase

It looks like this increase was probably unexpected or might be avoidable.
Please have a look and either:

Close as “Won't Fix” with a short justification, or
Land a revert / fix-up.

Mainly caused by a 123kb symbol:
+ 0)     123270 (95.0%) t@0x15321dc  123270 (0->123268) components/language/content/browser/language_code_locator_helper.cc              
               language::internal::GenerateDistrictLanguageMapping                      
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Dec 20 2017

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=796578

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=2634160c13322752e250100ace0b0b638a39256c38c41fa74c8c9b455c36b759


Bot(s) for this bug's original alert(s):

Android Builder
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Dec 20 2017


=== BISECT JOB RESULTS ===
NO Perf regression found, tests failed to produce values

Bisect Details
  Configuration: android_nexus7_perf_bisect
  Benchmark    : resource_sizes
  Metric       : MonochromePublic.apk_Specifics/normalized apk size


To Run This Test
  src/build/android/resource_sizes.py --chromium-output-directory {CHROMIUM_OUTPUT_DIR} --chartjson {CHROMIUM_OUTPUT_DIR}/apks/MonochromePublic.apk

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8959673499450660432


For feedback, file a bug with component Speed>Bisection
Hey Renjie,

Is there anything actionable to do here? GenerateDistrictLanguageMapping() is generated code; it looks like maybe it's a whole lot o' code?
Ping renjieliu - please take a look.
Cc: amoylan@chromium.org napper@chromium.org yyushkina@chromium.org
Sorry, I missed this thread :(

I think the 128kb is expected, cause we're generate s2cellid-> language code pair based on this file: https://cs.chromium.org/chromium/src/components/language/content/browser/adminregionsdata.csv?sq=package:chromium&dr

du -h components/language/content/browser/adminregionsdata.csv gave me 76kB, also includes linking the s2cellId library(https://cs.chromium.org/chromium/src/third_party/s2cellid/?q=third_party/s2cellid&sq=package:chromium&dr)

do we worry about the size?


Cc: talo@chromium.org
Labels: ReleaseBlock-Stable
128kb is quite large (we alert on 16kb jumps, and only get ~3 alerts per day).

From the linked design doc, we already send a network request to a google service to map public IP -> geolocation. Could we also use a network request to fetch the language codes rather than bundling them in?
Cc: msramek@chromium.org
We had a discussion with the privacy team, they are fine to send public IP to get geolocation, but they're strongly against sending geolocation to the server, that's why we decided to bundle the language code in the binary.

+msramek
128k does sound pretty large. Is the s2cellid-> language code data efficiently stored? Could it be quantized more?
Ah, so it uses finer-grained location when resolving the locales? That's certainly a fair argument then.

Looking at the generated code:
https://cs.chromium.org/chromium/src/out/Debug/gen/components/language/content/browser/language_code_locator_helper.cc

It looks like it should be possible to shrink it significantly since it's highly repetitive. E.g.
* Store the locales separately as a list of (locales, entry count)
* Delta-encode the int64s, encoded as LEB128.

Alternatively, store the data as a PAK entry and gzip it.

Sorry to be asking for extra work, but for binary size reductions, we'd generally spend 1-2 weeks for a 100kb reduction.



the generated code can be retrieved:

ninja -C out/Default -j 1000 components/language/content/browser:language_code_locator

and it looks like below (head -n 50 out/Default/gen/components/language/content/browser/language_code_locator_helper.cc ):
// Copyright 2017 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include <string>
#include <vector>

namespace language {
namespace internal {
std::vector<std::pair<uint64_t, std::string>>
GenerateDistrictLanguageMapping() {
  return {
  
  { 0x3062a90000000000ull, "hi;bn;ta;te;ml"},
    { 0x3062ab0000000000ull, "hi;bn;ta;te;ml"},
    { 0x3062ad0000000000ull, "hi;bn;ta;te;ml"},
    { 0x30635f0000000000ull, "hi;bn;ta;te;ml"},
    { 0x3063610000000000ull, "hi;bn;ta;te;ml"},
    { 0x30642d0000000000ull, "hi;bn;ta;te;ml"},
    { 0x30642e7000000000ull, "hi;bn;ta;te;ml"},
    { 0x3064935000000000ull, "hi;bn;ta;te;ml"},
    { 0x306494b000000000ull, "hi;bn;ta;te;ml"},
    { 0x30649f0000000000ull, "hi;bn;ta;te;ml"},
    { 0x3064a10000000000ull, "hi;bn;ta;te;ml"},
    { 0x3064eb5400000000ull, "hi;bn;ta;te;ml"},
    { 0x3064ecaab0000000ull, "hi;bn;ta;te;ml"},
    { 0x306501c000000000ull, "hi;bn;ta;te;ml"},
    { 0x306507f000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065084000000000ull, "hi;bn;ta;te;ml"},
    { 0x30658d0000000000ull, "hi;bn;ta;te;ml"},
    { 0x30658e4000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065910000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065930000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065a14000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065a3c000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065a50000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065a70000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065a90000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065c10000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065c30000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065c50000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065c6b000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065d30000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065dd5554000000ull, "hi;bn;ta;te;ml"},
    { 0x3065e50000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065e70000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065ec0000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065f04000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065f30000000000ull, "hi;bn;ta;te;ml"},
    { 0x3065fab000000000ull, "hi;bn;ta;te;ml"},
...

and du gave me 164k of that generated file, since we have around 4k lines of data.

I can try to store the locales using enums, I'm not sure encoding uint64s differently can help us with saving the data storage since the value itself will be used as s2cellid.

Hi Andrew, I wonder if there's any guidance about run Android resource size tests?
Pls apply appropriate OSs label. 

Comment 15 by wnwen@chromium.org, Jan 11 2018

Labels: OS-Android
You'll want to consider the suggestions in c#11, try some, and then run " tools/binary_size/diagnose_bloat.py" to see the difference.

Here is some more docs for diagnose_bloat.py: https://chromium.googlesource.com/chromium/src/+/master/tools/binary_size/README.md#diagnose_bloat_py

If you want to really dig into it, the doc also continues with the "supersize" tool.

Comment 16 by wnwen@chromium.org, Jan 11 2018

Again to reiterate the importance of this work, consider that taking as long as a week or more to reduce this regression from 128kb to 28kb means about 100kb of bandwidth that our billion users won't need to download. That's about 100 terabytes of data that our users won't have to pay for.

It's a very significant amount of savings especially for NBU where every byte of data has significant cost.

Thank you for working to reduce this regression!
You can still have the function return a vector<std::pair<uint64_t, std::string>>. I'm just suggesting that you initialize it differently. E.g. Have the build rule produces a gzip stream of (uint64, enum) pairs, then in the getter, unzip and convert to a vector<pair>.
Thanks Peter for providing the guidance, and thanks Andrew for the suggestion!

I will try the following approach to reduce the since:
1) store enum instead of strings
2) since most unint64s are ended with 0000... I will store them using uint32s and padding them back to uint64s in the lookup. (although lose accuracies a little bit...) 
Project Member

Comment 19 by bugdroid1@chromium.org, Jan 17 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a

commit 4a79a59e0ea307e93d94b25a35bc7b708c86dc5a
Author: Renjie Liu <renjieliu@chromium.org>
Date: Wed Jan 17 02:34:40 2018

Refactor language_code_locator.cc to reduce binary size.

1) store enum instead of strings
2) store uint64s instead of uint32s (lose a little bit accuracy)

Add an additional test in India.

Bug:  796578 
Change-Id: I5f8e858353769aa8840642dbc1059f5f63b13ef0
Reviewed-on: https://chromium-review.googlesource.com/863262
Commit-Queue: Renjie Liu <renjieliu@chromium.org>
Reviewed-by: agrieve <agrieve@chromium.org>
Cr-Commit-Position: refs/heads/master@{#529550}
[modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/BUILD.gn
[modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/convert_s2_cell.py
[modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/language_code_locator.cc
[modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/language_code_locator.h
[modify] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/language_code_locator_unittest.cc
[delete] https://crrev.com/b542a405daf4fcee68a1799ff0dc99b056f3f645/components/language/content/browser/template/language_code_locator_helper.cc.tmpl
[add] https://crrev.com/4a79a59e0ea307e93d94b25a35bc7b708c86dc5a/components/language/content/browser/template/language_code_locator_helper.h.tmpl

Status: Fixed (was: Assigned)
The fix reduced size by 110kb! Overall cost of this feature is now just 18kb (about the size of the uint32s for the locations).

https://chromeperf.appspot.com/report?sid=a097e74b1aa288511afb4cb616efe0f95ba4d347ad61d5e835072f23450938ba&num_points=10&rev=529550
Awesome! thanks for the help!

Comment 22 by wnwen@chromium.org, Jan 17 2018

Good job, that's 110 terabytes of bandwidth saved, Renjie!
Andrew and Renjie, this was great work :).

Sign in to add a comment