New issue
Advanced search Search tips

Issue 746761 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner: ----
Closed: Oct 2017
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Sign in to add a comment

supersize: Figure out how to break down ** merge strings

Project Member Reported by agrieve@chromium.org, Jul 20 2017

Issue description

As of now, ** merge strings is about 3MB within the binary.

These come from inline string literals (e.g. SomeFunc("msg")).

The easiest way to improve attribution is by converting strings to proper symbols by assigning each literal to a variable. While this is obviously unreasonable, it's probably doable at least for generated code.

Beyond that, I'm hopeful that we could improve our attribution by using the per-object-file nm output (or perhaps some other per-object-file command is needed?). Looking per-file would cause us to over count identical strings, but I'd guess the number of strings that actually get merged when linked to be < 5% (a guess, but certainly we'll see)
 
Interestingly, it looks like strings do not get symbol names when defined as 

"const char* FOO", but do as "const char FOO[]".

Example: https://godbolt.org/g/8KS7Z6

#include <iostream>

const char foo[] = "asdf";
const char* const bar = "asdf2";

int square(int num) {
    std::cout << foo << bar;
}



.LC0:
        .string "asdf2"
square(int):
        sub     rsp, 8
        mov     esi, OFFSET FLAT:foo
        mov     edi, OFFSET FLAT:std::cout
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        mov     esi, OFFSET FLAT:.LC0
        mov     rdi, rax
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        pop     rax
        ret
_GLOBAL__sub_I__Z6squarei:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:std::__ioinit
        call    std::ios_base::Init::Init()
        pop     rax
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:std::__ioinit
        mov     edi, OFFSET FLAT:std::ios_base::Init::~Init()
        jmp     __cxa_atexit
foo:
        .string "asdf"
Actually, it looks like using const char[] makes strings not participate in merging:

https://godbolt.org/g/BnUeH3

#include <iostream>

const char foo[] = "asdf";
const char* const bar = "asdf2";

const char foo2[] = "asdf";
const char* const bar2 = "asdf2";

int square(int num) {
    std::cout << foo << bar << foo2 << bar2;
}

.LC0:
        .string "asdf2"
square(int):
        sub     rsp, 8
        mov     esi, OFFSET FLAT:foo
        mov     edi, OFFSET FLAT:std::cout
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        mov     esi, OFFSET FLAT:.LC0
        mov     rdi, rax
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        mov     esi, OFFSET FLAT:foo2
        mov     rdi, rax
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        mov     esi, OFFSET FLAT:.LC0
        mov     rdi, rax
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        pop     rax
        ret
_GLOBAL__sub_I__Z6squarei:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:std::__ioinit
        call    std::ios_base::Init::Init()
        pop     rax
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:std::__ioinit
        mov     edi, OFFSET FLAT:std::ios_base::Init::~Init()
        jmp     __cxa_atexit
foo2:
        .string "asdf"
foo:
        .string "asdf"
Example commit of giving string literals names in jni_generator.py:
 https://chromium-review.googlesource.com/578217
Interestingly, it seems the same problem does not exist for merged globals. The linker map has an entry with the name .L_MergedGlobals for each object file.
Note: #1 and #2 were with GCC. Here's one with clang (but same result): https://godbolt.org/g/mhpmEg
How much of .rodata is strings?

objdump -h libmonochrome.so| grep .rodata | awk '{print "dd if=libmonochrome.so of=rodata bs=1 count=$[0x" $3 "] skip=$[0x" $6 "]"}' | bash
strings rodata > rodata.strings
ls -l rodata*
-rw-r----- 1 agrieve eng 6194582 Jul 21 22:42 rodata
-rw-r----- 1 agrieve eng 3836093 Jul 21 22:42 rodata.strings

Let's consider these two declarations:

const char foo[] = "asdf";
const char* const bar = "asdf2";

The first one requires that the compiler dedicate five bytes to the string. Any uses of the string must load the address of those five bytes which often/typically requires no data accesses until the characters are examined.

The second one requires that the compiler dedicate sizeof(void*) bytes to a pointer, plus five bytes to the string. So, the extra space required is typically four or eight bytes. And, and uses of the string must load the pointer from the data segment before later loading the actual characters. This implies an extra cache miss.

Now, in some cases the compiler can optimize case two to be as efficient as case one. For instance, if the pointer is tagged as const and if it is not tagged as extern then the compiler knows that it has internal linkage and it can potentially avoid the pointer.

However the first case is fundamentally more efficient, and relies less on the optimizer and should be preferred. If there is a belief that case two is more efficient then I would be highly skeptical. Such a claim would need to be tested across multiple compilers. In particular, the extra level of indirection that case two implies will be unavoidable on any string pointers with external linkage - that is any that are non-const or are tagged with extern.

> Actually, it looks like using const char[] makes strings not participate in merging

This is probably because each array is supposed to have its own address, and merging would prevent that, therefore merging is not allowed. There may be a compiler switch to override that - I don't know. But, I'm not sure that optimizing for string merging is ideal, unless you know that strings are duplicated.

It is interesting that non-extern const pointers seem to be handled as efficiently by some compilers as arrays. We would need to test across multiple compilers and architectures before counting on that. Even then I worry that the guidance is too complex, given that non-const pointers are less efficient.

Plan of action:

When running nm on *.o, also look for strings:
E.g.: nm -p out/Debug/components/autofill/core/browser/browser/state_names.o
00000000 r .L.str
00000008 r .L.str.1
00000066 r .L.str.10
0000000b r .L__func__.evtag_init
...

The __fun__ ones come when the code uses "__func__"

Running: readelf -S out/Debug/components/autofill/core/browser/browser/state_names.o
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
...
[17] .rodata.str1.1    PROGBITS        00000000 00036c 0001f9 01 AMS  0   0  1
...


So, looking at the bytes within the .o at 00036c + 00000008 gives the string for .L.str.1

Note:
For .a files, readelf -S does similar to nm, and outputs for each .o within separated with a header:
File: ./base/third_party/libevent/libevent.a(event.o)
There are 149 section headers, starting at offset 0x50b4:

Note2: There can be multiple .rodata sections. E.g.:
  [295] .rodata.str1.1    PROGBITS        00000000 00290c 0005ab 01 AMS  0   0  1
  [296] .rodata.str1.8    PROGBITS        00000000 002eb8 000009 01 AMS  0   0  8


With this information, we can create a list of strings for each .o file.

There are now two approaches we could take:
1) Don't bother trying to figure out the final address within the .so. 
 * Instead, create a symbol for each string with the correct size.
 * Use a map to de-dupe identical strings so as to not double-count merged strings (by too much anyways)

2) Each subprocess will scan the .so file's **merge strings data to find their string
 * Record the actual address & for each string

In either case, symbol aliases should be created for merged strings.


I suspect #2 is feasible, so I'll attempt that first.
Project Member

Comment 9 by bugdroid1@chromium.org, Oct 18 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/92655b27a8dac362d27abf66ec9724ffd8a703b1

commit 92655b27a8dac362d27abf66ec9724ffd8a703b1
Author: Andrew Grieve <agrieve@chromium.org>
Date: Wed Oct 18 20:12:30 2017

Supersize: Break ** merge strings symbols into individual literals

* nm.py now uses multiprocessing.Pipe() and os.fork() rather than
  subprocess.Popen().
* nm.py now directly executable (for testing its logic on single .o/.a
  files).
* Adds "ReadStringLiterals()" console function for dumping values of
  literals.
* "supersize archive" now runs in 40s on my machine, up from 25s with
  --no-string-literals
* libmonochrome.apk.size file now 9.67mb, up from 9.30mb

Stats for libmonochrome.so:
Section .rodata: has 100.0% of 6184870 bytes accounted for from 172885 symbols. 0 bytes are unaccounted for.
* Padding accounts for 19684 bytes (0.3%)
* Contains 156446 string literals. Total size=2888602, padding=13275
* Contains 73269 aliases, mapped to 17774 unique addresses (308060 bytes)
* 550 symbols have shared ownership (285821 bytes)

Comparing .rodata symbols with & without string literals, these two
paths now show in the top 20:
15)   1879143 (30.4%) r@Group      45520 ui/gl/gl_bindings_autogen_gl.cc (count=1)
16)   1923623 (31.1%) r@Group      44480 components/autofill/core/browser/address_rewriter_rules.cc (count=2)

It works!

Bug:  746761 
Change-Id: I8aee0cafffc8f2941ea0ff7dd4e5f825b8165bc0
Reviewed-on: https://chromium-review.googlesource.com/721465
Commit-Queue: agrieve <agrieve@chromium.org>
Reviewed-by: Samuel Huang <huangs@chromium.org>
Cr-Commit-Position: refs/heads/master@{#509849}
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/archive.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/concurrent.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/concurrent_test.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/console.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/describe.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/integration_test.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/models.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/nm.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/Archive.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/Archive_Elf.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/Archive_OutputDirectory.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/Console.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/Csv.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/Diff_Basic.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/Diff_NullDiff.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/FullDescription.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/SymbolGroupMethods.golden
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/mock_output_directory/elf
[add] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/mock_output_directory/obj/base/base/page_allocator.o
[add] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/mock_output_directory/obj/third_party/icu/icuuc/ucnv_ext.o
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/mock_toolchain/mock_nm.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/mock_toolchain/mock_readelf.py
[modify] https://crrev.com/92655b27a8dac362d27abf66ec9724ffd8a703b1/tools/binary_size/libsupersize/testdata/test.map

Status: Fixed (was: Available)
Project Member

Comment 11 by bugdroid1@chromium.org, Oct 24 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/18bc8502b61bc45f17b1af4a2bead3e292359745

commit 18bc8502b61bc45f17b1af4a2bead3e292359745
Author: Andrew Grieve <agrieve@chromium.org>
Date: Tue Oct 24 03:22:03 2017

Supersize: Make Diff() smarter about aliases and string literals.

Before, adding a single string literal to a file would cause diffs to
say that all following literals have changed sizes. Now, it agressively
looks for symbols of the same name+size to match up, before falling back
on matching up symbols of different sizes.

This also changes the clustering of .symbols for diffs to group by path
aliases. It's useful to have path aliases ungrouped when querying by
path, but for symbol diffs, their low PSS cause inline symbols to appear
as many small symbols, rather than one symbol that just happened to be
defined in a header. Grouping by aliases of the same name fixes this for
diffs by having these symbols merged back into a single group.

Bug:  746761 , 776032 
Change-Id: Id871b5cc1e3aeb6d86d9e864088633a8d5a2f6fd
Reviewed-on: https://chromium-review.googlesource.com/731187
Commit-Queue: agrieve <agrieve@chromium.org>
Reviewed-by: Samuel Huang <huangs@chromium.org>
Cr-Commit-Position: refs/heads/master@{#511028}
[modify] https://crrev.com/18bc8502b61bc45f17b1af4a2bead3e292359745/tools/binary_size/libsupersize/console.py
[modify] https://crrev.com/18bc8502b61bc45f17b1af4a2bead3e292359745/tools/binary_size/libsupersize/describe.py
[modify] https://crrev.com/18bc8502b61bc45f17b1af4a2bead3e292359745/tools/binary_size/libsupersize/diff.py
[modify] https://crrev.com/18bc8502b61bc45f17b1af4a2bead3e292359745/tools/binary_size/libsupersize/models.py
[modify] https://crrev.com/18bc8502b61bc45f17b1af4a2bead3e292359745/tools/binary_size/libsupersize/testdata/Console.golden
[modify] https://crrev.com/18bc8502b61bc45f17b1af4a2bead3e292359745/tools/binary_size/libsupersize/testdata/Diff_Basic.golden
[modify] https://crrev.com/18bc8502b61bc45f17b1af4a2bead3e292359745/tools/binary_size/libsupersize/testdata/Diff_NullDiff.golden

Sign in to add a comment