
Issue 636212


Issue metadata

Status: Fixed
Owner:
Closed: Jul 12
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocked on:
issue 647353

Blocking:
issue 616183
issue 634965
issue 702541




Sanitizer binaries are huge in size

Project Member Reported by infe...@chromium.org, Aug 10 2016

Issue description

See title.

Sanitizer build archives on all platforms are growing to a huge size. We need to see whether they can be optimized. We can create sub-bugs later; this bug tracks the meta issue.
 
E.g. the Windows archive is 2.5 GB in size.

Comment 2 by thakis@chromium.org, Aug 10 2016

Spun out from https://bugs.chromium.org/p/chromium/issues/detail?id=635715#c48:

I think it broke between r408781 and r409964.

asan-win32-release-408692.zip	2016-07-29 22:06:20	1708.49MB	408692	3a796de6b43352fb756aa1c9c24a8bc5fc481553
asan-win32-release-408734.zip	2016-07-30 00:04:56	1707.06MB	408734	a8e4ad8678760fe8d08c24ad1788347990042b34
asan-win32-release-408781.zip	2016-07-30 02:01:04	1707.06MB	408781	59e7b40948030815a49af887b922b370ba048b8b
asan-win32-release-409964.zip	2016-08-05 04:34:26	2555.79MB	409964	3a6cbe4c1dab38c5e9094a3ce6164b902c319bf2
asan-win32-release-409973.zip	2016-08-05 06:22:34	2555.75MB	409973	4f70f4aa155228044960e6521266c22e9f4e5539

Comment 3 by thakis@chromium.org, Aug 10 2016

Cc: dpranke@chromium.org
Labels: -Type-Bug Proj-GN-Migration Type-Bug-Regression
r408890 is the gn switch for the lkgr win asan bots, so this is likely due to that.

First step is probably to download a smaller and a larger zip and check if the larger one has more files, or if some file that's in both is larger. (Builds are archived at http://commondatastorage.googleapis.com/chromium-browser-asan/index.html?prefix=win32-release/)
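
A minimal Python sketch of that first step, for anyone who wants to script it. It assumes both zips have been downloaded locally and that entry names are directly comparable between the two archives (if each zip wraps its contents in a differently named top-level directory, that prefix would need stripping first). The function names are made up for illustration.

```python
import zipfile


def archive_sizes(zip_path):
    """Map each file name in the archive to its uncompressed size in bytes."""
    with zipfile.ZipFile(zip_path) as archive:
        return {info.filename: info.file_size
                for info in archive.infolist() if not info.is_dir()}


def compare_archives(small_sizes, large_sizes):
    """Compare two name -> size mappings.

    Returns (only_in_large, grown):
      only_in_large: sorted names present only in the larger archive.
      grown: (name, old_size, new_size) for files in both archives that got
             bigger, sorted by growth in bytes, largest first.
    """
    only_in_large = sorted(set(large_sizes) - set(small_sizes))
    grown = [(name, small_sizes[name], large_sizes[name])
             for name in small_sizes.keys() & large_sizes.keys()
             if large_sizes[name] > small_sizes[name]]
    grown.sort(key=lambda item: item[2] - item[1], reverse=True)
    return only_in_large, grown
```

Typical use would be `compare_archives(archive_sizes(small_zip), archive_sizes(large_zip))`, which answers both questions at once: extra files versus files that grew.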

Comment 4 by kcc@chromium.org, Aug 10 2016

Two ideas: 
* was full debug info enabled somewhere? 
* 'xz' may offer a much higher compression rate than zip 
Owner: mmoroz@chromium.org
Status: Assigned (was: Untriaged)
Max, can you check c#3 and the first point in c#4?

Comment 6 by mmoroz@chromium.org, Aug 15 2016

I see this significant difference:

r408781:
...
-rw-r----- 1 mmoroz eng  134457344 Jul 30 05:19 ipc_fuzzer.exe
-rw-r----- 1 mmoroz eng   19888128 Jul 30 05:16 ipc_fuzzer_replay.exe
-rw-r----- 1 mmoroz eng    1074688 Jul 30 05:16 ipc_message_list.exe
-rw-r----- 1 mmoroz eng   17195520 Jul 30 05:16 ipc_message_util.exe
...


r409964:
...
-rw-r----- 1 mmoroz eng  673590272 Aug  5 07:41 ipc_fuzzer.exe
-rw-r----- 1 mmoroz eng  649572352 Aug  5 07:41 ipc_fuzzer_replay.exe
-rw-r----- 1 mmoroz eng  647089664 Aug  5 07:41 ipc_message_list.exe
-rw-r----- 1 mmoroz eng  649726464 Aug  5 07:41 ipc_message_util.exe
...

Comment 7 by mmoroz@chromium.org, Aug 15 2016

The number of files in the "small" archive is actually bigger than the number of files in the "large" one, because the "small" one contains the following files:

-rw-r-----  1 mmoroz eng  175384064 Jul 30 04:59 v8_shell.exe
-rw-r-----  1 mmoroz eng        822 Jul 30 04:59 v8_shell.exe.assert.manifest
-rw-r-----  1 mmoroz eng        822 Jul 30 04:59 v8_shell.exe.manifest
-rw-r-----  1 mmoroz eng        109 Jul 30 04:59 v8_shell.exe.manifest.rc
-rw-r-----  1 mmoroz eng        888 Jul 30 04:59 v8_shell.exe.manifest.res
-rw-r-----  1 mmoroz eng  142381056 Jul 30 04:59 v8_shell.exe.pdb


while the "large" has only .exe and .pdb:
-rw-r-----  1 mmoroz eng  176911360 Aug  5 07:27 v8_shell.exe
-rw-r-----  1 mmoroz eng   14094336 Aug  5 07:27 v8_shell.exe.pdb

Comment 8 by mmoroz@chromium.org, Aug 15 2016

The same holds for the other executables as well; v8_shell is just an example.

Comment 9 by mmoroz@chromium.org, Aug 15 2016

A few more significant differences:

r408781:
...
-rw-r----- 1 mmoroz eng    1156608 Jul 30 05:03 blink_deprecated_test_plugin.dll
-rw-r----- 1 mmoroz eng     756736 Jul 30 05:03 blink_test_plugin.dll
...
-rw-r----- 1 mmoroz eng   13796352 Jul 30 05:16 ipc_message_dump.dll
...
-rw-r----- 1 mmoroz eng     121344 Jul 30 04:59 libEGL.dll
...


r409964:
...
-rw-r----- 1 mmoroz eng   48297984 Aug  5 07:27 blink_deprecated_test_plugin.dll
-rw-r----- 1 mmoroz eng   48059904 Aug  5 07:27 blink_test_plugin.dll
...
-rw-r----- 1 mmoroz eng  649198592 Aug  5 07:41 ipc_message_dump.dll
...
-rw-r----- 1 mmoroz eng     282624 Aug  5 07:22 libEGL.dll
...
-rw-r----- 1 mmoroz eng  131174912 Aug  5 07:27 ui_library.dll
...



Could it be that the binaries are being built with -g instead of -gline-tables-only?

Comment 11 by mmoroz@google.com, Aug 16 2016

I'll take a look into differences between those binaries in 1-2 hours.
.pdb files for blink and ipc in r409964 are also bigger than corresponding files in r408781.


Comment 13 by mmoroz@google.com, Aug 16 2016

Also there are huge differences in subdirectories.
win32_asan_different_dirs.png
25.8 KB

Comment 14 by mmoroz@google.com, Aug 16 2016

Regarding c#10, honestly I don't know how to check it.

I see 879 functions in blink_test_plugin.dll from r408781 and 19975 (!!!) functions in blink_test_plugin.dll from r409964.

Comment 15 by mmoroz@google.com, Aug 16 2016

Since we suspect the gn switch as a culprit, is it possible that we link too many things while building with gn?
I suggest you compare the lists of functions in the .dll files, so that it gives you the idea of where the extra code comes from. Maybe this is dead code, and we're just missing the option that should remove it? (Another cause of binary size mismatch could be disabling of ICF, but that shouldn't affect the symbol table size).
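
A rough Python sketch of such a comparison, assuming the function names have already been dumped to plain text with one symbol per line (that dump format is an assumption; nothing in this thread specifies it). The helper names are hypothetical.

```python
def read_symbols(path):
    """Read one symbol name per line, ignoring blank lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.strip() for line in handle if line.strip()]


def symbol_diff(old_symbols, new_symbols):
    """Return (added, removed) symbol names between two builds, each sorted."""
    old_set, new_set = set(old_symbols), set(new_symbols)
    return sorted(new_set - old_set), sorted(old_set - new_set)
```

The `added` list is the interesting one here: grouping it by namespace or source directory should hint at which libraries are being pulled in only by the gn build.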

Regarding the -gline-tables-only flag, I don't think compilation flags are present in the bot logs (at least I couldn't find them), so you'll probably need to generate the .ninja files locally and look the flags up.
The .dll files themselves don't contain function names. Those should be in the .pdb files. I don't know a good way to parse them on Linux, so I'll have to set up a build environment on my Windows desktop tomorrow and see what's going on.
It may sound silly, but my Windows workstation is still not ready. Yesterday I had to replace some hardware and re-image it. Today I'm waiting for the required MSVS version (it takes more than 1 day to obtain that).

I'll be OOO until next Wednesday.

Comment 19 by r...@chromium.org, Aug 18 2016

I think the new gn builds include a large number of object files:

$ fd obj
./clang_nacl_win64/obj
./clang_newlib_x64/obj
./clang_newlib_x86/obj
./clang_x64/obj
./glibc_x64/obj
./glibc_x86/obj
./irt_x64/obj
./irt_x86/obj
./nacl_win_as_x64/obj
./nacl_win_as_x86/obj
./newlib_pnacl/obj
./newlib_pnacl_nonsfi/obj

$ ll irt_x86/obj/url/url
total 2140
drwxr-xr-x 1 rnk 1049089      0 Aug  4 21:16 .
drwxr-xr-x 1 rnk 1049089      0 Aug  4 21:20 ..
-rw-r--r-- 1 rnk 1049089 234872 Aug  4 21:16 gurl.o
-rw-r--r-- 1 rnk 1049089 115200 Aug  4 21:16 origin.o
-rw-r--r-- 1 rnk 1049089 150364 Aug  4 21:16 scheme_host_port.o
-rw-r--r-- 1 rnk 1049089  79680 Aug  4 21:16 url_canon_etc.o
-rw-r--r-- 1 rnk 1049089  75844 Aug  4 21:16 url_canon_filesystemurl.o
-rw-r--r-- 1 rnk 1049089  77736 Aug  4 21:16 url_canon_fileurl.o
...

These files aren't present with gyp. Why aren't these object files in obj/? Where do we decide which files to include in the archived build?

Comment 20 by r...@chromium.org, Aug 18 2016

We need to update build/scripts/slave/recipe_modules/archive/api.py to exclude object file directories created by gn, which are no longer at the top-level of the build directory. It's a bit involved, because the existing filter only really works on the top level.
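
To illustrate the kind of filter meant here (this is not the actual code in archive/api.py, just a sketch under assumptions): a predicate that rejects a path if any of its parent directories is an object directory, regardless of depth. It assumes forward-slash relative paths and that the directories to skip are all named `obj`; both the function names and that name list are hypothetical.

```python
from pathlib import PurePosixPath


def is_object_path(relative_path, excluded_dir_names=("obj",)):
    """True if any parent directory of the path is an excluded directory."""
    parts = PurePosixPath(relative_path).parts
    # parts[:-1] are the directories; the last part is the file name itself.
    return any(part in excluded_dir_names for part in parts[:-1])


def filter_archive_paths(relative_paths, excluded_dir_names=("obj",)):
    """Keep only the paths that do not live under an excluded directory."""
    return [path for path in relative_paths
            if not is_object_path(path, excluded_dir_names)]
```

With this, `irt_x86/obj/url/gurl.o` is dropped just like a top-level `obj/...` path would be, which is the behavior a top-level-only filter misses.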
That's certainly possible, if you've been filtering things out in the past. The file layout for objects changed from GYP->GN, particularly with respect to NaCl objects.

Also, we know we build significantly more objects in the NaCl toolchains now than we did w/ GYP; in GYP we had separate copies of //base, //ipc, etc. in the build files that were kept minimal; in GN, we got rid of that to lower the overall build maintenance, at the cost of slightly increased cycle time and of counting on the linker to strip out the unneeded code.
A side note from bug 607627 regarding sizes of libfuzzer-based fuzzers:


Interestingly, debug builds are smaller than release ones, for example:

Release:
-rwxr-x--- 1 mmoroz eng 21014352 Aug 11 13:48 out/Release/icu_break_iterator_fuzzer

Debug:
-rwxr-x--- 1 mmoroz eng  2974824 Aug 11 13:59 out/Release/icu_break_iterator_fuzzer

Comment 23 by r...@chromium.org, Aug 25 2016

Cc: machenb...@chromium.org
+machenbach, who looks like the best owner for archive/api.py

Comment 24 by r...@chromium.org, Sep 7 2016

Is it still important that we filter object files out of the archived clusterfuzz builds? Do we need to reassign?
Blockedon: 647353
Filed issue 647353 for the api.py problem. It sounds like there's more than that here -- mmoroz, updates to comment 18?
Well, I don't think that the problem is caused by debug symbols. Debug symbols should be in the .pdb file (if we don't use the "/Z7" flag - I believe we don't):

r408781:
129M ipc_fuzzer.exe
100M ipc_fuzzer.exe.pdb

r409964:
643M ipc_fuzzer.exe
 59M ipc_fuzzer.exe.pdb

which is even smaller after the gn switch.

Unfortunately, I haven't found a convenient way to compare the contents of two .pdb files. Not sure if anybody on Earth knows how to parse/analyze them :(


I've compiled ipc_fuzzer.exe locally with the same configuration as the build bot (https://build.chromium.org/p/chromium.lkgr/builders/Win%20ASan%20Release/builds/2922/steps/generate_build_files/logs/stdio):
enable_ipc_fuzzer = true
is_asan = true
is_clang = true
is_component_build = false
is_debug = false
target_cpu = "x86"
use_goma = false
v8_enable_verify_heap = true

and got other sizes:
09/16/2016  02:14 PM       269,165,056 ipc_fuzzer.exe
09/16/2016  02:14 PM       421,990,400 ipc_fuzzer.exe.pdb


I've tried to compare the exact options passed to the compiler, but it looks like we don't have logs from the GYP time: https://build.chromium.org/p/chromium.lkgr/builders/Win%20ASan%20Release/builds/2669/steps/generate_build_files


I'm trying one more way to compare the symbols, will post an update soon.

Cc: zturner@chromium.org
We have a pretty good understanding of pdbs in llvm land. zturner, do we have any tools to dump contents of pdb files which are useful for understanding why one pdb is larger than another one?
I've opened both binaries in IDA Pro and loaded the PDBs as well. Extracted symbols are attached.

Roughly looks the same... Probably that confirms that debug symbols are not the issue.
408781_symbols.txt
4.1 MB
409964_symbols.txt
4.0 MB
Though there are many symbols, so I'm not sure about my statement that they are roughly the same.
I'm going to take a wild guess that one has a larger internal block size.
If you can make the pdbs available I'll take a look later this afternoon
All builds are here: http://commondatastorage.googleapis.com/chromium-browser-asan/index.html?prefix=win32-release/

I've looked into 408781 and 409964. Thanks!

Comment 33 by zturner@google.com, Sep 16 2016

I'm taking a look at the two builds you suggested.  There are some interesting differences.  The biggest is that the big pdb has a stream of "Fixup data" that is 86MB.  This stream is completely nonexistent in the small PDB.  The difference in size between the two is obviously less than 86MB, but there are some places in the small PDB where the streams are actually bigger than in the big PDB, and there are also MORE streams in the small PDB.

Anyway, this fixup data is almost certainly the culprit since it accounts for 80% of the entire file size of the big pdb.

Fixup data seems to be related to instrumented code.  For example, the `XFIXUP_DATA` structure in Microsoft's cvinfo.h header is defined like this:

typedef struct tagXFIXUP_DATA {
   unsigned short wType;
   unsigned short wExtra;
   unsigned long rva;
   unsigned long rvaTarget;
} XFIXUP_DATA;

The `rva` and `rvaTarget` fields seem to indicate that it's mapping an address from one binary's address space to another, which is something you might do after instrumenting some code.  

I don't know *why* 408781 contains fixup data and 409964 doesn't though.

Anyway, maybe this is a red herring.  409964 is the BIG build and 408781 is the SMALL build right?  

Yet the sizes of EXEs and PDBs seem to tell a different story:

(Don't shoot me for using Powershell)

PS D:\bigpdb\408781> Get-ChildItem *.pdb,*.exe | Measure-Object -property length -sum


Count    : 63
Average  :
Sum      : 5635425792
Maximum  :
Minimum  :
Property : Length

PS D:\bigpdb\409964> Get-ChildItem *.pdb,*.exe | Measure-Object -property length -sum


Count    : 65
Average  :
Sum      : 5362196992
Maximum  :
Minimum  :
Property : Length

So your size discrepancy is coming from somewhere else.

Comment 34 by zturner@google.com, Sep 16 2016

Ahh, I forgot to include DLLs.  The size is coming from DLLs.

PS D:\bigpdb\409964 (big)> Get-ChildItem *.dll | Measure-Object -property length -sum


Count    : 59
Average  :
Sum      : 3145553408
Maximum  :
Minimum  :
Property : Length


PS D:\bigpdb\408781 (small)> Get-ChildItem *.dll | Measure-Object -property length -sum


Count    : 56
Average  :
Sum      : 2180530688
Maximum  :
Minimum  :
Property : Length


So 409964 has 1GB extra of DLLs, while the sum of PDB and EXE is roughly the same.
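
For those not on PowerShell, a Python sketch of the same measurement that reports count and total size for every extension in one pass, so a category like DLLs is less easy to overlook. It only looks at the top level of the directory, like the commands above; the function names are made up for illustration.

```python
from collections import defaultdict
from pathlib import Path


def top_level_entries(directory):
    """Return (name, size_in_bytes) for each regular file directly in directory."""
    return [(path.name, path.stat().st_size)
            for path in Path(directory).iterdir() if path.is_file()]


def sizes_by_extension(entries):
    """Group (name, size) pairs by lowercased extension.

    Returns {extension: (file_count, total_bytes)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for name, size in entries:
        suffix = Path(name).suffix.lower()
        totals[suffix][0] += 1
        totals[suffix][1] += size
    return {suffix: tuple(values) for suffix, values in totals.items()}
```

Running `sizes_by_extension(top_level_entries(build_dir))` on both build directories and comparing the two dictionaries gives the per-extension difference directly.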

Comment 35 by zturner@google.com, Sep 16 2016

A ton of it seems to be coming from ipc_message_dump.dll.  It's 633MB in 409964, and 13MB in 408781.  Total difference is 500MB, which accounts for ~25% of the entire difference between the two builds.

Comment 36 by zturner@google.com, Sep 16 2016

Also you've got 1.2GB coming from these irt_x86 and irt_x64 directories, which aren't present in the smaller builds.  I think that accounts for roughly 90% of the increased size.

ipc_message_dump.dll - 600 / 2000 = 30%
irt* folders - 1200 / 2000 = 60%

Comment 37 by r...@chromium.org, Sep 16 2016

I'm pretty sure irt* is full of object files, aka issue 647353.
Cc: brucedaw...@chromium.org
Bruce, do comments 33 to 35 speak to you? Any idea why this might've changed? Possibly /INCREMENTAL?
13 MB to 633 MB (or is that 133 MB to 633 MB?). Either way, a huge jump.

The main cause of differences in .dll and .exe sizes from the gyp to gn transition usually turns out to be source sets. I'm currently working on figuring out which ones are causing gn's chrome.dll to be larger than gyp's, and ipc_message_dump.dll could be hitting the same problem.

Unfortunately there are hundreds of source sets, some have to stay that way, and it's hard to tell which ones you need to change to fix a particular size problem.

Another possibility is that /opt:icf or /opt:ref could be disabled. These tell the linker to discard redundant and unreachable blocks of code/data which can make an enormous difference to the size of binaries.

These two work together. The cost of linking in a lot of source sets is reduced if /opt:ref is used. I believe that /opt:ref is normally only on for official builds. I would definitely experiment with that.
Components: Build
Labels: -Proj-GN-Migration
Clearing the Proj-GN-Migration label since it didn't block the GN migration (I'm trying to figure out what, if any, GYP/GN-related tasks might be left).
Owner: brucedaw...@chromium.org
Assigning to Bruce as per c#39.
Blocking: 702541
Cc: mmoroz@chromium.org
Labels: -Pri-3 Pri-2
I think I happened to run into a possible cause for this just now. I just learned that `-gline-tables-only -gsplit-dwarf` has the same effect as `-g2 -gsplit-dwarf` (since http://llvm.org/viewvc/llvm-project?view=revision&revision=279687). If the order of the flags is the other way round, that's not the case. I'm guessing that the order of these two flags happened to flip during the gn switch.

The fix is to not pass -gsplit-dwarf in config("minimal_symbols"). I'll give that a try.
Ah no, we only pass -gsplit-dwarf in debug builds, and the asan bots are all release builds. Ah well. At least I'll make the debug compile bots faster.
brucedawson@, do you have any plan / ETA on this?
Status: Fixed (was: Assigned)
Apologies for not investigating this lately. I just grabbed the data (archive size versus commit position) from http://commondatastorage.googleapis.com/chromium-browser-asan/index.html?prefix=win32-release/ and graphed it and then grabbed key points:

Commit Pos	Size(MB)
408781	1707.1
409964	2555.8 - initial regression
...
433191	2958.3 - worst case
...
434178	2423.6
434216	1704.3 - big improvement!
...
443500	1724.9
443512	1105.9 - even more improvement
...
523446	1450.9 - gradual increase

The last data point is from 2017-12-12 - I don't know why they stop at this point.
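
A small Python sketch of how such key points can be picked out: flag consecutive data points whose size changed by more than some fraction. The data below is just the sampled table above; the 25% threshold and the function name are arbitrary choices for illustration.

```python
# (commit position, archive size in MB), copied from the table above.
SIZES_MB = [
    (408781, 1707.1),
    (409964, 2555.8),
    (433191, 2958.3),
    (434178, 2423.6),
    (434216, 1704.3),
    (443500, 1724.9),
    (443512, 1105.9),
    (523446, 1450.9),
]


def large_steps(points, threshold=0.25):
    """Return (prev_pos, pos, relative_change) for consecutive points whose
    size changed by more than `threshold` (as a fraction of the earlier size)."""
    steps = []
    for (prev_pos, prev_size), (pos, size) in zip(points, points[1:]):
        change = (size - prev_size) / prev_size
        if abs(change) > threshold:
            steps.append((prev_pos, pos, round(change, 3)))
    return steps
```

On this sample the initial regression and both improvements are flagged. The final gradual increase is flagged too, but only because the intermediate points are elided here; on the full per-build data it should not show up as a single step.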

For http://commondatastorage.googleapis.com/chromium-browser-asan/index.html?prefix=linux-debug/ I don't see any sudden increases, just gradual growth over time.

For http://commondatastorage.googleapis.com/chromium-browser-asan/index.html?prefix=linux-debug-v8-arm/ there is effectively no size growth over the last four years.

So, the original issue has been solved. I'm not sure when, but either some gn cleanup or packaging fixes or compiler fixes have done the job. I'm going to close as fixed.

We use the 64-bit builds on Windows now, which are archived in a different sub-directory:

http://commondatastorage.googleapis.com/chromium-browser-asan/index.html?prefix=win32-release_x64/
