New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 838449 link

Starred by 5 users

Issue metadata

Status: Fixed
Owner:
Closed: May 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocked on:
issue 841908

Blocking:
issue 792131



Sign in to add a comment

0.6%-1.5% regression in sizes when switching from link.exe to lld

Project Member Reported by m...@chromium.org, May 1 2018

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=838449

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=a029359c0480165ced62b8dcf54a89077cbb476727ef1b10804bb760425adc31


Bot(s) for this bug's original alert(s):

win
win-32

Comment 2 by m...@chromium.org, May 3 2018

Owner: h...@chromium.org
Status: Assigned (was: Untriaged)
hans: Looks like your linker change had a large step-up in the overall binary size. PTAL, or WontFix if this was expected and is acceptable.

Comment 4 by h...@chromium.org, May 4 2018

Blocking: 792131
Summary: 0.6%-1.5% regression in sizes when switching from link.exe to lld (was: 0.6%-1.5% regression in sizes at 553771:553777)
Thanks! The linker change was reverted shortly afterwards, but we'll need to figure out what's going on before landing the change again.

Comment 5 by p...@chromium.org, May 9 2018

Owner: p...@chromium.org
Taking a look.

Comment 6 by p...@chromium.org, May 10 2018

Can reproduce locally.

/mnt/c/src/c/src/out$ cat official_link/args.gn
use_lld = false
is_debug = false
is_official_build = true
strip_absolute_paths_from_debug_symbols = true
/mnt/c/src/c/src/out$ cat official_lld/args.gn
use_lld = true
is_debug = false
is_official_build = true
strip_absolute_paths_from_debug_symbols = true
/mnt/c/src/c/src/out$ ls -l official_l*/chrome_child.dll
-rwxrwxrwx 1 root root 76966912 May  9 16:34 official_link/chrome_child.dll
-rwxrwxrwx 1 root root 77565440 May  9 16:19 official_lld/chrome_child.dll
/mnt/c/src/c/src/out$ ls -l official_l*/chrome.dll
-rwxrwxrwx 1 root root 52494848 May  9 16:25 official_link/chrome.dll
-rwxrwxrwx 1 root root 53294592 May  9 14:38 official_lld/chrome.dll
/mnt/c/src/c/src/out$ ls -l official_l*/chrome.exe
-rwxrwxrwx 1 root root 1506816 May  9 16:34 official_link/chrome.exe
-rwxrwxrwx 1 root root 1489408 May  9 16:20 official_lld/chrome.exe

Comment 7 by p...@chromium.org, May 10 2018

The main cause is that our .pdata sections are much larger.

$ size -A official_*/chrome_child.dll
official_link/chrome_child.dll  :
section        size         addr
.text      61190316   6442455040
.rdata     12262174   6503649280
.data        669184   6515912704
.pdata      2167860   6518079488
.rodata        3840   6520250368
CPADinfo         56   6520254464
prot             37   6520258560
.rsrc         52376   6520262656
.reloc       617184   6520315904
Total      76963027


official_lld/chrome_child.dll  :
section        size         addr
.text      61261010   6442455040
.rdata     12218708   6503718912
.data        668160   6515941376
.pdata      2708616   6518108160
.00cfg           32   6520819712
.gfids         2820   6520823808
.rodata        3840   6520827904
.tls             45   6520832000
CPADinfo         56   6520836096
prot             41   6520840192
.rsrc         52376   6520844288
.reloc       644236   6520897536
Total      77559940


$ size -A official_*/chrome.dll
official_link/chrome.dll  :
section        size         addr
.text      42332808   6442455040
.rdata      7526084   6484791296
.data         69632   6492319744
.pdata      1946364   6493257728
CPADinfo         56   6495207424
prot             28   6495211520
.rodata        3584   6495215616
.rsrc        125984   6495219712
.reloc       486800   6495346688
Total      52491340


official_lld/chrome.dll  :
section        size         addr
.text      42386103   6442455040
.rdata      7541052   6484844544
.data         68608   6492389376
.pdata      2652444   6493339648
.00cfg           32   6495993856
.gfids         2792   6495997952
.rodata        3584   6496002048
.tls             37   6496006144
CPADinfo         56   6496010240
prot             32   6496014336
.rsrc        125976   6496018432
.reloc       508872   6496145408
Total      53289588

Now to figure out why that is.

Comment 8 by p...@chromium.org, May 10 2018

At least one reason is that we don't ICF pdata sections when ICFing text sections.

C:\src\tmp\pdata-test>type 1.cpp
void g();

void f1() {
  g();
}

void f2() {
  g();
}

C:\src\tmp\pdata-test>type 2.cpp
void g() {}

C:\src\tmp\pdata-test>cl /c /EHsc 1.cpp 2.cpp /Gy
Microsoft (R) C/C++ Optimizing Compiler Version 19.12.25835 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

1.cpp
2.cpp
Generating Code...

C:\src\tmp\pdata-test>link /noentry /opt:icf 1.obj  2.obj /dll /map:1.map /include:?f1@@YAXXZ /include:?f2@@YAXXZ
Microsoft (R) Incremental Linker Version 14.12.25835.0
Copyright (C) Microsoft Corporation.  All rights reserved.


C:\src\tmp\pdata-test>wsl grep f[12] 1.map
 0001:00000000       ?f2@@YAXXZ                 0000000180001000 f   1.obj
 0001:00000000       ?f1@@YAXXZ                 0000000180001000 f   1.obj
 0002:0000007c       $unwind$?f2@@YAXXZ         000000018000207c     1.obj
 0002:0000007c       $unwind$?f1@@YAXXZ         000000018000207c     1.obj
 0003:00000000       $pdata$?f2@@YAXXZ          0000000180003000     1.obj
 0003:00000000       $pdata$?f1@@YAXXZ          0000000180003000     1.obj

C:\src\tmp\pdata-test>..\..\l\ra\bin\lld-link /noentry /opt:icf 1.obj  2.obj /dll /lldmap:1l.map /include:?f1@@YAXXZ /include:?f2@@YAXXZ

C:\src\tmp\pdata-test>wsl grep f[12] 1l.map
00001000 000000f2  4096 .text
00001000 00000000     0                 "void __cdecl f1(void)" (?f1@@YAXXZ)
00001000 00000000     0                 "void __cdecl f2(void)" (?f2@@YAXXZ)
00002100 00000000     0                 $unwind$?f1@@YAXXZ
00002100 00000000     0                 $unwind$?f2@@YAXXZ
00004000 00000000     0                 $pdata$?f1@@YAXXZ
0000400c 00000000     0                 $pdata$?f2@@YAXXZ

Comment 9 by p...@chromium.org, May 10 2018

Cc: ruiu@google.com r...@chromium.org
If we fix this, we should try to fix pr35337 at the same time.

Comment 10 by p...@chromium.org, May 10 2018

I have a fix in progress for pdata. With that the chrome_child.dll size looks like this:

chrome_child.dll  :
section        size         addr
.text      61261218   6442455040
.rdata     12218788   6503718912
.data        668160   6515941376
.pdata      2176392   6518108160
.00cfg           32   6520287232
.gfids         2820   6520291328
.rodata        3840   6520295424
.tls             45   6520299520
CPADinfo         56   6520303616
prot             41   6520307712
.rsrc         52376   6520311808
.reloc       644236   6520365056
Total      77028004

So we're much closer to link.exe's size. Our .text is still 70KB larger, and our .reloc is 27KB larger for some reason, but we're getting there.

Comment 12 by p...@chromium.org, May 10 2018

10KB in chrome_child.dll .text: https://reviews.llvm.org/D46673
Blockedon: 841908

Comment 14 by p...@chromium.org, May 10 2018

With the above changes, link.exe's PDB publics stream (i.e. symbol table) has about the same number of unique entries as ours:

/mnt/c/src/c/src/out$ ../../../l/ra/bin/llvm-pdbutil.exe dump -publics official_lld/chrome_child.dll.pdb | grep S_PUB32 | cut -d'`' -f2 | sort | uniq | wc -l
408814
/mnt/c/src/c/src/out$ ../../../l/ra/bin/llvm-pdbutil.exe dump -publics official_link/chrome_child.dll.pdb | grep S_PUB32 | cut -d'`' -f2 | sort | uniq | wc -l
409203

But theirs has significantly fewer unique addresses:

/mnt/c/src/c/src/out$ ../../../l/ra/bin/llvm-pdbutil.exe dump -publics official_lld/chrome_child.dll.pdb | grep 'addr =' | cut -d, -f2 | sort | uniq  | wc -l
336541
/mnt/c/src/c/src/out$ ../../../l/ra/bin/llvm-pdbutil.exe dump -publics official_link/chrome_child.dll.pdb | grep 'addr =' | cut -d, -f2 | sort | uniq  | wc -l
331031

Hypothesis: link.exe is ICFing more than we do.

Comment 15 by p...@chromium.org, May 10 2018

It looks like link.exe is ICFing vtables (i.e. sections with leader symbol names prefixed with "??_7").

Comment 16 by p...@chromium.org, May 10 2018

With a patch that enables ICF on vtables our size now looks like this:

chrome_child.dll  :
section        size         addr
.text      61188978   6442455040
.rdata     12076884   6503645184
.data        668160   6515724288
.pdata      2167284   6517891072
.00cfg           32   6520061952
.gfids         2820   6520066048
.rodata        3840   6520070144
.tls             45   6520074240
CPADinfo         56   6520078336
prot             41   6520082432
.rsrc         52376   6520086528
.reloc       611260   6520139776
Total      76771776

So we're now 200KB smaller than link.exe.

Comment 17 by ruiu@google.com, May 10 2018

That we produce smaller executable than link.exe might be a signal that we are doing something more aggressive than link.exe, though as long as all tests pass, I'm not too worried.

Comment 18 by p...@chromium.org, May 10 2018

The smaller rodata is most likely due to string tail merging (i.e. issue 810154).

Comment 19 by ruiu@google.com, May 10 2018

Ah, that makes sense. Nice.

Comment 20 by p...@chromium.org, May 10 2018

ICF on vtables: https://reviews.llvm.org/D46734
Awesome progress.

If you still have your build directories around, how does mini_installer size compare now? (Since that's the bits we ship to users for the initial install, that's probably the size we care about most. mini_installer.exe mostly contains of a 7zip-compressed resource containing setup.exe)

Comment 22 by p...@chromium.org, May 11 2018

Sizes:

 Directory of C:\src\c\src\out\official_lld

05/11/2018  12:26 PM        46,179,840 mini_installer.exe

 Directory of C:\src\c\src\out\official_link

05/11/2018  12:16 PM        45,998,080 mini_installer.exe

So our mini_installer is still around 180KB larger. That's strange because all of chrome.dll/chrome_child.dll/chrome.exe/setup.exe are smaller uncompressed. Are there any other files that I should be looking at?

Comment 23 by p...@chromium.org, May 11 2018

The full list of files seems to be in chrome.7z, and almost all of them are smaller uncompressed (one is larger by a few bytes). The totals are:

/mnt/c/src/c/src/out/official_link$ ls -l chrome.7z
-rwxrwxrwx 1 root root 188779025 May 11 12:14 chrome.7z
/mnt/c/src/c/src/out/official_lld$ ls -l chrome.7z
-rwxrwxrwx 1 root root 188378273 May 11 12:54 chrome.7z

Here's what they look like after compression:

/mnt/c/src/c/src/out/official_link$ ls -l chrome.packed.7z
-rwxrwxrwx 1 root root 45108820 May 11 12:15 chrome.packed.7z
/mnt/c/src/c/src/out/official_lld$ ls -l chrome.packed.7z
-rwxrwxrwx 1 root root 45305317 May 11 12:55 chrome.packed.7z

So it does indeed seem to be a compression issue. I compressed chrome_child.dll on its own using the same flags as the script that created chrome.packed.7z, and the compressed file was 80KB larger.

/mnt/c/src/c/src/out/official_lld$ 7z a -t7z  -m0=BCJ2 -m1=LZMA:d27:fb128 -m2=LZMA:d22:fb128:mf=bt2 -m3=LZMA:d22:fb128:mf=bt2  -mb0:1  -mb0s1:2  -mb0s2:3 chrome_child.7z chrome_child.dll

7-Zip [64] 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,40 CPUs)
Scanning

Creating archive chrome_child.7z

Compressing  chrome_child.dll

Everything is Ok
/mnt/c/src/c/src/out/official_lld$ ls -l chrome_child.dll
-rwxrwxrwx 1 root root 76776448 May 11 12:53 chrome_child.dll
/mnt/c/src/c/src/out/official_lld$ ls -l chrome_child.7z
-rwxrwxrwx 1 root root 22919759 May 11 13:41 chrome_child.7z
/mnt/c/src/c/src/out/official_lld$ cd ../official_link
/mnt/c/src/c/src/out/official_link$ 7z a -t7z  -m0=BCJ2 -m1=LZMA:d27:fb128 -m2=LZMA:d22:fb128:mf=bt2 -m3=LZMA:d22:fb128:mf=bt2  -mb0:1  -mb0s1:2  -mb0s2:3 chrome_child.7z chrome_child.dll

7-Zip [64] 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,40 CPUs)
Scanning

Creating archive chrome_child.7z

Compressing  chrome_child.dll

Everything is Ok
/mnt/c/src/c/src/out/official_link$ ls -l chrome_child.7z
-rwxrwxrwx 1 root root 22837535 May 11 13:42 chrome_child.7z
/mnt/c/src/c/src/out/official_link$ ls -l chrome_child.dll
-rwxrwxrwx 1 root root 76966912 May  9 16:34 chrome_child.dll

I'm not aware of any major differences in output at the moment, with the exception of string tail merging. Let's see what happens if I turn that off.

Comment 24 by p...@chromium.org, May 11 2018

Disabling string tail merging does this to the size:

05/11/2018  02:19 PM        46,039,040 mini_installer.exe
05/11/2018  02:18 PM       188,669,211 chrome.7z
05/11/2018  02:19 PM        45,142,913 chrome.packed.7z

So it helps with compressed size: our mini_installer is still larger, but only by 40KB. Our uncompressed size is still smaller by 100KB.

For now I think we should put string tail merging behind a flag and then disable it in Chrome. Later it may be possible to reimplement string tail merging in a way that doesn't disturb section layout as much as the current implementation does, which should hopefully improve compressibility.

Another thing is that we still have more unique addresses in our publics stream than link.exe (currently 331910 in chrome_child.dll) so we may be able to improve ICF to hopefully get us some of that last 40KB.

Comment 25 by p...@chromium.org, May 11 2018

Tail merging flag: https://reviews.llvm.org/D46780

Comment 26 by p...@chromium.org, May 11 2018

To help me find missing ICF opportunities I wrote a program (attached) that takes llvm-pdbutil output and prints the names of symbols with non-unique addresses. One of the things that it told me was that link.exe is ICFing sections with different alignments, and we aren't. It's an easy fix (and was already implemented in the ELF linker) so I should have a patch soon.
main.go
832 bytes View Download

Comment 27 by p...@chromium.org, May 12 2018

Max alignment: https://reviews.llvm.org/D46786

Sizes look like this now:

05/11/2018  04:59 PM        46,011,904 mini_installer.exe
05/11/2018  04:58 PM       188,609,638 chrome.7z
05/11/2018  04:59 PM        45,115,355 chrome.packed.7z

Comment 28 by p...@chromium.org, May 12 2018

That puts us at 330562 unique addresses, which is less than link.exe.

Comment 29 by p...@chromium.org, May 14 2018

All patches have landed as of r332273.

Comment 30 by h...@chromium.org, May 16 2018

I looked at builds with the latest Clang roll (crrev.com/559001) to see where we're at.

args.gn:

is_chrome_branded = true
is_debug = false
is_official_build = true
strip_absolute_paths_from_debug_symbols = true
use_goma = true
use_lld = false

link.exe: 05/16/2018  10:57 AM        50,569,216 mini_installer.exe
lld:      05/16/2018  11:05 AM        50,739,200 mini_installer.exe


Same builds args.gn with an added 'target_cpu = "x86"':

link.exe: 05/16/2018  11:31 AM        51,068,416 mini_installer.exe
lld:      05/16/2018  11:41 AM        51,290,112 mini_installer.exe


Seems a bit unfair given that our DLLs are actually smaller, but trying to emit fluffy but compressible binaries also seems weird so I'm not sure what else we should do..

Comment 31 by brat...@opera.com, May 16 2018

Is the section ordering very different with link.exe and lld? Worse compression can be that similar blocks are further apart in the binary.

But it is not fair considering compression tools are optimized to do as good job as possible on the exact kind of binaries that cl.exe+link.exe generates. Also the constants in CompressUsingLZMA in chrome/tools/build/win/create_installer_archive.py are chosen to work as well as possible with the binaries at the time (2011 judging from the git history).

Comment 32 by h...@chromium.org, May 16 2018

Re-running the measurements with string tail merging disabled as suggested by Nico:

diff --git a/build/config/win/BUILD.gn b/build/config/win/BUILD.gn
index ee8e5e833b0d..1fc409771b13 100644
--- a/build/config/win/BUILD.gn
+++ b/build/config/win/BUILD.gn
@@ -118,6 +118,11 @@ config("compiler") {
       "/FIXED:NO",
     ]

+    if (use_lld) {
+      # XXX
+      ldflags += [ "/OPT:NOLLDTAILMERGE" ]
+    }
+
     # TODO(siggi): Is this of any use anymore?
     # /PROFILE ensures that the PDB file contains FIXUP information (growing the
     # PDB file by about 5%) but does not otherwise alter the output binary. It


64-bit:

link.exe: 05/16/2018  02:49 PM        50,586,624 mini_installer.exe
lld:      05/16/2018  02:59 PM        50,609,664 mini_installer.exe


32-bit:

link.exe: 05/16/2018  11:31 AM        51,068,416 mini_installer.exe
lld:      05/16/2018  01:50 PM        51,156,992 mini_installer.exe
Project Member

Comment 33 by bugdroid1@chromium.org, May 17 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9a07080c87ed7c5ed7dc450c2a7a842fa3ada6e9

commit 9a07080c87ed7c5ed7dc450c2a7a842fa3ada6e9
Author: Hans Wennborg <hans@chromium.org>
Date: Thu May 17 08:45:37 2018

Disable string tail merging when linking with lld

String tail merging leads to smaller binaries, but they don't compress
as well, leading to increased mini_installer size.

This reduces mini_installer.exe size by 130 KB and increases chrome.dll
and chrome_child.dll by 72 KB and 154 KB, respectively for 32-bit (64-bit
is in the same ballpark).

(This depends on Clang/LLD being rolled past r332149.)

Bug:  838449 
Change-Id: I1653e8d522371edb263f742f55ca1aa27ed5dd62
Reviewed-on: https://chromium-review.googlesource.com/1061733
Commit-Queue: Hans Wennborg <hans@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#559459}
[modify] https://crrev.com/9a07080c87ed7c5ed7dc450c2a7a842fa3ada6e9/build/config/win/BUILD.gn

Comment 34 by h...@chromium.org, May 17 2018

Status: Fixed (was: Assigned)
Let's call this fixed.
Cc: thakis@chromium.org sullivan@chromium.org grt@chromium.org npm@chromium.org
 Issue 844607  has been merged into this issue.

Sign in to add a comment