New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 716209 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: May 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

LLD is 3.2x slower when linking Chrome with FullLTO than Gold

Project Member Reported by krasin@chromium.org, Apr 27 2017

Issue description

Chrome Version: tip
OS: Linux x86-64

What steps will reproduce the problem?
(1) Build chrome with Gold:

$ gn gen out/gold '--args=is_chrome_branded=true is_official_build=true is_debug=false is_cfi=true use_lld=false allow_posix_link_time_opt=true' --check
$ ninja -C out/gold chrome

(2) Build chrome with LLD:

$ gn gen out/lld '--args=is_chrome_branded=true is_official_build=true is_debug=false is_cfi=true use_lld=true allow_posix_link_time_opt=true' --check
$ ninja -C out/lld chrome

(3) Compare link times:

$ cat out/gold/.ninja_log | grep -i "chrome[^/._a-z-]"
{start_time_in_ms}    {end_time_in_ms} 0       chrome  3a8f93a3257da044
$ cat out/lld/.ninja_log | grep -i "chrome[^/._a-z-]"
{start_time_in_ms}    {end_time_in_ms} 0       chrome  3b1e8419b639d435


What is the expected result?

LLD and Gold have the same performance. If LLD is slightly faster, that's a nice bonus.

What happens instead?

LLD is 3.2x slower. It takes 5537 seconds vs 1704 in the Gold case.

 

Comment 1 by krasin@chromium.org, Apr 27 2017

Cc: ruiu@google.com h...@chromium.org kcc@chromium.org thakis@chromium.org p...@chromium.org
Summary: LLD is 3.2x slower when linking Chrome with FullLTO than Gold (was: LLD is 3.2x slower when linking Chrome)

Comment 2 by krasin@chromium.org, Apr 27 2017

Previous measurements didn't show anything wrong (Dec 2016): https://bugs.chromium.org/p/chromium/issues/detail?id=672158

Comment 3 by krasin@chromium.org, Apr 27 2017

Tested the performance of linking cc_unittests:

Gold: 280s
LLD: 322s

The 15% difference might be explained by the fact that Gold is linked with tcmalloc, and LLD is not. Last time is provided speed up of 25%, see  https://crbug.com/583551 , so LLD might be actually faster is linked with tcmalloc.

Which makes me wonder what's wrong with chrome target. The primary suspect is its size, and an O(n^2) algorithm somewhere inside may only be visible on larger targets.

I will build a ToT toolchain and try to collect profile for LLD.

Comment 4 by ruiu@google.com, Apr 27 2017

This seems still odd to me as LLD is usually 2x or even 3x faster than gold.

Comment 5 by krasin@chromium.org, Apr 27 2017

I am as surprised as you (if not more) as I have found this as a last-minute safety check before enabling LLD for all LTO configurations on Chrome Linux: https://codereview.chromium.org/2844883002/ (obviously, I decided not to do so)

Comment 6 by krasin@chromium.org, Apr 27 2017

I am working towards having a profile. Will take some time.

Comment 7 by krasin@chromium.org, Apr 28 2017

N^2 found. It's inside void lld::elf::writeMapFile<llvm::object::ELFType<(llvm::support::endianness)1, true> >(llvm::ArrayRef<lld::elf::OutputSection*>).

It's a recent regression in a sense that it only hits when -Map is specified *and* we have a lot of sections (true for LTO case, as we just enabled function and data sections)

Comment 8 by krasin@chromium.org, Apr 28 2017

Cc: agrieve@chromium.org
-Map was added in https://codereview.chromium.org/2726983004 by agrieve@

Comment 9 by krasin@chromium.org, Apr 28 2017

Function sections and data sections were enabled by pcc@ in https://codereview.chromium.org/2840723003/ (just 2 days ago)
Actions to be taken:

1. Rui (ruiu@) promised to implement a more optimal way to generate Map files that does not have O(n^2) complexity

2. We need to understand why does this map file is even needed. Andrew, can we avoid doing this by default?
I've written a tool for collecting binary size information from .map files:
https://cs.chromium.org/chromium/src/tools/binary_size/README.md

It allows diffing the set of symbols between two builds, and also running queries over the set of all symbols.

The original version of the tool gets all of the symbol data from the linker's .map file, but I recently found you can get most of it from nm as well. But still, what .map has that nm does not:
 - Entries for ** merge strings
 - Object path attribution for anonymous symbols (.map shows a symbol with object path)

Our focus so far has mostly been for Android, but it's been nice to verify that the tool works as well on Linux (although I haven't tested with lld). If this is blocking linux in any way though, disabling would be fine. If we ever get around to caring about Linux binary size, I may try an turn the .map generation back on, or just live with the slightly reduced accuracy of what nm provides.
As an aside (since I might have the right audience for this question): Does anyone know why linker maps contain symbols like:

 .text._ZN5blink18PerformanceMonitor4WillERKNS_5probe12UpdateLayoutE
                0xffffffffffffffff       0x2a obj/third_party/WebKit/Source/core/frame/libframe.a(PerformanceMonitor.o)
                0x00695e19                blink::PerformanceMonitor::Will(blink::probe::UpdateLayout const&)
 .text._ZN5blink9FrameView21PerformPreLayoutTasksEv
                0x006a454c      0x14c obj/third_party/WebKit/Source/core/frame/libframe.a(FrameView.o)
                0x006a454d                blink::FrameView::PerformPreLayoutTasks()

Looking through the code a bit here: https://github.com/gittup/binutils/blob/HEAD/gold/mapfile.cc
There's just a comment that "special handling" is required. 

There seems to be little consistency between builds as to where these symbols occur, and they come with large gaps. Furthermore, we had a recent perf alert ( bug 709050 ), where a gap grew by 60kb as a result of changing the order in which .o files are fed to the linker. 


The list of them for libchrome.so from my build today (via supersize tool):

t@0x6a454c   59146   third_party/WebKit/Source/core/frame/FrameView.cpp
             ** symbol gap 0
t@0xaa0a6c   34096   third_party/icu/source/i18n/nfsubs.cpp
             ** symbol gap 1
t@0xe949c0   672     device/bluetooth/bluetooth_remote_gatt_characteristic.cc
             ** symbol gap 2
t@0x1288cb0  760     v8/src/compiler/js-operator.cc
             ** symbol gap 3
t@0x1683fb8  29470   third_party/WebKit/Source/web/WebAXObject.cpp
             ** symbol gap 4
t@0x1a8f2c6  95038   third_party/libvpx/source/libvpx/vp9/encoder/vp9_multi_thread.c
             ** symbol gap 5
t@0x1e97878  83390   components/invalidation/impl/invalidation_logger.cc
             ** symbol gap 6
t@0x22aeb44  144152  third_party/WebKit/Source/core/input/TouchEventManager.cpp
             ** symbol gap 7

The second field is the size. Together, about 440kb! 

The nm output for address range of symbol gap 0 show:

00695e19 t blink::PerformanceMonitor::Will(blink::probe::UpdateLayout const&)
00696004 N $d
00697390 N $d
0069743c N $d
00697500 N $d
0069799c N $d
006979bc N $d
00697a74 N $d
00697d10 N $d
00697f34 N $d
00697f9c N $d
006984a8 N $d
006985c4 N $d
00698980 N $d
0069945c N $d
00699658 N $d
00699718 N $d
006998a8 N $d
0069995c N $d
006999cc N $d
0069a6fc N $d
0069a778 N $d
0069a908 N $d
0069a9c0 N $d
0069aaa4 N $d
0069b178 N $d
0069b578 N $d
0069b830 N $d
0069bb50 N $d
0069bbe8 N $d
0069c904 N $d
0069d1bc N $d
0069ea94 N $d
0069eb74 N $d
006a00d4 N $d
006a02c8 N $d
006a0990 N $d
006a1174 N $d
006a1edc N $d
006a21e4 N $d
006a2204 N $d
006a2238 N $d
006a28f0 N $d
006a29f0 N $d
006a2b40 N $d
006a2d74 N $d
006a30cc N $d
006a3298 N $d
006a3394 N $d
006a3490 N $d
006a34f8 N $d
006a3cd8 N $d
006a3d6c N $d
006a454c t $t
006a454d t blink::FrameView::PerformPreLayoutTasks()

According to nm's manpage, "N" is a debugging symbol. 

So.. yeah.. anyone have insights or know the best place to ask?

Comment 13 by p...@chromium.org, Apr 28 2017

Rui landed a fix for the map file generator in r301659.
Re: #12.

Andrew, this is from the regular, non-LTO build correct? (LTO = LinkTimeOptimization).

Comment 15 by ruiu@google.com, Apr 28 2017

I believe that will fix the issue that the feature significantly slowed down the build, but it is still better to remove the -Map option from the command line because 1) the existing script that reads -Map outputs will stop working after switching to LLD because LLD's -Map format is different from gold's, and 2) even if the -Map option is not very slow it is still fairly heavy option as it generates a hundreds megabytes of text file. Always adding that option is probably not a good idea.
Hi Rui, thank you for fixing it!

Yes, I am disabling it for now for exactly these reasons: https://codereview.chromium.org/2849583005/ (also, because we'll only get your fix only after the next Clang roll; want to make the switch to LLD before that)
re #15 - the size of the files isn't too bad since the build rules gzips it. For libchrome.so, the .map.gz is ~24MB for a 630MB .so file. The fact that it will break for lld is a good enough reason though.

re #14 - This is for an android build. So, gcc+gold without lto.

Comment 18 by ruiu@google.com, Apr 28 2017

Sorry, I meant for the linker (or any other program) it is not cheap to produce a hundreds of megabytes of text data (in the uncompressed form) containing tons of demangled symbols because symbol demangling is fairly complicated and slow.
Re #17: just an fyi, but chrome build with LinkTimeOptimization is 2.5 GB.

Re #18: can linker do this concurrently?

Comment 20 by ruiu@google.com, Apr 28 2017

Yes, and that's my most recent patch does.
Yikes! 2.5 GB is quite the jump from 225MB (which is what Android's currently is non-gzipped).

If lld is not going to follow the format of ld / gold's linker file, maybe this is an opportunity to make a smaller format? E.g. don't bother with demangling, separate out discarded sections as it's own --option, skip outputting debug sections from section map (maybe behind --option)
Re #20: I have only now taken a look at your fix (https://reviews.llvm.org/rL301659). Thank you for your forward thinking.

Re #25: this is before strip; after strip is around 105 MB, I believe. Will look up the real number in a sec.
From sizes step on the bot:
https://build.chromium.org/p/chromium.chrome/builders/Google%20Chrome%20Linux%20x64/builds/17583

RESULT chrome: chrome= 1922789048 bytes
RESULT chrome-stripped: stripped= 118788016 bytes
RESULT chrome-text: text= 112688145 bytes
RESULT chrome-data: data= 6094976 bytes
RESULT chrome-bss: bss= 1982577 bytes
RESULT chrome-si: initializers= 8 files
RESULT chrome-textrel: textrel= 0 relocs

It's 2 GB before strip and 118 MB after.

Comment 25 by ruiu@google.com, Apr 28 2017

Re #21: You can disable demangling by passing --no-demangle to the linker, but I'm not sure if you want it. I don't think we want to change the format only to reduce the size, as the format was chosen to be readable for human.
aaah, okay, I thought you were saying the .map file was 2.5GB. 
Re #26: no, I meant the binary size. Sorry for the confusion. :)
Project Member

Comment 28 by bugdroid1@chromium.org, Apr 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/317d9edab7ba87a5c05564ca8c66a12911c46872

commit 317d9edab7ba87a5c05564ca8c66a12911c46872
Author: krasin <krasin@chromium.org>
Date: Fri Apr 28 20:34:01 2017

Disable linker map for Clang toolchains.

Right now, we are transitioning to LLD, and LLD has a different
map format (so, the analysis tool will need to be updated), as well as
suboptimal implementation that adds 60 minutes to the Chrome link time.

This will be reenabled when both of the blockers above are resolved.

BUG= 716209 , 607968 

Review-Url: https://codereview.chromium.org/2849583005
Cr-Commit-Position: refs/heads/master@{#468113}

[modify] https://crrev.com/317d9edab7ba87a5c05564ca8c66a12911c46872/build/toolchain/gcc_toolchain.gni

Project Member

Comment 29 by bugdroid1@chromium.org, May 18 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0f24feacbffe77af7d69f64e739c73d834fa19b4

commit 0f24feacbffe77af7d69f64e739c73d834fa19b4
Author: estevenson <estevenson@chromium.org>
Date: Thu May 18 16:17:42 2017

Use a GN arg for controlling linker map file generation.

Map files aren't generated on Linux by default now that linker map file
generation is hooked into !use_lld ( https://crbug.com/716209 ).

This CL adds a GN arg, enable_linker_map, that defaults to true for
Android and makes it easy to generate map files on Linux.

BUG= 717550 , 716209 

Review-Url: https://codereview.chromium.org/2888623003
Cr-Commit-Position: refs/heads/master@{#472834}

[modify] https://crrev.com/0f24feacbffe77af7d69f64e739c73d834fa19b4/build/toolchain/gcc_toolchain.gni
[modify] https://crrev.com/0f24feacbffe77af7d69f64e739c73d834fa19b4/build/toolchain/toolchain.gni
[modify] https://crrev.com/0f24feacbffe77af7d69f64e739c73d834fa19b4/tools/binary_size/diagnose_bloat.py

Comment 30 by p...@chromium.org, May 18 2017

Cc: -p...@chromium.org
Owner: p...@chromium.org
Status: Fixed (was: Assigned)
Fixed as reported in https://bugs.chromium.org/p/chromium/issues/detail?id=607968#c80 .
Project Member

Comment 31 by bugdroid1@chromium.org, Nov 20

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9d63558169c2945762c05b2809200736a40dcee2

commit 9d63558169c2945762c05b2809200736a40dcee2
Author: Daniel Bratell <bratell@opera.com>
Date: Tue Nov 20 14:10:57 2018

Update some documentation for the binary_size tool

The documentation didn't include the crucial information
that generate_linker_map has to be set to true in gn for the tool
to work in Linux.

Bug:  716209 
Change-Id: I62919e26a80a0b5d592f6b8953762e35575142d2
Reviewed-on: https://chromium-review.googlesource.com/c/1338085
Reviewed-by: agrieve <agrieve@chromium.org>
Commit-Queue: Daniel Bratell <bratell@opera.com>
Cr-Commit-Position: refs/heads/master@{#609686}
[modify] https://crrev.com/9d63558169c2945762c05b2809200736a40dcee2/build/toolchain/toolchain.gni
[modify] https://crrev.com/9d63558169c2945762c05b2809200736a40dcee2/tools/binary_size/README.md
[modify] https://crrev.com/9d63558169c2945762c05b2809200736a40dcee2/tools/binary_size/libsupersize/archive.py

Sign in to add a comment