New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 670397 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Dec 2016
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocking:
issue 670392



Sign in to add a comment

ClangToTLinuxLLD got slower by 25-80%

Project Member Reported by thakis@chromium.org, Dec 1 2016

Issue description

Last green build: https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2108  Compile takes 4h42 min.

Then the llvm build was broken for two builds, and starting at https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2111 the compile step is taking between 5h40min and 8h (https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2115)

llvm regression range r288198:288231.

The only lld change in that range is r288228, so it's probably due to that.
 
Blocking: 670392

Comment 2 by ruiu@google.com, Dec 1 2016

Did they just time out? I didn't see error messages in the logs.
Yes, I think they just timed out. The build steps definitely ran for much longer than previously before stopping. (It's a bit suspicious that they all stopped at different times though.)
sanitizer-x86_64-linux-bootstrap now consistently timesout: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/323

I believe it's related to LLD, based on the error message:
[2545/2558] Linking CXX executable bin/lld

command timed out: 1200 seconds without output, attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=2422.509514

Note: i have not really checked which linker is being used by the bots (on it), but it's at the very least timing out linking lld. :)
Please, ignore #4. The bot had a few clang++ processes, some of which have been running for 90+ minutes. Not sure how it has got into the state, but it's clear that LLD has nothing to do with that (at least, I see no connections to the lld being slow on itself).

Comment 6 by ruiu@google.com, Dec 2 2016

I think this is a real performance regression that the new data structure works very poorly for some input. I'm working on fixing it.
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161128/409305.html / seem to address the slowdown, thanks! However, the bot can't verify the fix since the bot now has lld crashing (very quickly though):

https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2117/steps/compile/logs/stdio

[484/40468] LINK ./dump_syms
FAILED: dump_syms 
../../third_party/llvm-build/Release+Asserts/bin/clang++ -Wl,--fatal-warnings -fPIC -Wl,-z,noexecstack -Wl,-z,now -Wl,-z,relro -Wl,-z,defs -Wl,--no-as-needed -lpthread -Wl,--as-needed -fuse-ld=lld -Wl,--icf=all -m64 -pthread -Wl,-O1 -Wl,--gc-sections --sysroot=../../build/linux/debian_wheezy_amd64-sysroot -L/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/lib/x86_64-linux-gnu -Wl,-rpath-link=/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/lib/x86_64-linux-gnu -L/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib/x86_64-linux-gnu -Wl,-rpath-link=/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib/x86_64-linux-gnu -L/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6 -Wl,-rpath-link=/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6 -L/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib -Wl,-rpath-link=/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib -Wl,-rpath-link=. -Wl,--disable-new-dtags -Wl,-rpath=\$ORIGIN/. -Wl,-rpath-link=. -Werror -o "./dump_syms" -Wl,--start-group @"./dump_syms.rsp"  -Wl,--end-group  -ldl -lrt 
ld.lld: /b/c/b/ClangToTLinuxLLD/src/third_party/llvm/tools/lld/ELF/ICF.cpp:256: {anonymous}::ICF<ELFT>::variableEq(const lld::elf::InputSection<ELFT>*, llvm::ArrayRef<U>, const lld::elf::InputSection<ELFT>*, llvm::ArrayRef<U>) [with RelTy = llvm::object::Elf_Rel_Impl<llvm::object::ELFType<(llvm::support::endianness)1u, true>, true>; ELFT = llvm::object::ELFType<(llvm::support::endianness)1u, true>]::__lambda10: Assertion `X->Color[Idx] != 0 && Y->Color[Idx] != 0' failed.
#0 0x00000000006a2fc5 llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x6a2fc5)
#1 0x00000000006a10ce llvm::sys::RunSignalHandlers() (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x6a10ce)
#2 0x00000000006a1232 SignalHandler(int) (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x6a1232)
#3 0x00007f1bd8782330 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x10330)
#4 0x00007f1bd7376c37 gsignal /build/eglibc-oGUzwX/eglibc-2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56:0
#5 0x00007f1bd737a028 abort /build/eglibc-oGUzwX/eglibc-2.19/stdlib/abort.c:91:0
#6 0x00007f1bd736fbf6 __assert_fail_base /build/eglibc-oGUzwX/eglibc-2.19/assert/assert.c:92:0
#7 0x00007f1bd736fca2 (/lib/x86_64-linux-gnu/libc.so.6+0x2fca2)
#8 0x0000000000552821 (anonymous namespace)::ICF<llvm::object::ELFType<(llvm::support::endianness)1, true> >::segregate((anonymous namespace)::Range*, bool)::{lambda(lld::elf::InputSection<llvm::object::ELFType<(llvm::support::endianness)1, true> >*)#1}::operator()(lld::elf::InputSection<llvm::object::ELFType<(llvm::support::endianness)1, true> >*) const (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x552821)
#9 0x00000000005586b3 (anonymous namespace)::ICF<llvm::object::ELFType<(llvm::support::endianness)1, true> >::segregate((anonymous namespace)::Range*, bool) (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x5586b3)
#10 0x0000000000558b61 void lld::elf::doIcf<llvm::object::ELFType<(llvm::support::endianness)1, true> >() (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x558b61)
#11 0x0000000000549559 void lld::elf::LinkerDriver::link<llvm::object::ELFType<(llvm::support::endianness)1, true> >(llvm::opt::InputArgList&) (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x549559)
#12 0x000000000049aea2 lld::elf::LinkerDriver::main(llvm::ArrayRef<char const*>, bool) (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x49aea2)
#13 0x000000000054cf7c lld::elf::link(llvm::ArrayRef<char const*>, bool, llvm::raw_ostream&) (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x54cf7c)
#14 0x000000000049a16c main (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x49a16c)
#15 0x00007f1bd7361f45 __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:321:0
#16 0x00000000004fe1f2 _start (/b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld+0x4fe1f2)
Stack dump:
0.	Program arguments: /b/c/b/ClangToTLinuxLLD/src/out/Release/../../third_party/llvm-build/Release+Asserts/bin/ld.lld --sysroot=../../build/linux/debian_wheezy_amd64-sysroot -z relro --hash-style=gnu --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o ./dump_syms ../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crt1.o ../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crti.o ../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6/crtbegin.o -L/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/lib/x86_64-linux-gnu -L/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib/x86_64-linux-gnu -L/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6 -L/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib -L../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6 -L../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu -L../../build/linux/debian_wheezy_amd64-sysroot/lib/x86_64-linux-gnu -L../../build/linux/debian_wheezy_amd64-sysroot/lib/../lib64 -L../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/x86_64-linux-gnu -L../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6/../../.. -L../../build/linux/debian_wheezy_amd64-sysroot/lib -L../../build/linux/debian_wheezy_amd64-sysroot/usr/lib --fatal-warnings -z noexecstack -z now -z relro -z defs --no-as-needed -lpthread --as-needed --icf=all -O1 --gc-sections -rpath-link=/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/lib/x86_64-linux-gnu -rpath-link=/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib/x86_64-linux-gnu -rpath-link=/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6 -rpath-link=/b/c/b/ClangToTLinuxLLD/src/build/linux/debian_wheezy_amd64-sysroot/usr/lib -rpath-link=. --disable-new-dtags -rpath=$ORIGIN/. -rpath-link=. --start-group obj/breakpad/dump_syms/bytereader.o obj/breakpad/dump_syms/dwarf2diehandler.o obj/breakpad/dump_syms/dwarf2reader.o obj/breakpad/dump_syms/elf_reader.o obj/breakpad/dump_syms/dwarf_cfi_to_module.o obj/breakpad/dump_syms/dwarf_cu_to_module.o obj/breakpad/dump_syms/dwarf_line_to_module.o obj/breakpad/dump_syms/language.o obj/breakpad/dump_syms/crc32.o obj/breakpad/dump_syms/dump_symbols.o obj/breakpad/dump_syms/elf_symbols_to_module.o obj/breakpad/dump_syms/elfutils.o obj/breakpad/dump_syms/file_id.o obj/breakpad/dump_syms/linux_libc_support.o obj/breakpad/dump_syms/memory_mapped_file.o obj/breakpad/dump_syms/module.o obj/breakpad/dump_syms/stabs_reader.o obj/breakpad/dump_syms/stabs_to_module.o obj/breakpad/dump_syms/dump_syms.o --end-group -ldl -lrt -lstdc++ -lm -lgcc_s -lgcc -lpthread -lc -lgcc_s -lgcc ../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6/crtend.o ../../build/linux/debian_wheezy_amd64-sysroot/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crtn.o 
clang-4.0: error: unable to execute command: Aborted (core dumped)
clang-4.0: error: linker command failed due to signal (use -v to see invocation)


This started happening in https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2117 so the LLVM regression range for that crash is r288364:288435


The failing assert was added in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161128/409181.html / r288419, so it's due to that, but the commit message says this shouldn't happen, so the bug is probably somewhere else.

Comment 8 by ruiu@google.com, Dec 2 2016

That was caused by a wrong assert(). Committed r288511 to fix it.
Project Member

Comment 9 by bugdroid1@chromium.org, Dec 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7b18ff885c40874f8e4a417ad03376161b19de0a

commit 7b18ff885c40874f8e4a417ad03376161b19de0a
Author: ruiu <ruiu@google.com>
Date: Sat Dec 03 02:20:25 2016

Disable threads on lld trunk bots to see if they hurt build time.

BUG= 670397 

Review-Url: https://codereview.chromium.org/2550883002
Cr-Commit-Position: refs/heads/master@{#436139}

[modify] https://crrev.com/7b18ff885c40874f8e4a417ad03376161b19de0a/build/config/compiler/BUILD.gn

Comment 10 by ruiu@google.com, Dec 4 2016

Now the bot is green, but compile step still takes 47 minutes longer than before. Compare  https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2139 and https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2108.

I landed a patch to LLD that should improve performance for single-thread execution (https://reviews.llvm.org/rL288620) to see how it will work on the bot.
https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2135 and https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2136 were after the crash fix r288511. Builds there (just chromium's compile step) take 6h10 min. https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2137 is the first build with multithreading disabled again, build time there is 5h56

https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2138 has 5h30 and it stayed in that ballpark since. From https://build.chromium.org/p/chromium.fyi/stats/ClangToTLinuxLLD

https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2141 has your 288620 (look for "got_clang_revision" on that page) and it takes 5h56 again. I'm not sure if that means that your patch hurt or if that's just noise. We'll know tomorrow once there have been a few more builds.

(I also noticed on the stats page that runhooks time (that's where we build llvm and run all its tests) got a lot worse, but I'm not seeing that on the other bot. But I could see how multithreaded lld could be slower when running many tests in parallel. The last fast build was https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2111 with 35m runhooks at llvm 288231; https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2117 and later has much slower runhooks with 55m at llvm 288435. The range does include the lld parallelization commit.)
Screen Shot 2016-12-04 at 7.05.22 PM.png
69.3 KB View Download
(runhooks graph)
Screen Shot 2016-12-04 at 7.16.07 PM.png
82.8 KB View Download

Comment 13 by ruiu@google.com, Dec 5 2016

With threading disabled, ICF's algorithm is the same as before, so I don't really understand what's going on in the bot. Can I login to the bot and run the command myself to see which pass in the linker specifically takes too much time?
Yes. krasin@ sits close to you and knows how to do this well, he can probably show you how.
Hmm, I think the tot bots don't actually run llvm tests, so tests getting slower due to lld changing wouldn't explain runhooks slowdowns on the bot. https://build.chromium.org/p/chromium.fyi/stats/ClangToTLinuxLLD has a bunch of other things getting slower, e.g. update_scripts, isolate_tests, so maybe the remaining slowdown is in fact independent of lld. If single-thread behavior is identical to the previous behavior (including memory use), then we could handwave that this is a VM and some neighboring VM on the same box got busier.

If that's our theory, then a) we could even try removing the single-thread-flag again b) this bug is fixed now.
(I confirmed that at least locally on my box, `time ninja check-all` takes about the same time at 288231 and 288435 with llvm clang lld compiler-rt clang-tools-extra checked out)

Comment 17 by ruiu@google.com, Dec 5 2016

The current single-thread behavior is the same as before including memory usage, unless there's a performance new bug. Can we move this VM to a physically different machine?
Project Member

Comment 18 by bugdroid1@chromium.org, Dec 6 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ffa1f105690aea99deecbdfff6eb40128604b73a

commit ffa1f105690aea99deecbdfff6eb40128604b73a
Author: ruiu <ruiu@google.com>
Date: Tue Dec 06 03:55:59 2016

Re-enable threads on ClangToTLinuxLLD.

I added -no-threads flag to LLD in 7b18ff885c40874f8e4a417ad03376161b19de0a,
but it turned out that the slowness of the bot is generally slow,
and it is unlikely that LLD's change is the cause of the problem.
This patch restore the original setting.

BUG= 670397 

Review-Url: https://codereview.chromium.org/2556503003
Cr-Commit-Position: refs/heads/master@{#436517}

[modify] https://crrev.com/ffa1f105690aea99deecbdfff6eb40128604b73a/build/config/compiler/BUILD.gn

Comment 19 by ruiu@google.com, Dec 6 2016

The last build succeeded in 5 hours and 21 minutes.

https://build.chromium.org/p/chromium.fyi/builders/ClangToTLinuxLLD/builds/2147

Maybe we can close this? I'm not totally convinced that the slowness is due to the VM thing, but it seems it wasn't at least caused by my change to LLD.
Status: Fixed (was: Untriaged)
Let's close this. I think the slowdown fixed by r288480 was real and due to an lld change.
Filed bug 671670 for getting a barebones box.

Sign in to add a comment