Reduce overhead of using Clang w/ asserts |
|||
Issue descriptionWe use Clang with asserts enabled to compensate for using a version that's very close to trunk. The overhead of the asserts isn't supposed to be that high, but it's been reported to make compiles 20% slower, and we should investigate that.
,
Oct 17
(dpranke: This has been discussed many times in lots of detail. If you're super curious I can invest 15 minutes to find all the places, but in general we're not interested in reopening that discussion, just making the assert overhead smaller. Something < 10% seems reasonable to us.)
,
Oct 17
I would be curious to read at least some of the discussions. I don't necessarily mean to reopen the discussion.
,
Oct 30
Issue 535406 is probably the main discussion, but there have been a few follow-on discussions since that I can try to find as well. The summary goes like this: - We need to build with an assertion-enabled clang/lld _somewhere_, since they find a lot of stuff. - Building with assertions just on the tot bots isn't enough: That proves that things are fine right at the time of a clang roll, but when chrome code changes after a clang roll we're in unchecked territory. - So we'd need to have both +asserts and -asserts packages and at least some bots that build with the +assert packages. These will have separate goma caches, and they'd be a separate config to support, and sometimes asserts will fire on them that devs don't see locally or on tryjobs.
,
Nov 9
,
Nov 18
LLVM_ENABLE_ASSERTIONS by default also enables LLVM_ENABLE_ABI_BREAKING_CHECKS, but we could set -DLLVM_ENABLE_ABI_BREAKING_CHECKS=FORCE_OFF to turn those off independent of asserts. If that's noticeably faster, I could live with that -- those abi checks are imho less valuable than other asserts.
,
Nov 19
They're not "abi checks" though, they're asserts that happen to affect the abi (e.g. by adding sentinel checks to ilist_node). So even if did affect performance much, I'm not sure how we'd motivate turning those off and not other asserts..
,
Nov 19
Well, I suppose if they were significantly more expensive than asserts overall, that would be a good reason. Anyway, let me see if turning them off makes any difference..
,
Nov 19
I compared compile-time of stroke_opacity_custom.cc (cross-compiled linux -> win) with and without these assertions and saw no change.
Regular build:
9309.763870 task-clock:u (msec) # 0.986 CPUs utilized ( +- 0.59% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
89,175 page-faults:u # 0.010 M/sec ( +- 0.12% )
30,092,665,047 cycles:u # 3.232 GHz ( +- 0.65% )
31,715,066,123 instructions:u # 1.05 insn per cycle ( +- 0.02% )
7,207,990,090 branches:u # 774.240 M/sec ( +- 0.02% )
151,286,073 branch-misses:u # 2.10% of all branches ( +- 0.08% )
9.446148910 seconds time elapsed ( +- 0.61% )
Configured with -DLLVM_ENABLE_ABI_BREAKING_CHECKS=FORCE_OFF
9181.302686 task-clock:u (msec) # 0.971 CPUs utilized ( +- 0.85% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
92,094 page-faults:u # 0.010 M/sec ( +- 3.36% )
30,145,104,576 cycles:u # 3.283 GHz ( +- 0.61% )
31,728,498,636 instructions:u # 1.05 insn per cycle ( +- 0.03% )
7,211,667,397 branches:u # 785.473 M/sec ( +- 0.04% )
151,756,953 branch-misses:u # 2.10% of all branches ( +- 0.19% )
9.458042475 seconds time elapsed ( +- 1.43% )
So no difference really.
> LLVM_ENABLE_ASSERTIONS by default also enables LLVM_ENABLE_ABI_BREAKING_CHECKS
Hang on..
cmake/modules/HandleLLVMOptions.cmake looks like this:
if( uppercase_LLVM_ABI_BREAKING_CHECKS STREQUAL "WITH_ASSERTS" )
if( LLVM_ENABLE_ASSERTIONS )
set( LLVM_ENABLE_ABI_BREAKING_CHECKS 1 )
endif()
elseif( uppercase_LLVM_ABI_BREAKING_CHECKS STREQUAL "FORCE_ON" )
set( LLVM_ENABLE_ABI_BREAKING_CHECKS 1 )
elseif( uppercase_LLVM_ABI_BREAKING_CHECKS STREQUAL "FORCE_OFF" )
# We don't need to do anything special to turn off ABI breaking checks.
elseif( NOT DEFINED LLVM_ABI_BREAKING_CHECKS )
# Treat LLVM_ABI_BREAKING_CHECKS like "FORCE_OFF" when it has not been
# defined.
else()
message(FATAL_ERROR "Unknown value for LLVM_ABI_BREAKING_CHECKS: \"${LLVM_ABI_BREAKING_CHECKS}\"!")
endif()
So it sounds more like it's actually off by default.
,
Nov 19
Doesn't this mean the default value is WITH_ASSERTS? set(LLVM_ABI_BREAKING_CHECKS "WITH_ASSERTS" CACHE STRING "Enable abi-breaking checks. Can be WITH_ASSERTS, FORCE_ON or FORCE_OFF.") http://llvm-cs.pcc.me.uk/CMakeLists.txt#378
,
Nov 19
Can confirm that they are on by default when you build with asserts. I tried disabling them, got the same number (as hans above) and realised they were still on so hans might want to double check they were actually turned off.
,
Nov 19
(When changing cmake settings, you need to create a new build dir for each setting to make sure the new value of the setting really takes. cmake really likes its caches.)
,
Nov 19
I think this patch also helps: Index: HandleLLVMOptions.cmake =================================================================== --- HandleLLVMOptions.cmake (revision 346388) +++ HandleLLVMOptions.cmake (working copy) @@ -85,7 +85,7 @@ elseif( uppercase_LLVM_ABI_BREAKING_CHECKS STREQUAL "FORCE_ON" ) set( LLVM_ENABLE_ABI_BREAKING_CHECKS 1 ) elseif( uppercase_LLVM_ABI_BREAKING_CHECKS STREQUAL "FORCE_OFF" ) - # We don't need to do anything special to turn off ABI breaking checks. + unset( LLVM_ENABLE_ABI_BREAKING_CHECKS CACHE ) elseif( NOT DEFINED LLVM_ABI_BREAKING_CHECKS ) # Treat LLVM_ABI_BREAKING_CHECKS like "FORCE_OFF" when it has not been # defined. Or maybe it was that I had a stale cache and that change forced it to be rebuilt. It seems right though, LLVM_ABI_BREAKING_CHECKS is set to "FORCE_OFF", and is run through cmake's cmakedefine01 which says that it will emit 1 for "any value not considered a false constant by the if() command" and I guess "FORCE_OFF" is not a false constant. (CMake is not something I'm familiar with so any mistakes are purely my own)
,
Nov 19
> Doesn't this mean the default value is WITH_ASSERTS? > > set(LLVM_ABI_BREAKING_CHECKS "WITH_ASSERTS" CACHE STRING > "Enable abi-breaking checks. Can be WITH_ASSERTS, FORCE_ON or FORCE_OFF.") > > http://llvm-cs.pcc.me.uk/CMakeLists.txt#378 Ah, you're right. I missed that in my grepping. And, I -D'd the wrong thing on my cmake invocation. It needs to be set like this: $ cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ABI_BREAKING_CHECKS=FORCE_OFF .. To confirm, check build_dir/include/llvm/Config/abi-breaking.h I'll have new numbers in a bit, sorry for the noise.
,
Nov 19
Still no measurable difference:
9191.545840 task-clock:u (msec) # 0.970 CPUs utilized ( +- 0.43% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
96,489 page-faults:u # 0.010 M/sec ( +- 7.61% )
30,152,232,949 cycles:u # 3.280 GHz ( +- 0.26% )
31,293,689,666 instructions:u # 1.04 insn per cycle ( +- 0.15% )
7,145,305,467 branches:u # 777.378 M/sec ( +- 0.16% )
152,328,759 branch-misses:u # 2.13% of all branches ( +- 0.09% )
9.471901510 seconds time elapsed ( +- 1.92% )
,
Nov 19
Let's try a -DLLVM_ENABLE_ASSERTIONS=OFF build for comparison:
7037.941742 task-clock:u (msec) # 0.966 CPUs utilized ( +- 0.60% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
86,597 page-faults:u # 0.012 M/sec ( +- 0.14% )
22,994,496,026 cycles:u # 3.267 GHz ( +- 0.62% )
23,494,053,013 instructions:u # 1.02 insn per cycle ( +- 0.10% )
4,846,974,209 branches:u # 688.692 M/sec ( +- 0.08% )
128,458,309 branch-misses:u # 2.65% of all branches ( +- 0.22% )
7.284711702 seconds time elapsed ( +- 1.17% )
Now that's a measurable difference: 23% faster. That would certainly be nice to reduce.
,
Nov 20
Looking at this today, because now I'm curious :-) The numbers are from using Chromium #608755 and llvm/clang r347296, compiling stroke_opacity_custom.cc in a release build targeting Windows from a Linux host. (As a benchmark, that's very front-end heavy; not much code is generated. I think it makes sense focusing on this first though.) LLVM_ENABLE_ASSERTIONS affects both assert()'s, as well as other checks gated behind #ifndef NDEBUG. How much do the latter contribute? Compiling with -DLLVM_ENABLE_ASSERTIONS=OFF: 7.398081027 ( +- 1.52% ) with only actual assert()'s hacked out: 7.410520301 ( +- 1.83% ) So no difference. I'll focus on just the assert()'s from now on.
,
Nov 20
Substituting assert(foo) for my custom macro, MYASSERT(nbr, foo), where nbr is a number unique for each assert():
$ perl -pi -e 's/\bassert\(/"MYASSERT(" . $nbr++ . ", "/ge; END { print "nbr: $nbr\n" }' `git grep -lI "\bassert("`
nbr: 105324
Force-including the file that contains the definition:
diff --git a/llvm/CMakeLists.txt b/llvm/CMakeLists.txt
index cd1e1966b08..54d9ee8cd57 100644
--- a/llvm/CMakeLists.txt
+++ b/llvm/CMakeLists.txt
@@ -822,6 +822,9 @@ set(CMAKE_SHARED_LIBRARY_LINK_CXX_FLAGS "")
set(LLVM_PROFDATA_FILE "" CACHE FILEPATH
"Profiling data file to use when compiling in order to improve runtime performance.")
+
+add_definitions("-include /work/llvm.combined/assert_hack.h")
+
if(LLVM_PROFDATA_FILE AND EXISTS ${LLVM_PROFDATA_FILE})
if ("${CMAKE_CXX_COMPILER_ID}" MATCHES "Clang" )
add_definitions("-fprofile-instr-use=${LLVM_PROFDATA_FILE}")
For now I'm interested in what asserts() are invoked the most.
In assert_hack.h, defining MYASSERT to print each time it's invoked:
#ifndef ASSERT_HACK_H
#define ASSERT_HACK_H
#include <stdio.h>
#define MYASSERT(nbr, expr) (fprintf(stderr, "ASSERT#%d\t%s:%d\n", nbr, __FILE__, __LINE__), assert(expr))
#endif
(I probably should add a way to turn the printing on/off at run-time, because this makes building clang take forever due to the assert printfs from tablegen.)
,
Nov 20
My wild guess is that the assert in the 'cast' template is the most expensive one, but it's also perhaps the most valuable one. If the prints end up being infeasibly slow, it might be worth experimenting with disabling all asserts in Casting.h and checking performance.
,
Nov 21
Agreed that the cast one is probably the most popular, but it might not be the most expensive. I'll try to find out. Anyway, here's the top ten most popular (full list attached) asserts. The first column is the number of times that assert was evaluated while compiling stroke_opacity_custom.cc. Again, this is a front-end heavy benchmark: lots of parsing, but not much codegen. 576907824 ASSERT#88402 /work/llvm.combined/llvm/include/llvm/Support/Casting.h:106 350559480 ASSERT#1068 /work/llvm.combined/llvm/tools/clang/include/clang/AST/Type.h:658 197775902 ASSERT#88406 /work/llvm.combined/llvm/include/llvm/Support/Casting.h:255 122524451 ASSERT#86819 /work/llvm.combined/llvm/include/llvm/ADT/PointerUnion.h:136 106653566 ASSERT#86813 /work/llvm.combined/llvm/include/llvm/ADT/PointerIntPair.h:163 80207662 ASSERT#86814 /work/llvm.combined/llvm/include/llvm/ADT/PointerIntPair.h:170 58225295 ASSERT#86875 /work/llvm.combined/llvm/include/llvm/ADT/SmallVector.h:68 53971765 ASSERT#86700 /work/llvm.combined/llvm/include/llvm/ADT/DenseMap.h:1201 39525342 ASSERT#86693 /work/llvm.combined/llvm/include/llvm/ADT/DenseMap.h:619 26026724 ASSERT#1069 /work/llvm.combined/llvm/tools/clang/include/clang/AST/Type.h:840
,
Nov 21
Disabling the top 5 asserts above reduces compile time to 8.489 s +- 2%, so about 11% lower compile time. Even if they look cheap, it adds up. #ifndef ASSERT_HACK_H #define ASSERT_HACK_H #include <stdio.h> #define MYASSERTENABLED(nbr) (nbr != 88402 && nbr != 1068 && nbr != 88406 && nbr != 86819 && nbr != 86813) #define MYASSERT(nbr, expr) assert(!MYASSERTENABLED(nbr) || (expr)) #endif
,
Nov 23
Trying to quantify the impact of the 10 moste popular asserts. All times are in seconds, and from running with "perf stat -r 10". (Invocation attached in case I forget it.) Baseline: All asserts enabled: 9.621411181 ( +- 1.64% ) #88402 disabled: 9.215765794 ( +- 1.46% ) #1068 disabled: 9.482055662 ( +- 1.73% ) # 88406 disabled: 9.286999028 ( +- 1.54% ) # 86819 disabled: 9.354134688 ( +- 0.80% ) # 86813 disabled: 9.301074632 ( +- 0.89% ) # 86814 disabled: 9.506684206 ( +- 1.12% ) # 86875 disabled: 9.489970763 ( +- 0.90% ) # 86700 disabled: 9.743738074 ( +- 2.37% ) # 86693 disabled: 9.850024420 ( +- 1.62% ) # 1069 disabled: 9.749311255 ( +- 1.76% )
,
Nov 23
Putting the top-10 popular asserts aside, bisecting to try and find expensive asserts among the rest.. #define TOP10(nbr) (nbr == 88402 || nbr == 1068 || nbr == 88406 || nbr == 86819 || nbr == 86813 || nbr == 86814 || nbr == 86875 || nbr == 86700 || nbr == 86693 || nbr == 1069) #define MINASSERT 1 #define MAXASSERT 110000 #define MYASSERTENABLED(nbr) (!TOP10(nbr) && nbr >= MINASSERT && nbr <= MAXASSERT) #define MYASSERT(nbr, expr) assert(!MYASSERTENABLED(nbr) || (expr)) 1-110000 (all): 8.600611662 ( +- 2.17% ) 1-50000: 8.007742816 ( +- 0.69% ) 50001-110000: 7.528755192 ( +- 0.89% ) We're looking for the smallest number of asserts that give the largest slowdown, so continuing: 1-25000: 8.052880398 ( +- 0.87% ) 25001-50000: 7.547017843 ( +- 1.25% ) 1-12500: 8.042604032 ( +- 1.35% ) 1-6250: 7.954258778 ( +- 0.72% ) 1-3125: 7.864981617 ( +- 0.74% ) 3126-6250: 7.429988534 ( +- 0.83% ) 1-1562: 7.756856731 ( +- 0.97% ) 1563-3125 7.542954208 ( +- 0.77% ) 1-781: 7.530727438 ( +- 0.81% ) 782-1562: 7.566241924 ( +- 0.70% ) Bah, none of those two branches are standing out...
,
Dec 4
I noticed an expensive assertion while profiling a particularly slow compilation in LLVM (AArch64InstPrinter.cpp) that's not listed above, it's the llvm::LiveRange::verify() method in CodeGen. I only profiled the compile step for 36s, but the the compiler spent 2.3s of time in that verify method, so 6.4%. Hypothetically this is something that could be moved to EXPENSIVE_CHECKS, since I believe it changes the asymptotic performance of llvm::LiveIntervals::HMEditor::updateRange.
,
Dec 4
Thanks for the note! Yes, EXPENSIVE_CHECKS sounds reasonable for that. So far I've focused on a frontend-heavy benchmark, so I haven't seen much overhead from non-Clang stuff. I will look at something backend heavy in a second phase.
,
Dec 4
> 1-781: 7.530727438 ( +- 0.81% ) > 782-1562: 7.566241924 ( +- 0.70% ) These are pretty even, so let's go down both paths.. 1-390: 7.454701873 ( +- 0.90% ) 390-781: 7.617347650 ( +- 1.16% ) 390-585: 7.455154669 ( +- 0.86% ) 586-781: 7.422630911 ( +- 0.70% ) These are very even, and we're also very close to the 7.40 s time of disabling all asserts. There doesn't seem to be any one particularly expensive assert in this range. Let's try the other branch... 782-1172: 7.410517979 ( +- 0.86% ) 1173-1562: 7.525172850 ( +- 1.08% ) 1173-1367: 7.481780712 ( +- 0.83% ) 1368-1562: 7.447954251 ( +- 0.25% ) Again, doesn't look like there's any one expensive assert hiding here.
,
Dec 4
Looking at just the top-10 asserts from #22, these five are the most important: # 88402: "isa<> used on a null pointer" # 88406 "cast<Ty>() argument of incompatible type!" # 86813 "Pointer is not sufficiently aligned" in PointerIntPair # 86819 "Invalid accessor called" in PointerUnion # 1068 "Cannot retrieve a NULL type pointer" in QualType. Disabling these takes the compile-time from 9.41 s to 8.39 s, an 11% reduction. Doing the rest in the top-10 doesn't help much. These asserts are good in that they provide clear messages to developers when some invariant doesn't hold, but they're low-value in terms of finding bugs that would otherwise go unnoticed. For example, code doing isa<> on a null pointer is likely to crash soon anyway. These aren't really expensive checks, so ENABLE_EXPENSIVE_CHECKS isn't appropriate. But what to call them... I want a flag name that conveys that they're low-value in terms of finding bugs, and high cost because their costs add up, and therefore not suitable for running in production..
,
Dec 4
brainstorming.. ENABLE_LOW_VALUE_CHECKS ENABLE_LOW_LEVEL_CHECKS ENABLE_DEV_CHECKS ENABLE_COSTLY_CHECKS (that's a lot like EXPENSIVE_CHECKS) ENABLE_VERY_FREQUENT_CHECKS so far I like the last one the most..
,
Dec 4
I might making it up, but I thought I had seen instances of # 88406 -- yeah, https://bugs.chromium.org/p/chromium/issues/list?can=1&q=%22argument+of+incompatible+type%21%22&colspec=ID+Pri+M+Stars+ReleaseBlock+Component+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&cells=ids (the lld bug in particular). # 88402 has a bunch of hits too: https://bugs.chromium.org/p/chromium/issues/list?can=1&q=%22isa%3C%3E+used+on+a+null+pointer%22&colspec=ID+Pri+M+Stars+ReleaseBlock+Component+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&cells=ids I suppose the argument here is that these would've crashed elsewhere soon later. So these two do seem kind of useful. ENABLE_CONVENIENCE_CHECKS ENABLE_INCONSEQUENTIAL_CHECKS
,
Dec 4
...I think we discussed re-measuring the assert overhead in an pgo-built clang. Maybe we should do that first?
,
Dec 4
I gave this a try using george's instructions on http://llvm.org/docs/HowToBuildWithPGO.html TL;DR: PGO doesn't remove the overhead of asserts. I used Chromium's current (346388) to build an instrumented clang, train it on stroke_opacity_custom.cc, use that profile to build a PGO'ed clang, and then build stroke_opacity_custom.cc with "perf stat -r10": 8.252382043 seconds time elapsed ( +- 0.82% ) Re-doing the same steps as above but with asserts disabled: 6.859997228 seconds time elapsed ( +- 0.59% )
,
Dec 5
I collected numbers for popular asserts in a more backend-heavy test case last night: obj/v8/v8_base/objects.obj which takes 23 s to build, of which 17 s seems to be backend stuff. Here are the top-10: (attaching the full list) 1424306656 ASSERT#88402 /work/llvm.combined/llvm/include/llvm/Support/Casting.h:106 504016746 ASSERT#88406 /work/llvm.combined/llvm/include/llvm/Support/Casting.h:255 314222680 ASSERT#86700 /work/llvm.combined/llvm/include/llvm/ADT/DenseMap.h:1201 278355549 ASSERT#1068 /work/llvm.combined/llvm/tools/clang/include/clang/AST/Type.h:658 228280517 ASSERT#86693 /work/llvm.combined/llvm/include/llvm/ADT/DenseMap.h:619 166258617 ASSERT#86966 /work/llvm.combined/llvm/include/llvm/ADT/ilist_iterator.h:140 164898163 ASSERT#86875 /work/llvm.combined/llvm/include/llvm/ADT/SmallVector.h:68 151828707 ASSERT#86813 /work/llvm.combined/llvm/include/llvm/ADT/PointerIntPair.h:163 135581303 ASSERT#86708 /work/llvm.combined/llvm/include/llvm/ADT/DenseMap.h:1244 135581303 ASSERT#86707 /work/llvm.combined/llvm/include/llvm/ADT/DenseMap.h:1242
,
Dec 10
Casting.h:106 and Casting.h:255 we know from #20.
It's interesting to see AST/Type.h:658 here too, even though it's more backend-heavy.
The rest is ADT stuff.
DenseMap.h:1201 checks isHandleInSync() on DenseMapIterator creation. Disabling LLVM_ENABLE_ABI_BREAKING_CHECKS would disable this kind of checks.
DenseMap.h:619 guards against inserting empty or tombstone markers.
ilist_iterator.h:140 guards against dereferencing sentinel nodes
SmallVector.h:68 is not an index range check as one might have guessed, but this:
/// Set the array size to \p N, which the current array must have enough
/// capacity for.
///
/// This does not construct or destroy any elements in the vector.
///
/// Clients can use this in conjunction with capacity() to write past the end
/// of the buffer when they know that more elements are available, and only
/// update the size later. This avoids the cost of value initializing elements
/// which will only be overwritten.
void set_size(size_t Size) {
assert(Size <= capacity());
this->Size = Size;
}
That gets called for each push_back() and pop_back(). In the push_back() case, we've actually already done a range check before (to see if we need to grow). For pop_back(), we're decreasing the size so it should be trivially true. It would be interesting to see if the optimizer can reason about this.
PointerIntPair.h:163 checks for sufficient alignment.
DenseMap.h:1244 and DenseMap.h:1242 both deal with iterator invalidation and would go away with LLVM_ENABLE_ABI_BREAKING_CHECKS
,
Dec 10
> SmallVector.h:68 > In the push_back() case, we've actually already done a range check before (to see if we need to grow). For pop_back(), we're decreasing the size so it should be trivially true. It would be interesting to see if the optimizer can reason about this. Actually, the optimizer can't know that, but we can. I tried simplifying the code, just incrementing/decrementing this->Size manually, but it didn't make much difference.
,
Dec 10
All asserts: 22.966180185 ( +- 0.57% ) Without #88402: 21.696298509 ( +- 0.72% ) Without #88406: 22.008134786 ( +- 1.82% ) Without #86700: 22.687493661 ( +- 0.64% ) Without #1068: 22.490215674 ( +- 0.58% ) More tomorrow.
,
Dec 11
Without #86693: 22.799146921 ( +- 0.67% ) Without #86966: 22.672121689 ( +- 0.32% ) Without #86875: 22.686026603 ( +- 0.66% ) Without #86813: 22.363531746 ( +- 0.55% ) Without #86708: 22.901966730 ( +- 0.71% ) Without #86707: 22.782061841 ( +- 0.67% ) With abi-changing asserts disabled -DLLVM_ABI_BREAKING_CHECKS=FORCE_OFF: 22.564221803 ( +- 0.54% ) With all regular asserts disabled but still running #ndebug code, i.e. verifyers and such: 18.711125481 ( +- 0.53% ) With -DLLVM_ENABLE_ASSERTIONS=OFF: 17.214549407 ( +- 0.56% )
,
Dec 11
So it seems we have 1.5 s overhead from non-asserts #ifndef NDEBUG code. What are some of the more interesting uses.. It enables LLVM_ENABLE_STATS, which probably doesn't cost much. It enables definitions of some dump() methods and verify/validate methods. LoopInfoImpl.h has some verification LiveInterval has too there's AssertingVH etc. It makes llvm_unreachable() become an optimization hint GenericDomTreeConstruction.h has a check that's O(n) InsturctionPrecedenceTracking.cpp has some validation stuff LazyCallGraph has a lot of validation going on MemorySSA has some IfConversion has a little RegisterPressure has "Check if the alternate algorithm yields the same result" maybe that's expensive ScheduleDAG does some checking SelectionDAG does some checking as it goes along There are some small checks in IR/ and in Target/ ... And of course the big one, in non-asserts builds, Clang schedules the IR verifier to run after building the module.
,
Dec 11
all asserts enabled, but passing in -disable-llvm-verifier 22.331279927 ( +- 0.26% ) so not all that expensive overall.
,
Dec 11
I think the conclusion I'm arriving at is that we should build without asserts in our packages. Originally I'd hoped to find some smoking gun assert (or small set of them) that were particularly inefficient and could be disabled or made efficient to bring the overhead down to something more acceptable, like 10% instead of the current 25%. But the measurements show that there is no smoking gun. The most expensive assert is the null-check in isa<>(). We could invent a new category for that -- "high cost, only really to enhance llvm dev experience" -- and disable it along with similar asserts. But that would only buy us 5-ish percent or so, at the cost of the complexity of having this new assert category. Also it's not clear what else to put in there. Besides the isa<> null-check, not many others qualify. The cast<>() compatible types check, the PointerIntPair alignment check and SmallVector range checks are all set up to catch real bugs and not just developer luxuries. I think it would be hard to build a case upstream for having this separate assert category just so we can disable some asserts while keeping the rest, and I don't think it would really be worth it. The reason we run with asserts enabled in production is to mitigate risks from using a version so close to tip-of-tree. The nightmare scenario would be if Chromium was miscompiled by Clang, and we didn't notice because we'd disabled the asserts. That's unlikely though: Chromium has rigorous testing, and also our tip-of-tree bots would still have asserts and raise the error. The more realistic scenario is that a Chromium developer triggers some bug in Clang locally, and instead of getting an assert gets miscompiled code or some strange compiler behaviour. We see developers hitting Clang asserts roughly on a weekly basis. However, it is much more likely that Clang would crash rather than silently miscompile. And once the developer's code is checked in, our tip-of-tree builders would assert on it. Another comparison point is that Google uses close-to-trunk Clang without asserts internally and seems to do fine. With the knowledge that we have tip-of-tree bots building *all* the code in all the configurations we care about (we should add some more) with asserts enabled, I think disabling the asserts for the production clang packages might be the right trade-off.
,
Dec 11
This reasoning seems good to me.
,
Dec 12
> This reasoning seems good to me. Thanks. I'm gathering more data and will put together a doc. Comparing local build times of the 'chrome' target without and with assertions: (chromium e36b86e5258e2d30e37bb58e6512ca5de35a9f3b, gn args: is_debug = false is_component_build = true use_goma = false) with 346388-6 (no asserts) $ rm -f out/release/.ninja_log && time ninja -C out/release chrome ninja: Entering directory `out/release' [38022/38022] LINK ./chrome real 34m38.681s user 1823m49.196s sys 76m44.174s $ cp out/release/.ninja_log /work/ninja_log_noasserts (re-sync to 346355-5 (with asserts) $ rm -f out/release/.ninja_log && time ninja -C out/release chrome ninja: Entering directory `out/release' [1/1] Regenerating ninja files [38022/38022] LINK ./chrome real 41m53.212s user 2215m0.591s sys 79m43.019s $ cp out/release/.ninja_log /work/ninja_log_asserts So the asserts are increasing compile time by 21.5% wall-clock time or about 6.5 cpu hours
,
Dec 12
Is that min-of-3 each, or something comparable? Just building twice, once in each config, in my experience always gives the second build a huge boost, due to *handwave* caching.
,
Dec 12
No, just one run of each, in the order they're listed. I ran the no-asserts build first, so the asserts-enabled build has the advantage of a warm cache but is still a lot slower. If you think we need more rigorous measurements, I can build some more overnight.
,
Dec 12
Oh sorry, for some reason I misread comment 41 and thought no asserts was second. If we have a 21.5% speedup with no asserts at a disadvantage, that's good enough for me.
,
Jan 17
(5 days ago)
For those following along, we now have sign-off for doing disabling the asserts, but before flipping the switch I want to 1) Increase the ToT bot coverage. It's currently very good on Windows, but I want to ensure it's comprehensive on the other platforms too. 2) Write some kind of system to make it easier to monitor the bots. In particular, it should be easy to answer "What did the bots think about Clang revision X", "What's the latest Clang revision that the bots liked", or "What are the current issues with ToT Clang." We've got the data, but it's not accessible. |
|||
►
Sign in to add a comment |
|||
Comment 1 by dpranke@chromium.org
, Oct 17