Win ASan LLD builds fail check in asan_globals.cc |
||||||
Issue descriptionhttps://ci.chromium.org/buildbot/chromium.lkgr/Win%20ASan%20Release/?limit=200 Starting since https://ci.chromium.org/buildbot/chromium.lkgr/Win%20ASan%20Release/14324 Error: [2783/35642] ACTION //components/resources:compressed_about_credits(//build/toolchain/win:win_clang_x64) FAILED: gen/components/resources/about_credits.bro C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/gn_run_binary.py brotli.exe --force --no-copy-stat gen/components/resources/about_credits.html -o gen/components/resources/about_credits.bro ==7968==AddressSanitizer CHECK failed: C:\b\rr\tmpm5zxwy\w\src\third_party\llvm\projects\compiler-rt\lib\asan\asan_globals.cc:382 "((globals[i].size == 0 && globals[i].size_with_redzone == 0 && globals[i].name == nullptr && globals[i].module_name == nullptr && globals[i].odr_indicator == 0)) != (0)" (0x0, 0x0) #0 0x13ff0925f in __asan::AsanCheckFailed C:\b\rr\tmpm5zxwy\w\src\third_party\llvm\projects\compiler-rt\lib\asan\asan_rtl.cc:69 #1 0x13ff1ad7b in __sanitizer::CheckFailed C:\b\rr\tmpm5zxwy\w\src\third_party\llvm\projects\compiler-rt\lib\sanitizer_common\sanitizer_termination.cc:79 #2 0x13ff328c1 in __asan_register_globals C:\b\rr\tmpm5zxwy\w\src\third_party\llvm\projects\compiler-rt\lib\asan\asan_globals.cc:380 #3 0x13ff3faf6 in _initterm minkernel\crts\ucrt\src\appcrt\startup\initterm.cpp:21 #4 0x13ff3bb45 in __scrt_common_main_seh f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:251 #5 0x76de59ec in BaseThreadInitThunk+0xc (C:\Windows\system32\kernel32.dll+0x78d359ec) #6 0x76f1b370 in RtlUserThreadStart+0x20 (C:\Windows\SYSTEM32\ntdll.dll+0x78e7b370) brotli.exe failed with exit code 1 There are only 3 changes in the blamelist, https://chromium-review.googlesource.com/c/chromium/src/+/983632 seems to be the culprit
,
Apr 26 2018
I don't know.
,
Apr 26 2018
,
Apr 26 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/fda7b0b62df2fe87ca1650907911ef66a051faa8 commit fda7b0b62df2fe87ca1650907911ef66a051faa8 Author: Hans Wennborg <hans@chromium.org> Date: Thu Apr 26 03:13:25 2018 Revert "Reland "win: Link with lld instead of MSVC's link.exe by default"" This reverts commit 6b5ac2efbfcbf98590bb1b7e610dacb3318c122a. Reason for revert: It broke the Win ASan bot on the LKGR waterfall (see second bug). Original change's description: > Reland "win: Link with lld instead of MSVC's link.exe by default" > > This is a reland of be8138a4eb66bddb5bca81e08c13c6cb1dafc981 > The issues with the previous attempt were addressed in the latest > Clang roll (#553415). > > Original change's description: > > win: Link with lld instead of MSVC's link.exe by default > > > > lld is LLVM's linker. It produces PE/COFF and PDB files just like > > link.exe, but it's significantly faster and it can also handle LLVM's > > internal representation, which will enable us to do link-time > > optimization and control-flow integraty checks with Clang. > > > > While lld is much faster at linking, it doesn't support incremental > > links, meaning builds that only touch a few files and re-link a large > > executable may become slower. > > > > This is the first attempt at switching everything over, with the > > purpose of gathering data and finding unknown unknowns. It's likely > > temporary until something breaks. > > > > is_win_fastlink is implicitly ignored when using lld, as lld without > > fastlink is faster than link.exe with it. > > > > Also switch the CrWinClangLLD bots on chromium.clang to use MSVC's > > link.exe to make sure that configuration keeps working. > > > > Bug: 792131 > > Change-Id: I0f115a78c33d69eadbd480f75c2a5d636e86483d > > Reviewed-on: https://chromium-review.googlesource.com/983632 > > Commit-Queue: Nico Weber <thakis@chromium.org> > > Reviewed-by: Nico Weber <thakis@chromium.org> > > Cr-Commit-Position: refs/heads/master@{#551367} > > Bug: 792131 > Change-Id: I3b76cd015ef18bb3e2ac7a3efa4352c4f04e560b > Reviewed-on: https://chromium-review.googlesource.com/1028374 > Reviewed-by: Nico Weber <thakis@chromium.org> > Commit-Queue: Hans Wennborg <hans@chromium.org> > Cr-Commit-Position: refs/heads/master@{#553776} TBR=thakis@chromium.org,hans@chromium.org Change-Id: Ied5addd2f8d16d1e002bf37846e7106b4c754cb9 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 792131 , 837090 Reviewed-on: https://chromium-review.googlesource.com/1029631 Reviewed-by: Hans Wennborg <hans@chromium.org> Commit-Queue: Hans Wennborg <hans@chromium.org> Cr-Commit-Position: refs/heads/master@{#553878} [modify] https://crrev.com/fda7b0b62df2fe87ca1650907911ef66a051faa8/build/config/compiler/compiler.gni [modify] https://crrev.com/fda7b0b62df2fe87ca1650907911ef66a051faa8/tools/mb/mb_config.pyl
,
Apr 26 2018
Hm, not clear this warranted a revert. Worst case, we could've disabled lld in asan builds, but since this bot exists only on lkgr I'm not sure even that's appropriate, we could've maybe just fixed forward.
,
Apr 26 2018
Thanks a lot Hans, the build is green now!
,
Apr 26 2018
> Hm, not clear this warranted a revert. Rats, I didn't think about just disabling for ASan :-/ It looked like a real issue though, so seemed worth investigating offline. mmoroz: While the build was fixed by the revert, I'll re-open and use this to track the underlying problem.
,
Apr 26 2018
This is the code where we fail the check:
void __asan_register_globals(__asan_global *globals, uptr n) {
if (!flags()->report_globals) return;
GET_STACK_TRACE_MALLOC;
u32 stack_id = StackDepotPut(stack);
BlockingMutexLock lock(&mu_for_globals);
if (!global_registration_site_vector)
global_registration_site_vector =
new(allocator_for_globals) GlobalRegistrationSiteVector(128);
GlobalRegistrationSite site = {stack_id, &globals[0], &globals[n - 1]};
global_registration_site_vector->push_back(site);
if (flags()->report_globals >= 2) {
PRINT_CURRENT_STACK();
Printf("=== ID %d; %p %p\n", stack_id, &globals[0], &globals[n - 1]);
}
for (uptr i = 0; i < n; i++) {
if (SANITIZER_WINDOWS && globals[i].beg == 0) {
// The MSVC incremental linker may pad globals out to 256 bytes. As long
// as __asan_global is less than 256 bytes large and its size is a power
// of two, we can skip over the padding.
static_assert(
sizeof(__asan_global) < 256 &&
(sizeof(__asan_global) & (sizeof(__asan_global) - 1)) == 0,
"sizeof(__asan_global) incompatible with incremental linker padding");
// If these are padding bytes, the rest of the global should be zero.
CHECK(globals[i].size == 0 && globals[i].size_with_redzone == 0 &&
globals[i].name == nullptr && globals[i].module_name == nullptr &&
globals[i].odr_indicator == 0);
continue;
}
RegisterGlobal(&globals[i]);
}
// Poison the metadata. It should not be accessible to user code.
PoisonShadow(reinterpret_cast<uptr>(globals), n * sizeof(__asan_global),
kAsanGlobalRedzoneMagic);
}
If I understand correctly, this gets called for all pointers between __asan_globals_start and __asan_globals_end.
Looking at the /lldmap file for brotli.exe:
Address Size Align Out In Symbol
001b8cf0 00000000 16 d:\rs2.obj.amd64fre\minkernel\crts\ucrt\src\appcrt\dll\mt\..\..\tran\mt\objfre\amd64\log_f_inv_qword_table.obj:(.data)
001b8cf0 00000040 8 projects\compiler-rt\lib\asan\CMakeFiles\RTAsan.x86_64.dir\asan_globals_win.cc.obj:(.ASAN$GA)
001b8cf0 00000000 0 __asan_globals_start
001b8d40 00000040 64 obj/third_party/brotli/brotli/backward_references_hq.obj:(.ASAN$GL)
001b8d80 00000040 64 obj/third_party/brotli/brotli/backward_references_hq.obj:(.ASAN$GL)
001b8dc0 00000040 64 obj/third_party/brotli/brotli/backward_references_hq.obj:(.ASAN$GL)
It seems like the first global doesn't actually start at __asan_globals_start, but a few bytes after due to alignment.
Maybe what's happening is that __asan_register_globals() starts with 0x001b8cf0 and sees globals[i].beg == 0 because it's reading padding before the first global..
,
Apr 26 2018
Looks like this is an x64 asan bot, which isn't as well supported as x86, which is what I tried first to reproduce this.
,
Apr 26 2018
Okay so the problem is that __asan_globals_start is only 8-byte aligned, but the others are 64-byte aligned. The over-alignment comes from here:
AddressSanitizerModule::InstrumentGlobalsCOFF():
// The MSVC linker always inserts padding when linking incrementally. We
// cope with that by aligning each struct to its size, which must be a power
// of two.
unsigned SizeOfGlobalStruct = DL.getTypeAllocSize(Initializer->getType());
assert(isPowerOf2_32(SizeOfGlobalStruct) &&
"global metadata will not be padded appropriately");
Metadata->setAlignment(SizeOfGlobalStruct);
I wonder if we can use a pragma or something to increase the alignment of __asan_globals_start
,
Apr 26 2018
I agree, in order for this nonsense to work, __asan_globals_start needs to be aligned to sizeof(__asan_global) to match the .ASAN$GL bits.
,
Apr 26 2018
OK, we have a fix upstream in r330990. Also, now the ASan runtime has full debug info instead of just line tables, so debugging these issues should be easier in the future. In the short term, I think Hans wanted to reland the LLD switch with an exception for is_asan=true builds. In the long term, we'll roll clang, pick up this fix, and remove the exception.
,
Apr 26 2018
Verified locally that r330990 fixes the check when invoking brotli.
,
Apr 26 2018
Why doesn't link.exe need the compiler-rt change? Do they align things less? If so, why are we different?
,
Apr 26 2018
Looking at the map file from using link.exe: Start Length Name Class 0002:000a4334 00000014H .idata$2 DATA 0002:000a4348 00000018H .idata$3 DATA 0002:000a4360 00000358H .idata$4 DATA 0002:000a46b8 00000832H .idata$6 DATA 0003:00000000 00000040H .ASAN$GA DATA 0003:00000040 00001c00H .ASAN$GL DATA 0003:00001c40 00000040H .ASAN$GZ DATA 0003:00001c80 00002d40H .data DATA 0003:000049b0 00000000H .data$r DATA 0003:000049c0 008d18e8H .bss DATA 0004:00000000 00003abcH .pdata DATA 0005:00000000 00000004H .SCOV$A DATA Address Publics by Value Rva+Base Lib:Object 0003:00000000 __asan_globals_start 00000001401bd000 clang_rt.asan-x86_64:asan_globals_win.cc.obj 0003:00001c40 __asan_globals_end 00000001401bec40 clang_rt.asan-x86_64:asan_globals_win.cc.obj It looks like .ASAN$GA goes first in the .data segment (due due alphabetical ordering?) and so __asan_globals_start gets page aligned. The fix to compiler-rt still makes sense, but maybe lld also should adjust which order it merges these "sub sections" to match link.exe? rkn, ruiu: ^
,
Apr 26 2018
,
May 10 2018
> It looks like .ASAN$GA goes first in the .data segment (due due alphabetical ordering?) and so __asan_globals_start gets page aligned. > > The fix to compiler-rt still makes sense, but maybe lld also should adjust which order it merges these "sub sections" to match link.exe? I agree, LLD should simply do a stable sort on the input section name before splitting on the '$'. I doubt anyone relies on this, but it's reasonable behavior and if users care they can name their sections .zzzdbg or something to move them to the end of the output section.
,
May 24 2018
I looked at the ordering stuff today. We're already processing the sections in alphabetical order in createSections(), (except that "standard" sections like .text, .bss, etc. are created first so they will be emitted first). So if there's a .zzzdbg, it will come last (modulo .rsrc and .reloc). Instead, it seems the difference is how .ASAN gets merged into .data (which happens because of a /MERGE flag that comes from compiler-rt/lib/asan/asan_globals_win.cc:25:#pragma comment). It seems that link.exe merges it to the beginning of .data, but lld puts it at the end. Putting it at the end makes sense, and is less work since it's appending to an array, so I don't think we want to change that. One idea might be to do the "merging" earlier, when chunks are put into sections in the first place. That way the .ASAN chunks would go into .data before the regular .data chunks. I'm not sure that's a good idea though, maybe pcc and ruiu can comment. Marking fixed since I don't think there's more to do here. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by thakis@chromium.org
, Apr 26 2018