LKGR Windows ASan build is crashing on a startup since 510178:510268 |
||||||||||||
Issue descriptionhttps://pantheon.corp.google.com/logs/viewer?project=google.com%3Aclusterfuzz&minLogLevel=500&expandAll=false&resource=gce_instance×tamp=2017-10-21T04%3A38%3A32.000000000Z&dateRangeStart=2017-10-21T03%3A38%3A33.353Z&dateRangeEnd=2017-10-21T04%3A38%3A33.353Z&interval=PT1H&advancedFilter=resource.type%3D%22gce_instance%22%0Aresource.labels.zone%3D%22us-central1-f%22%0Aresource.labels.project_id%3D%22google.com%3Aclusterfuzz%22%0Aresource.labels.instance_id%3D%224846894608969394503%22%0Atimestamp%3D%222017-10-21T04%3A38%3A18.000000000Z%22%0AinsertId%3D%22nbk9myg1dzb1h6%22 Faulting application name: chrome.exe, version: 64.0.3246.0, time stamp: 0x59eabe13 Faulting module name: chrome_elf.dll, version: 64.0.3246.0, time stamp: 0x59eabb85 Exception code: 0x4000001f Fault offset: 0x000e1a87 Faulting process id: 0xbf4 Faulting application start time: 0x01d34a266a1ce7a0 Faulting application path: c:\clusterfuzz\slave-bot\builds\chrome-test-builds_media_win32-release_e999b74784b15cd4b248500a1cd5cd3743d01362\revisions\asan-win32-release-510633\chrome.exe Faulting module path: c:\clusterfuzz\slave-bot\builds\chrome-test-builds_media_win32-release_e999b74784b15cd4b248500a1cd5cd3743d01362\revisions\asan-win32-release-510633\chrome_elf.dll Report Id: a7d09adb-b619-11e7-80e2-e33582d73f46 Faulting package full name: Faulting package-relative application ID:
,
Oct 21 2017
This seems to be one of the first occurrences: https://pantheon.corp.google.com/logs/viewer?project=google.com:clusterfuzz&minLogLevel=500&expandAll=false&resource=gce_instance×tamp=2017-10-17T23:51:14.000000000Z&dateRangeStart=2017-10-17T17:54:25.000Z&dateRangeEnd=2017-10-18T05:54:25.000Z&interval=JUMP_TO_TIME&filters=text:%22Faulting%20application%20name%22&advancedFilter=resource.type%3D%22gce_instance%22%0Aresource.labels.instance_id%3D%228748998373071153196%22%0Aresource.labels.zone%3D%22us-central1-f%22%0Aresource.labels.project_id%3D%22google.com:clusterfuzz%22%0Atimestamp%3D%222017-10-17T23:51:14.000000000Z%22%0AinsertId%3D%22ldl1l7g1yzbnqt%22 But I cannot find a build that produced: asan-win32-release-510268 https://build.chromium.org/p/chromium.lkgr/builders/Win%20ASan%20Release%20Media/builds/5591 produced asan-win32-release-510284 and https://build.chromium.org/p/chromium.lkgr/builders/Win%20ASan%20Release%20Media/builds/5590 produced asan-win32-release-510178
,
Oct 21 2017
,
Oct 21 2017
Ah, that was the wrong buildbot. The build came from https://build.chromium.org/p/chromium.lkgr/builders/Win%20ASan%20Release/builds/7009
,
Oct 21 2017
r510178 was the latest good build we had
,
Oct 21 2017
,
Oct 21 2017
Here's the log of changes: https://chromium.googlesource.com/chromium/src/+log/866a0da..cad2e24
,
Oct 21 2017
No idea what is causing the crash...
,
Oct 22 2017
Abhishek suggested over email that https://chromium.googlesource.com/chromium/src/+/0c0ade8e414850c6cf64fc724344c6ba350cbe6d could be the root cause
,
Oct 22 2017
https://chromium.googlesource.com/chromium/src/+/0c0ade8e414850c6cf64fc724344c6ba350cbe6d can't be to blame for sure, it just adds a tool that's not used by anything. https://chromium.googlesource.com/chromium/src/+/82f6080f7d0e71eabce5af2e363d375db323bea7 did hook it up, but it compares the output of rc.py to that of rc.exe (which was used before) and makes sure it's bitwise identical, which proves that the change is behavior-preserving on Windows. So I'm 99% sure that's not it.
,
Oct 22 2017
Oh the "hook up" cl isn't even in https://chromium.googlesource.com/chromium/src/+log/866a0da..cad2e24 -- so yes, that's not it. That just adds a tool not called by anything yet.
,
Oct 23 2017
Thanks Nico for taking a look.
,
Oct 24 2017
assigning to myself as this week's sheriff..
,
Oct 25 2017
Looks like it's dying here: (1ab8.1c78): Access violation - code c0000005 (first chance) First chance exceptions are reported before any exception handling. This exception may be expected and handled. eax=00000000 ebx=00000048 ecx=fffffffc edx=00000000 esi=6fd01dd8 edi=00000000 eip=77cadde3 esp=04fcf258 ebp=04fcf2b8 iopl=0 nv up ei pl nz ac po cy cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010213 ntdll!RtlpWaitOnCriticalSection+0xac: 77cadde3 ff4014 inc dword ptr [eax+14h] ds:002b:00000014=???????? 00 04fcf2b8 77cadc8b ntdll!RtlpWaitOnCriticalSection+0xac 01 04fcf2e4 77cadcb5 ntdll!RtlpEnterCriticalSectionContended+0xa0 *** WARNING: Unable to verify checksum for C:\clusterfuzz\slave-bot\builds\chromium-browser-asan_win32-release_a15986b6468dcac5c0ae9a2e0729fd95d8ce33c8\revisions\asan-win32-release-511225\chrome_elf.dll 02 04fcf2ec 6fc4a4fa ntdll!RtlEnterCriticalSection+0x42 03 04fcf2f8 6fc444cb chrome_elf!__acrt_lock+0x15 [minkernel\crts\ucrt\src\appcrt\internal\locks.cpp @ 55] 04 (Inline) -------- chrome_elf!__acrt_lock_and_call::__l2::<lambda_cbab9ec6f41b0180b23cc171c22676b0>::operator()+0xa [minkernel\crts\ucrt\inc\corecrt_internal.h @ 909] 05 04fcf330 6fc44679 chrome_elf!__crt_seh_guarded_call<void (__cdecl*)(int)>::operator()<<lambda_cbab9ec6f41b0180b23cc171c22676b0>,<lambda_44731a7d0e6d81c3e6aa82d741081786> &,<lambda_4b292cb8dd18144e164572427af410ab> >+0x1a [minkernel\crts\ucrt\devdiv\vcruntime\inc\internal_shared.h @ 201] 06 (Inline) -------- chrome_elf!__acrt_lock_and_call+0x1d [minkernel\crts\ucrt\inc\corecrt_internal.h @ 908] 07 04fcf350 6fc41cf8 chrome_elf!__acrt_get_sigabrt_handler+0x25 [minkernel\crts\ucrt\src\appcrt\misc\signal.cpp @ 542] 08 04fcf354 6fc348c2 chrome_elf!abort+0x5 [minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 62] 09 (Inline) -------- chrome_elf!__sanitizer::dllThunkGetRealAddrOrDie+0x1e [e:\b\build\slave\win_upload_clang\build\src\third_party\llvm\projects\compiler-rt\lib\sanitizer_common\sanitizer_win_dll_thunk.cc @ 30] 0a 04fcf360 6fc2f98f chrome_elf!__sanitizer::dllThunkIntercept+0x22 [e:\b\build\slave\win_upload_clang\build\src\third_party\llvm\projects\compiler-rt\lib\sanitizer_common\sanitizer_win_dll_thunk.cc @ 36] 0b 04fcf380 77cd96de chrome_elf!intercept___asan_get_free_stack+0xf [e:\b\build\slave\win_upload_clang\build\src\third_party\llvm\projects\compiler-rt\lib\asan\asan_interface.inc @ 32] 0c 04fcf3a0 77cd9658 ntdll!LdrxCallInitRoutine+0x16 0d 04fcf3f0 77d2d625 ntdll!LdrpCallInitRoutine+0x43 0e 04fcf430 77cabd89 ntdll!LdrpCallTlsInitializers+0x7de35 0f 04fcf4a8 77ce776d ntdll!LdrpInitializeNode+0x1b0 10 04fcf4cc 77ceb530 ntdll!LdrpInitializeGraph+0x59 11 04fcf4f0 77cedca9 ntdll!LdrpInitializeGraph+0x70 12 04fcf698 77cece65 ntdll!LdrpInitializeProcess+0x1614 13 04fcf6e8 77cea9d0 ntdll!_LdrpInitialize+0x90 14 04fcf6f0 00000000 ntdll!LdrInitializeThunk+0x10 rnk, any ideas what's going on here?
,
Oct 25 2017
Looks like InternalGetProcAddress is failing:
uptr dllThunkGetRealAddrOrDie(const char *name) {
uptr ret =
__interception::InternalGetProcAddress((void *)GetModuleHandleA(0), name);
if (!ret)
abort();
return ret;
}
So something may have caused __asan_get_free_stack to not be exported by chrome.exe, or otherwise modified the export table or PE headers.
,
Oct 25 2017
Looking at the export list from dumpbin shows that indeed it is not exported.
,
Oct 25 2017
The previous asan build (510178) *does* export __asan_get_free_stack from chrome.exe, so something changed in the way we link chrome.exe.
,
Oct 25 2017
One interesting change in that set is the upgrade to the VS 15.4 tools: https://chromium.googlesource.com/chromium/src/+/70155edf7d3f4ffa931adc3077968ee99b9462ac This upgrades the linker from version 14.11.25507.1 to 14.11.25547.0. I'm going to try linking some small examples with different linker versions to check if that's the issue.
,
Oct 25 2017
The upgrade to 15.4 would also have changed the CRT libraries, presumably, which could be the issue.
,
Oct 25 2017
The /wholearchive: flag doesn't seem to work in the new linker version. I compared the verbose output of the linker, and in the old version I see:
Starting pass 1
Processed /DEFAULTLIB:libcmt.lib
Processed /DEFAULTLIB:oldnames.lib
Loaded clang_rt.asan-i386.lib(ubsan_value.cc.obj)
Loaded clang_rt.asan-i386.lib(ubsan_handlers.cc.obj)
Loaded clang_rt.asan-i386.lib(ubsan_flags.cc.obj)
Loaded clang_rt.asan-i386.lib(ubsan_init.cc.obj)
Loaded clang_rt.asan-i386.lib(ubsan_diag.cc.obj)
This suggests that it loads all objects in the asan library unconditionally. The new linker output looks like:
Starting pass 1
Processed /DEFAULTLIB:libcmt.lib
Processed /DEFAULTLIB:oldnames.lib
Searching libraries
Searching C:\src\llvm-project\build_x86_stage1\lib\clang\6.0.0\lib\windows\\clang_rt.asan-i386.lib:
Found ___asan_init
Referenced in t.obj
Loaded clang_rt.asan-i386.lib(asan_rtl.cc.obj)
Found "void __cdecl __asan::AsanDeactivate(void)" (?AsanDeactivate@__asan@@YAXXZ)
Referenced in clang_rt.asan-i386.lib(asan_rtl.cc.obj)
Loaded clang_rt.asan-i386.lib(asan_activation.cc.obj)
I guess we can report that as a bug. For now we may be able to work around it with a few /include:__asan_* linker flags, but it's a hack.
,
Oct 25 2017
I tried to create a minimal repro of this and I failed. I created these two source files:
C:\src\code\repros\wholearchive>type main.cpp
#include <stdio.h>
int main() {
printf("I'm in main.\n");
}
C:\src\code\repros\wholearchive>type lib.cpp
#include <stdio.h>
class Foo {
public:
Foo() {
printf("Hello world - I am being globally constructed!\n");
}
};
Foo g_foo;
I then compiled both files, created a .lib file, and then linked main.exe twice:
cl /c main.cpp
cl /c lib.cpp
lib lib.obj /out:lib.lib
link main.obj lib.lib
main.exe
link main.obj /wholearchive:lib.lib
main.exe
The results from the two copies of main.exe were as expected if /wholearchive was working correctly:
I'm in main.
Hello world - I am being globally constructed!
I'm in main.
That doesn't mean that /wholearchive isn't broken, but it does make reporting the bug in a useful way more difficult.
,
Oct 25 2017
Bummer, I assumed it would be easy. You should be able to repro with the asan archive in Chromium right now. I did this in a 32-bit VC environment: $ cl -c main.cpp $ ../../third_party/depot_tools/win_toolchain/vs_files/88c3b62e1eb0893b8cd57e3f4859c3af27907f64/VC/Tools/MSVC/14.11.2 5503/bin/Hostx64/x86/link.exe main.obj -libpath:../../third_party/llvm-build/Release+Asserts/lib/clang/6.0.0/lib/windows/ -wholearchive:clang_rt.asan- i386.lib Microsoft (R) Incremental Linker Version 14.11.25547.0 Copyright (C) Microsoft Corporation. All rights reserved. $ dumpbin -exports main.exe | grep asan_get_free # not present
,
Oct 25 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/cb2c3cc702d623e69ede3d7950b7af3afcb20ab6 commit cb2c3cc702d623e69ede3d7950b7af3afcb20ab6 Author: Reid Kleckner <rnk@google.com> Date: Wed Oct 25 22:47:25 2017 Work around VC 15.4 /wholearchive linker bug for ASan I randomly chose a symbol from the archive member that the linker is failing to load and included it in the link with /include. We can remove this extra linker flag when /wholearchive works again. Etienne added /wholearchive back in http://crrev.com/451836 R=ochang@chromium.org BUG= 777087 Change-Id: I39298f0a690e05bf6758d8dc12d569762d551ca1 Reviewed-on: https://chromium-review.googlesource.com/737867 Reviewed-by: Oliver Chang <ochang@chromium.org> Commit-Queue: Reid Kleckner <rnk@chromium.org> Cr-Commit-Position: refs/heads/master@{#511625} [modify] https://crrev.com/cb2c3cc702d623e69ede3d7950b7af3afcb20ab6/build/config/sanitizers/BUILD.gn
,
Oct 26 2017
It's actually a bug with relative libpath flags. I moved your example into a "libasan" directory and did this: C:\src\chromium\src\out\clang>link main.obj -libpath:libasan -wholearchive:lib.lib Microsoft (R) Incremental Linker Version 14.11.25547.0 Copyright (C) Microsoft Corporation. All rights reserved. C:\src\chromium\src\out\clang>main.exe Hello world - I am being globally constructed! I'm in main. C:\src\chromium\src\out\clang>link main.obj -libpath:./libasan -wholearchive:lib.lib Microsoft (R) Incremental Linker Version 14.11.25547.0 Copyright (C) Microsoft Corporation. All rights reserved. C:\src\chromium\src\out\clang>main.exe I'm in main. Our gn config uses relative paths, like this: /LIBPATH:../../third_party/llvm-build/Release+Asserts/lib/clang/6.0.0/lib/windows I guess that doesn't work with the new linker.
,
Oct 26 2017
Okay, I understand what is going on now. The slashes in the /LIBPATH argument are no longer supported path-separators in 15.4, and -wholearchive is ignoring the library that it can't find (old behavior exposed by the /LIBPATH change). Details: I experimented with having lib.lib in a subdirectory and quickly discovered a relevant quirk: -wholearchive doesn't care whether the specified library exists or not. For instance, this is totally fine: link main.obj -wholearchive:file_existence_errors_are_for_chumps.lib With the knowledge that missing -wholearchive libs are silently ignored I was able to find the root cause, which is that -libpath no longer supports slashes. It requires backslashes. Here's my modified test batch file (lib.cpp in a lib subdirectory and wholearchive cl /c main.cpp pushd lib cl /c lib.cpp lib lib.obj /out:lib.lib popd link main.obj lib\lib.lib main.exe link main.obj /libpath:./lib -wholearchive:lib.lib main.exe With the 15.3.2 toolchain (a9e1098bba66d2acccc377d5ee81265910f29272 hash) this works (the second invocation of main prints the constructor message) but with the 15.4 toolchain (88c3b62e1eb0893b8cd57e3f4859c3af27907f64 hash) it does not. I'll file a bug for -wholearchive silently ignoring missing files, but it's possible that this is by-design for some obscure reason. I *could* file a bug for /libpath not handling slashes, but I suspect it will be WontFixed because backslashes are the correct separators. Obviously this discovery opens up the possibility of alternate fixes - changing the slashes to backslashes. I filed a bug for /wholearchive silently ignoring missing .lib files: https://developercommunity.visualstudio.com/content/problem/139047/wholearchive-silently-ignores-missing-lib-files.html
,
Oct 26 2017
Please file a bug for slashes not working. It's a regression, and software relied on having both work (cf this bug). In most dev tooling pieces, both ways work.
,
Oct 27 2017
Jessie Wu [MSFT] changed the status of the following problem: "wholearchive silently ignores missing .lib files" to Fixed - Pending Release and added a comment. Thank you for your feedback! We have fixed the problem in an upcoming release. Thank you for helping us build a better Visual Studio! That at least means that this problem will no longer silently fail - it will become a build error. If they don't fix the slash/back-slash issue then we may need to stop using /wholearchive when the /wholearchive fix ships (which is not really a big sacrifice given that it is currently a NOP)
,
Oct 31 2017
They have fixed the slash/back-slash issue as well. "Thank you for your feedback! We have fixed the problem in an upcoming release. Thank you for helping us build a better Visual Studio!"
,
Nov 8 2017
Feel free to close once you have updated the toolchain.
,
Nov 8 2017
The crash should no longer be happening (due to the workarounds) but I'm going to leave this open until we have a fixed toolchain (15.5?) that avoids these issues. Note that we actually went back to 15.3 for different reasons, but that doesn't really count as "fixing" the issue. Assigning to myself so that I remember to close it if/when 15.5 fixes the underlying issues.
,
Nov 13 2017
The revert of the 15.4 toolchain has resolved this bug. I debated lowering the priority but decided that closing the bug made the most sense. I will test that /libpath handles slashes and that /wholearchive fails if libs are missing before we switch to the 15.5 toolchain. Here is the CL that resolved this bug: The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/db45606398cf4389bf332b0cdcffd04e7de4a4f6 commit db45606398cf4389bf332b0cdcffd04e7de4a4f6 Author: Bruce Dawson <brucedawson@chromium.org> Date: Fri Nov 03 23:48:27 2017 Revert "VS 2017 15.4 with 10.0.15063.468 SDK" This reverts commit 70155edf7d3f4ffa931adc3077968ee99b9462ac. Reason for revert: the 15.4 toolchain breaks incremental linking for Chrome ( crbug.com/780161 ) Original change's description: > VS 2017 15.4 with 10.0.15063.468 SDK > > This change switches the VS 2017 package to use VS 2017 Update 4 while > still using the 10.0.15063.468 SDK. Update 4 fixes at least one code-gen > bug ( crbug.com/759402 ) but the 10.0.16299.0 SDK is still incompatible > with Chrome ( crbug.com/773476 ). > > Packaging was done on a Windows Server 2016 VM, cleanly created for this > purpose. > > Compiler was packaged up by downloading VS 2017 Update 4, from > https://www.visualstudio.com/vs/, and then passing these parameters to > the installer: > > --add Microsoft.VisualStudio.Workload.NativeDesktop > --add Microsoft.VisualStudio.Component.VC.ATLMFC --includeRecommended > --passive > > Then the Windows 10.0.15063.468 SDK installer was run and everything was > installed except for the Windows Performance Toolkit, .Net Framework, > and the arm SDKs. > > Then the packaging script was run like this: > > python depot_tools\win_toolchain\package_from_installed.py 2017 -w 10.0.15063.0 > > Bug: 773476 , 759402 > Change-Id: Ie2176b5ff765d9e5497f51a7b00c02fad04fb973 > Reviewed-on: https://chromium-review.googlesource.com/727523 > Reviewed-by: Scott Graham <scottmg@chromium.org> > Commit-Queue: Bruce Dawson <brucedawson@chromium.org> > Cr-Commit-Position: refs/heads/master@{#510207} TBR=thakis@chromium.org,brucedawson@chromium.org,scottmg@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 773476 , 759402 Change-Id: Ica63e9740c954499b85ee891fb3bec0d48dd70fb Reviewed-on: https://chromium-review.googlesource.com/753422 Reviewed-by: Bruce Dawson <brucedawson@chromium.org> Reviewed-by: John Abd-El-Malek <jam@chromium.org> Commit-Queue: Bruce Dawson <brucedawson@chromium.org> Cr-Commit-Position: refs/heads/master@{#513965} [modify] https://crrev.com/db45606398cf4389bf332b0cdcffd04e7de4a4f6/build/vs_toolchain.py
,
May 25 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/add2efb3a194e081d759a51117e8fe863f1380cf commit add2efb3a194e081d759a51117e8fe863f1380cf Author: Reid Kleckner <rnk@google.com> Date: Fri May 25 22:37:43 2018 Drop redundant ASan library from exe link line This works around a bug (https://llvm.org/pr37592) in LLD, where if foo.lib appears on the link line without -wholearchive:, it doesn't take effect. Instead, only pass the -wholearchive:foo.lib flag, which adds the library as an input and turns on the wholearchive behavior that we need. Fixes base_unittests StackSamplingProfilerTest.OtherLibrary and related tests, which were failing because chrome.exe was missing an exported function. Remove the /include flag workaround for https://crbug.com/777087 . BUG= 777087 , 844398 R=mmoroz@chromium.org Change-Id: I493e1dcf6963048f7e83df1c937b4a4a62dd96bb Reviewed-on: https://chromium-review.googlesource.com/1073890 Reviewed-by: Max Moroz <mmoroz@chromium.org> Commit-Queue: Reid Kleckner <rnk@chromium.org> Cr-Commit-Position: refs/heads/master@{#562042} [modify] https://crrev.com/add2efb3a194e081d759a51117e8fe863f1380cf/build/config/sanitizers/BUILD.gn |
||||||||||||
►
Sign in to add a comment |
||||||||||||
Comment 1 by mmoroz@chromium.org
, Oct 21 2017