New issue
Advanced search Search tips

Issue 777087 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug



Sign in to add a comment

LKGR Windows ASan build is crashing on a startup since 510178:510268

Project Member Reported by mmoroz@chromium.org, Oct 21 2017

Issue description

https://pantheon.corp.google.com/logs/viewer?project=google.com%3Aclusterfuzz&minLogLevel=500&expandAll=false&resource=gce_instance&timestamp=2017-10-21T04%3A38%3A32.000000000Z&dateRangeStart=2017-10-21T03%3A38%3A33.353Z&dateRangeEnd=2017-10-21T04%3A38%3A33.353Z&interval=PT1H&advancedFilter=resource.type%3D%22gce_instance%22%0Aresource.labels.zone%3D%22us-central1-f%22%0Aresource.labels.project_id%3D%22google.com%3Aclusterfuzz%22%0Aresource.labels.instance_id%3D%224846894608969394503%22%0Atimestamp%3D%222017-10-21T04%3A38%3A18.000000000Z%22%0AinsertId%3D%22nbk9myg1dzb1h6%22


Faulting application name: chrome.exe, version: 64.0.3246.0, time stamp: 0x59eabe13
Faulting module name: chrome_elf.dll, version: 64.0.3246.0, time stamp: 0x59eabb85
Exception code: 0x4000001f
Fault offset: 0x000e1a87
Faulting process id: 0xbf4
Faulting application start time: 0x01d34a266a1ce7a0
Faulting application path: c:\clusterfuzz\slave-bot\builds\chrome-test-builds_media_win32-release_e999b74784b15cd4b248500a1cd5cd3743d01362\revisions\asan-win32-release-510633\chrome.exe
Faulting module path: c:\clusterfuzz\slave-bot\builds\chrome-test-builds_media_win32-release_e999b74784b15cd4b248500a1cd5cd3743d01362\revisions\asan-win32-release-510633\chrome_elf.dll
Report Id: a7d09adb-b619-11e7-80e2-e33582d73f46
Faulting package full name: 
Faulting package-relative application ID: 

 

Comment 1 by mmoroz@chromium.org, Oct 21 2017

Looks like a startup crash on a recent revision: https://build.chromium.org/p/chromium.lkgr/builders/Win%20ASan%20Release%20Media/builds/5603

Need to find the first revision when it started to happen.

Comment 3 by mmoroz@chromium.org, Oct 21 2017

Labels: OS-Windows

Comment 4 by mmoroz@chromium.org, Oct 21 2017

Ah, that was the wrong buildbot. The build came from https://build.chromium.org/p/chromium.lkgr/builders/Win%20ASan%20Release/builds/7009

Comment 5 by mmoroz@chromium.org, Oct 21 2017

r510178 was the latest good build we had

Comment 6 by mmoroz@chromium.org, Oct 21 2017

Summary: LKGR Windows ASan build is crashing on a startup since 510178:510268 (was: Windows failures)

Comment 7 Deleted

Comment 9 by mmoroz@chromium.org, Oct 21 2017

No idea what is causing the crash...
Cc: h...@chromium.org r...@chromium.org
Owner: thakis@chromium.org
Abhishek suggested over email that https://chromium.googlesource.com/chromium/src/+/0c0ade8e414850c6cf64fc724344c6ba350cbe6d could be the root cause
Owner: mmoroz@chromium.org
https://chromium.googlesource.com/chromium/src/+/0c0ade8e414850c6cf64fc724344c6ba350cbe6d can't be to blame for sure, it just adds a tool that's not used by anything.

https://chromium.googlesource.com/chromium/src/+/82f6080f7d0e71eabce5af2e363d375db323bea7 did hook it up, but it compares the output of rc.py to that of rc.exe (which was used before) and makes sure it's bitwise identical, which proves that the change is behavior-preserving on Windows.

So I'm 99% sure that's not it.
Oh the "hook up" cl isn't even in https://chromium.googlesource.com/chromium/src/+log/866a0da..cad2e24 -- so yes, that's not it. That just adds a tool not called by anything yet.
Cc: mmoroz@chromium.org
Owner: ----
Thanks Nico for taking a look.
Owner: och...@chromium.org
Status: Started (was: Untriaged)
assigning to myself as this week's sheriff..
Cc: -r...@chromium.org och...@chromium.org
Owner: r...@chromium.org
Status: Assigned (was: Started)
Looks like it's dying here:

(1ab8.1c78): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00000048 ecx=fffffffc edx=00000000 esi=6fd01dd8 edi=00000000
eip=77cadde3 esp=04fcf258 ebp=04fcf2b8 iopl=0         nv up ei pl nz ac po cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010213
ntdll!RtlpWaitOnCriticalSection+0xac:
77cadde3 ff4014          inc     dword ptr [eax+14h]  ds:002b:00000014=????????

00 04fcf2b8 77cadc8b ntdll!RtlpWaitOnCriticalSection+0xac
01 04fcf2e4 77cadcb5 ntdll!RtlpEnterCriticalSectionContended+0xa0
*** WARNING: Unable to verify checksum for C:\clusterfuzz\slave-bot\builds\chromium-browser-asan_win32-release_a15986b6468dcac5c0ae9a2e0729fd95d8ce33c8\revisions\asan-win32-release-511225\chrome_elf.dll
02 04fcf2ec 6fc4a4fa ntdll!RtlEnterCriticalSection+0x42
03 04fcf2f8 6fc444cb chrome_elf!__acrt_lock+0x15 [minkernel\crts\ucrt\src\appcrt\internal\locks.cpp @ 55]
04 (Inline) -------- chrome_elf!__acrt_lock_and_call::__l2::<lambda_cbab9ec6f41b0180b23cc171c22676b0>::operator()+0xa [minkernel\crts\ucrt\inc\corecrt_internal.h @ 909]
05 04fcf330 6fc44679 chrome_elf!__crt_seh_guarded_call<void (__cdecl*)(int)>::operator()<<lambda_cbab9ec6f41b0180b23cc171c22676b0>,<lambda_44731a7d0e6d81c3e6aa82d741081786> &,<lambda_4b292cb8dd18144e164572427af410ab> >+0x1a [minkernel\crts\ucrt\devdiv\vcruntime\inc\internal_shared.h @ 201]
06 (Inline) -------- chrome_elf!__acrt_lock_and_call+0x1d [minkernel\crts\ucrt\inc\corecrt_internal.h @ 908]
07 04fcf350 6fc41cf8 chrome_elf!__acrt_get_sigabrt_handler+0x25 [minkernel\crts\ucrt\src\appcrt\misc\signal.cpp @ 542]
08 04fcf354 6fc348c2 chrome_elf!abort+0x5 [minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 62]
09 (Inline) -------- chrome_elf!__sanitizer::dllThunkGetRealAddrOrDie+0x1e [e:\b\build\slave\win_upload_clang\build\src\third_party\llvm\projects\compiler-rt\lib\sanitizer_common\sanitizer_win_dll_thunk.cc @ 30]
0a 04fcf360 6fc2f98f chrome_elf!__sanitizer::dllThunkIntercept+0x22 [e:\b\build\slave\win_upload_clang\build\src\third_party\llvm\projects\compiler-rt\lib\sanitizer_common\sanitizer_win_dll_thunk.cc @ 36]
0b 04fcf380 77cd96de chrome_elf!intercept___asan_get_free_stack+0xf [e:\b\build\slave\win_upload_clang\build\src\third_party\llvm\projects\compiler-rt\lib\asan\asan_interface.inc @ 32]
0c 04fcf3a0 77cd9658 ntdll!LdrxCallInitRoutine+0x16
0d 04fcf3f0 77d2d625 ntdll!LdrpCallInitRoutine+0x43
0e 04fcf430 77cabd89 ntdll!LdrpCallTlsInitializers+0x7de35
0f 04fcf4a8 77ce776d ntdll!LdrpInitializeNode+0x1b0
10 04fcf4cc 77ceb530 ntdll!LdrpInitializeGraph+0x59
11 04fcf4f0 77cedca9 ntdll!LdrpInitializeGraph+0x70
12 04fcf698 77cece65 ntdll!LdrpInitializeProcess+0x1614
13 04fcf6e8 77cea9d0 ntdll!_LdrpInitialize+0x90
14 04fcf6f0 00000000 ntdll!LdrInitializeThunk+0x10

rnk, any ideas what's going on here?

Comment 16 by r...@chromium.org, Oct 25 2017

Looks like InternalGetProcAddress is failing:

uptr dllThunkGetRealAddrOrDie(const char *name) {
  uptr ret =
      __interception::InternalGetProcAddress((void *)GetModuleHandleA(0), name);
  if (!ret)
    abort();
  return ret;
}

So something may have caused __asan_get_free_stack to not be exported by chrome.exe, or otherwise modified the export table or PE headers.

Comment 17 by r...@chromium.org, Oct 25 2017

Looking at the export list from dumpbin shows that indeed it is not exported.

Comment 18 by r...@chromium.org, Oct 25 2017

The previous asan build (510178) *does* export __asan_get_free_stack from chrome.exe, so something changed in the way we link chrome.exe.

Comment 19 by r...@chromium.org, Oct 25 2017

Cc: brucedaw...@chromium.org
One interesting change in that set is the upgrade to the VS 15.4 tools: https://chromium.googlesource.com/chromium/src/+/70155edf7d3f4ffa931adc3077968ee99b9462ac

This upgrades the linker from version 14.11.25507.1 to 14.11.25547.0. I'm going to try linking some small examples with different linker versions to check if that's the issue.
The upgrade to 15.4 would also have changed the CRT libraries, presumably, which could be the issue.

Comment 21 by r...@chromium.org, Oct 25 2017

The /wholearchive: flag doesn't seem to work in the new linker version. I compared the verbose output of the linker, and in the old version I see:

Starting pass 1
Processed /DEFAULTLIB:libcmt.lib
Processed /DEFAULTLIB:oldnames.lib
        Loaded clang_rt.asan-i386.lib(ubsan_value.cc.obj)
        Loaded clang_rt.asan-i386.lib(ubsan_handlers.cc.obj)
        Loaded clang_rt.asan-i386.lib(ubsan_flags.cc.obj)
        Loaded clang_rt.asan-i386.lib(ubsan_init.cc.obj)
        Loaded clang_rt.asan-i386.lib(ubsan_diag.cc.obj)

This suggests that it loads all objects in the asan library unconditionally. The new linker output looks like:

Starting pass 1
Processed /DEFAULTLIB:libcmt.lib
Processed /DEFAULTLIB:oldnames.lib

Searching libraries
    Searching C:\src\llvm-project\build_x86_stage1\lib\clang\6.0.0\lib\windows\\clang_rt.asan-i386.lib:
      Found ___asan_init
        Referenced in t.obj
        Loaded clang_rt.asan-i386.lib(asan_rtl.cc.obj)
      Found "void __cdecl __asan::AsanDeactivate(void)" (?AsanDeactivate@__asan@@YAXXZ)
        Referenced in clang_rt.asan-i386.lib(asan_rtl.cc.obj)
        Loaded clang_rt.asan-i386.lib(asan_activation.cc.obj)

I guess we can report that as a bug. For now we may be able to work around it with a few /include:__asan_* linker flags, but it's a hack.
I tried to create a minimal repro of this and I failed. I created these two source files:


C:\src\code\repros\wholearchive>type main.cpp
#include <stdio.h>

int main() {
  printf("I'm in main.\n");
}

C:\src\code\repros\wholearchive>type lib.cpp
#include <stdio.h>

class Foo {
public:
  Foo() {
    printf("Hello world - I am being globally constructed!\n");
  }
};

Foo g_foo;



I then compiled both files, created a .lib file, and then linked main.exe twice:

cl /c main.cpp
cl /c lib.cpp
lib lib.obj /out:lib.lib
link main.obj lib.lib
main.exe
link main.obj /wholearchive:lib.lib
main.exe


The results from the two copies of main.exe were as expected if /wholearchive was working correctly:

I'm in main.

Hello world - I am being globally constructed!
I'm in main.


That doesn't mean that /wholearchive isn't broken, but it does make reporting the bug in a useful way more difficult.

Comment 23 by r...@chromium.org, Oct 25 2017

Bummer, I assumed it would be easy.

You should be able to repro with the asan archive in Chromium right now. I did this in a 32-bit VC environment:

$ cl -c main.cpp

$ ../../third_party/depot_tools/win_toolchain/vs_files/88c3b62e1eb0893b8cd57e3f4859c3af27907f64/VC/Tools/MSVC/14.11.2
5503/bin/Hostx64/x86/link.exe main.obj -libpath:../../third_party/llvm-build/Release+Asserts/lib/clang/6.0.0/lib/windows/ -wholearchive:clang_rt.asan-
i386.lib
Microsoft (R) Incremental Linker Version 14.11.25547.0
Copyright (C) Microsoft Corporation.  All rights reserved.

$ dumpbin -exports main.exe | grep asan_get_free
# not present
Project Member

Comment 24 by bugdroid1@chromium.org, Oct 25 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/cb2c3cc702d623e69ede3d7950b7af3afcb20ab6

commit cb2c3cc702d623e69ede3d7950b7af3afcb20ab6
Author: Reid Kleckner <rnk@google.com>
Date: Wed Oct 25 22:47:25 2017

Work around VC 15.4 /wholearchive linker bug for ASan

I randomly chose a symbol from the archive member that the linker is
failing to load and included it in the link with /include. We can remove
this extra linker flag when /wholearchive works again.

Etienne added /wholearchive back in http://crrev.com/451836

R=ochang@chromium.org
BUG= 777087 

Change-Id: I39298f0a690e05bf6758d8dc12d569762d551ca1
Reviewed-on: https://chromium-review.googlesource.com/737867
Reviewed-by: Oliver Chang <ochang@chromium.org>
Commit-Queue: Reid Kleckner <rnk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#511625}
[modify] https://crrev.com/cb2c3cc702d623e69ede3d7950b7af3afcb20ab6/build/config/sanitizers/BUILD.gn

Comment 25 by r...@chromium.org, Oct 26 2017

It's actually a bug with relative libpath flags. I moved your example into a "libasan" directory and did this:

C:\src\chromium\src\out\clang>link main.obj -libpath:libasan -wholearchive:lib.lib
Microsoft (R) Incremental Linker Version 14.11.25547.0
Copyright (C) Microsoft Corporation.  All rights reserved.


C:\src\chromium\src\out\clang>main.exe
Hello world - I am being globally constructed!
I'm in main.

C:\src\chromium\src\out\clang>link main.obj -libpath:./libasan -wholearchive:lib.lib
Microsoft (R) Incremental Linker Version 14.11.25547.0
Copyright (C) Microsoft Corporation.  All rights reserved.


C:\src\chromium\src\out\clang>main.exe
I'm in main.

Our gn config uses relative paths, like this: /LIBPATH:../../third_party/llvm-build/Release+Asserts/lib/clang/6.0.0/lib/windows

I guess that doesn't work with the new linker.
Okay, I understand what is going on now. The slashes in the /LIBPATH argument are no longer supported path-separators in 15.4, and -wholearchive is ignoring the library that it can't find (old behavior exposed by the /LIBPATH change).

Details:

I experimented with having lib.lib in a subdirectory and quickly discovered a relevant quirk: -wholearchive doesn't care whether the specified library exists or not. For instance, this is totally fine:

link main.obj -wholearchive:file_existence_errors_are_for_chumps.lib

With the knowledge that missing -wholearchive libs are silently ignored I was able to find the root cause, which is that -libpath no longer supports slashes. It requires backslashes.

Here's my modified test batch file (lib.cpp in a lib subdirectory and wholearchive

cl /c main.cpp
pushd lib
cl /c lib.cpp
lib lib.obj /out:lib.lib
popd
link main.obj lib\lib.lib
main.exe
link main.obj /libpath:./lib -wholearchive:lib.lib
main.exe


With the 15.3.2 toolchain (a9e1098bba66d2acccc377d5ee81265910f29272 hash) this works (the second invocation of main prints the constructor message) but with the 15.4 toolchain (88c3b62e1eb0893b8cd57e3f4859c3af27907f64 hash) it does not.

I'll file a bug for -wholearchive silently ignoring missing files, but it's possible that this is by-design for some obscure reason.

I *could* file a bug for /libpath not handling slashes, but I suspect it will be WontFixed because backslashes are the correct separators.

Obviously this discovery opens up the possibility of alternate fixes - changing the slashes to backslashes.


I filed a bug for /wholearchive silently ignoring missing .lib files:

https://developercommunity.visualstudio.com/content/problem/139047/wholearchive-silently-ignores-missing-lib-files.html

Please file a bug for slashes not working. It's a regression, and software relied on having both work (cf this bug). In most dev tooling pieces, both ways work.
Jessie Wu [MSFT] changed the status of the following problem:

"wholearchive silently ignores missing .lib files"
to Fixed - Pending Release and added a comment.

Thank you for your feedback! We have fixed the problem in an upcoming release. Thank you for helping us build a better Visual Studio!


That at least means that this problem will no longer silently fail - it will become a build error. If they don't fix the slash/back-slash issue then we may need to stop using /wholearchive when the /wholearchive fix ships (which is not really a big sacrifice given that it is currently a NOP)

They have fixed the slash/back-slash issue as well.

"Thank you for your feedback! We have fixed the problem in an upcoming release. Thank you for helping us build a better Visual Studio!"
Components: -Tools>Stability>ClusterFuzz
Feel free to close once you have updated the toolchain.
Owner: brucedaw...@chromium.org
The crash should no longer be happening (due to the workarounds) but I'm going to leave this open until we have a fixed toolchain (15.5?) that avoids these issues.

Note that we actually went back to 15.3 for different reasons, but that doesn't really count as "fixing" the issue.

Assigning to myself so that I remember to close it if/when 15.5 fixes the underlying issues.
Status: Fixed (was: Assigned)
The revert of the 15.4 toolchain has resolved this bug. I debated lowering the priority but decided that closing the bug made the most sense. I will test that /libpath handles slashes and that /wholearchive fails if libs are missing before we switch to the 15.5 toolchain. Here is the CL that resolved this bug:


The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/db45606398cf4389bf332b0cdcffd04e7de4a4f6

commit db45606398cf4389bf332b0cdcffd04e7de4a4f6
Author: Bruce Dawson <brucedawson@chromium.org>
Date: Fri Nov 03 23:48:27 2017

Revert "VS 2017 15.4 with 10.0.15063.468 SDK"

This reverts commit 70155edf7d3f4ffa931adc3077968ee99b9462ac.

Reason for revert: the 15.4 toolchain breaks incremental linking for
Chrome (  crbug.com/780161  )

Original change's description:
> VS 2017 15.4 with 10.0.15063.468 SDK
> 
> This change switches the VS 2017 package to use VS 2017 Update 4 while
> still using the 10.0.15063.468 SDK. Update 4 fixes at least one code-gen
> bug (   crbug.com/759402   ) but the 10.0.16299.0 SDK is still incompatible
> with Chrome ( crbug.com/773476 ).
> 
> Packaging was done on a Windows Server 2016 VM, cleanly created for this
> purpose.
> 
> Compiler was packaged up by downloading VS 2017 Update 4, from
> https://www.visualstudio.com/vs/, and then passing these parameters to
> the installer:
> 
>     --add Microsoft.VisualStudio.Workload.NativeDesktop
>     --add Microsoft.VisualStudio.Component.VC.ATLMFC --includeRecommended
>     --passive
> 
> Then the Windows 10.0.15063.468 SDK installer was run and everything was
> installed except for the Windows Performance Toolkit, .Net Framework,
> and the arm SDKs.
> 
> Then the packaging script was run like this:
> 
>     python depot_tools\win_toolchain\package_from_installed.py 2017 -w 10.0.15063.0
> 
> Bug:  773476 ,   759402   
> Change-Id: Ie2176b5ff765d9e5497f51a7b00c02fad04fb973
> Reviewed-on: https://chromium-review.googlesource.com/727523
> Reviewed-by: Scott Graham <scottmg@chromium.org>
> Commit-Queue: Bruce Dawson <brucedawson@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#510207}

TBR=thakis@chromium.org,brucedawson@chromium.org,scottmg@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug:  773476 ,    759402   
Change-Id: Ica63e9740c954499b85ee891fb3bec0d48dd70fb
Reviewed-on: https://chromium-review.googlesource.com/753422
Reviewed-by: Bruce Dawson <brucedawson@chromium.org>
Reviewed-by: John Abd-El-Malek <jam@chromium.org>
Commit-Queue: Bruce Dawson <brucedawson@chromium.org>
Cr-Commit-Position: refs/heads/master@{#513965}
[modify] https://crrev.com/db45606398cf4389bf332b0cdcffd04e7de4a4f6/build/vs_toolchain.py
Project Member

Comment 34 by bugdroid1@chromium.org, May 25 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/add2efb3a194e081d759a51117e8fe863f1380cf

commit add2efb3a194e081d759a51117e8fe863f1380cf
Author: Reid Kleckner <rnk@google.com>
Date: Fri May 25 22:37:43 2018

Drop redundant ASan library from exe link line

This works around a bug (https://llvm.org/pr37592) in LLD, where if
foo.lib appears on the link line without -wholearchive:, it doesn't take
effect. Instead, only pass the -wholearchive:foo.lib flag, which adds
the library as an input and turns on the wholearchive behavior that we
need.

Fixes base_unittests StackSamplingProfilerTest.OtherLibrary and related
tests, which were failing because chrome.exe was missing an exported
function.

Remove the /include flag workaround for  https://crbug.com/777087 .

BUG= 777087 , 844398 
R=mmoroz@chromium.org

Change-Id: I493e1dcf6963048f7e83df1c937b4a4a62dd96bb
Reviewed-on: https://chromium-review.googlesource.com/1073890
Reviewed-by: Max Moroz <mmoroz@chromium.org>
Commit-Queue: Reid Kleckner <rnk@chromium.org>
Cr-Commit-Position: refs/heads/master@{#562042}
[modify] https://crrev.com/add2efb3a194e081d759a51117e8fe863f1380cf/build/config/sanitizers/BUILD.gn

Sign in to add a comment