Windows builders fail with foo.exe has returned non-zero status: -1073741819
Issue description

I noticed this both yesterday and today during my sheriff shift, at 17:06 and 17:37. (There was previously a bug that caused failures at specific times during APAC shifts, which is why I mention the time, although it may be a coincidence in this case.)

The build fails with various similar messages about processes returning large negative statuses. For example:

FAILED: gen/ipc/test_proto.pb.h gen/ipc/test_proto.pb.cc pyproto/ipc/test_proto_pb2.py
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../tools/protoc_wrapper/protoc_wrapper.py test_proto.proto --protoc ./protoc.exe --proto-in-dir ../../ipc --cc-out-dir gen/ipc --py-out-dir pyproto/ipc
Protoc has returned non-zero status: -1073740791 .

FAILED: gen/content/test/fuzzer/html_tree.pb.h gen/content/test/fuzzer/html_tree.pb.cc pyproto/content/test/fuzzer/html_tree_pb2.py
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../tools/protoc_wrapper/protoc_wrapper.py html_tree.proto --protoc ./protoc.exe --proto-in-dir ../../content/test/fuzzer --cc-out-dir gen/content/test/fuzzer --py-out-dir pyproto/content/test/fuzzer
Protoc has returned non-zero status: -1073741819 .

https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium%2FWin_x64%2F16196%2F%2B%2Frecipes%2Fsteps%2Fcompile%2F0%2Fstdout

FAILED: gen/blink/core/inspector/protocol.json.bro
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/gn_run_binary.py brotli.exe --force --no-copy-stat gen/blink/core/inspector/protocol.json -o gen/blink/core/inspector/protocol.json.bro
brotli.exe failed with exit code -1073741819

https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.chrome%2FGoogle_Chrome_Win%2F23558%2F%2B%2Frecipes%2Fsteps%2Fcompile%2F0%2Fstdout
,
Nov 8 2017
We never found the cause; we moved some build files and it eventually went away. But IIRC +brucedawson or +dpranke (can't remember who) recently had an internal post where they had some theories about this (sorry, my memory of this is very weak; I hope I am not completely dreaming).
,
Nov 8 2017
Not all negative status results are the same. A large negative status basically just means an exception - you need to convert to hex to decode:

-1073741819: 0xC0000005 - STATUS_ACCESS_VIOLATION
-1073740791: 0xC0000409 - STATUS_STACK_BUFFER_OVERRUN

I've seen access violations randomly happen on my machine due to bogus binaries being generated - all zeroes. That seems really weird and inexplicable. I haven't seen it for a long time. I'm not sure what triggers STATUS_STACK_BUFFER_OVERRUN. We'd really need to get copies of the bad binaries, or call stacks, to investigate what's going wrong.
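The hex conversion described above can be sketched in a few lines of Python. This is only an illustration: the function names and the two-entry lookup table are hypothetical, and the table holds just the two codes mentioned in this comment.

```python
# Minimal sketch: reinterpret a signed 32-bit exit code as an unsigned
# NTSTATUS value. The table below is an assumption for illustration and
# contains only the two codes named in this thread.
KNOWN_STATUSES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def to_ntstatus(exit_code: int) -> int:
    """Map a signed 32-bit exit code onto its unsigned 32-bit value."""
    return exit_code & 0xFFFFFFFF


def describe_exit_code(exit_code: int) -> str:
    """Format the exit code as hex, plus a name if it is in the table."""
    status = to_ntstatus(exit_code)
    name = KNOWN_STATUSES.get(status, "(not in this table)")
    return f"0x{status:08X} - {name}"


if __name__ == "__main__":
    for code in (-1073741819, -1073740791):
        print(f"{code}: {describe_exit_code(code)}")
```

Masking with 0xFFFFFFFF is enough because the build tools report the 32-bit status as a signed integer.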
,
Nov 8 2017
Thanks for looking into this! Could we find an owner for this bug so it doesn't fall through the cracks? Currently it stays on the sheriffs' radar since it doesn't have an owner yet.
,
Nov 8 2017
The fact that it still happens now that we have switched compilers (from VC++ to clang-cl) is a good data point because it proves that it isn't a compiler problem. The most likely explanation is that it is a linker problem. In the long term we will be moving away from the VC++ linker, but that won't be happening soon. Unless we can repro the problem, I'm not sure what we can do other than close as norepro and move on.
,
Nov 9 2017
This has just closed the tree again:

https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.chrome%2FGoogle_Chrome_Win%2F23700%2F%2B%2Frecipes%2Fsteps%2Fcompile%2F0%2Fstdout

------------------- 8< -------------------
[4817/44007] ACTION //content/browser/devtools:compressed_protocol_json(//build/toolchain/win:win_clang_x86)
FAILED: gen/blink/core/inspector/protocol.json.bro
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/gn_run_binary.py brotli.exe --force --no-copy-stat gen/blink/core/inspector/protocol.json -o gen/blink/core/inspector/protocol.json.bro
brotli.exe failed with exit code -1073741819
[4818/44007] ACTION //components/resources:compressed_about_credits(//build/toolchain/win:win_clang_x86)
FAILED: gen/components/resources/about_credits.bro
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/gn_run_binary.py brotli.exe --force --no-copy-stat gen/components/resources/about_credits.html -o gen/components/resources/about_credits.bro
brotli.exe failed with exit code -1073741819
------------------- 8< -------------------
,
Nov 9 2017
,
Nov 9 2017
Until someone fixes the bug, sheriffs are just going to keep filing similar bugs over and over again, since we don't do sheriff hand-offs. I filed bug 779660 before.
,
Nov 9 2017
,
Nov 10 2017
There's nothing actionable for Chromium sheriffs here. I don't know what to do about the fact that removing the Sheriff-Chromium label means that sheriffs will see this as a new problem if/when it shows up again (as Lei pointed out in c#8), but keeping the Sheriff-Chromium label just has it making noise on sheriff-o-matic.
,
Nov 17 2017
I hit this on one of my workstations and was able to investigate. In this case it was genstring.exe that was crashing. When I ran it, it crashed in mainCRTStartup and the assembly language looked like this:

000000014000109B 00 00 add byte ptr [rax],al
000000014000109D 00 00 add byte ptr [rax],al
000000014000109F 00 00 add byte ptr [rax],al
00000001400010A1 00 00 add byte ptr [rax],al
00000001400010A3 00 00 add byte ptr [rax],al
mainCRTStartup:
00000001400010A5 00 00 add byte ptr [rax],al
00000001400010A7 00 00 add byte ptr [rax],al
00000001400010A9 00 00 add byte ptr [rax],al
00000001400010AB 00 00 add byte ptr [rax],al
00000001400010AD 00 00 add byte ptr [rax],al
_get_startup_commit_mode:
00000001400010AF 00 00 add byte ptr [rax],al
00000001400010B1 00 00 add byte ptr [rax],al
00000001400010B3 00 00 add byte ptr [rax],al
00000001400010B5 00 00 add byte ptr [rax],al
00000001400010B7 00 00 add byte ptr [rax],al

I then forced a relink (no recompilation) and on the next run it worked and the code for mainCRTStartup looked like this:

__GSHandlerCheckCommon:
00000001400010A0 E9 1B 3F 00 00 jmp __GSHandlerCheckCommon (0140004FC0h)
mainCRTStartup:
00000001400010A5 E9 B6 22 00 00 jmp mainCRTStartup (0140003360h)
__scrt_get_dyn_tls_dtor_callback:
00000001400010AA E9 21 34 00 00 jmp __scrt_get_dyn_tls_dtor_callback (01400044D0h)
_get_startup_commit_mode:
00000001400010AF E9 BC 32 00 00 jmp _get_startup_commit_mode (0140004370h)

What's going on is that this is an array of five-byte thunks, used in incremental linking to let the linker move functions around easily. In the bad builds the thunks are all zeroes, which tends to be crashy. So...

1) It's not a compiler bug. The object files are fine because relinking fixes the issue. But we already knew this because the bug happened with both VC++ and clang.
2) It is an incremental linking linker bug.
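To make the thunk layout concrete, here is a minimal Python sketch that decodes one five-byte thunk the way the good disassembly above reads: opcode E9 followed by a little-endian signed 32-bit displacement, relative to the address of the next instruction. The function names are hypothetical, and this is an illustration rather than a tool that was used in the investigation.

```python
import struct
from typing import Optional

THUNK_SIZE = 5      # one E9 opcode byte plus a 4-byte relative displacement
JMP_REL32 = 0xE9    # x86/x64 near jump with a 32-bit displacement


def decode_thunk(address: int, raw: bytes) -> Optional[int]:
    """Return the jump target of a 5-byte thunk, or None if it is not a jmp."""
    if len(raw) != THUNK_SIZE:
        raise ValueError("a thunk is exactly 5 bytes")
    if raw[0] != JMP_REL32:
        return None
    (displacement,) = struct.unpack("<i", raw[1:])
    # The displacement is relative to the address just past the thunk.
    return (address + THUNK_SIZE + displacement) & 0xFFFFFFFFFFFFFFFF


def is_zeroed(raw: bytes) -> bool:
    """True when every byte is zero, as in the bad build."""
    return not any(raw)


if __name__ == "__main__":
    good = bytes.fromhex("E9B6220000")  # mainCRTStartup thunk from the good build
    bad = bytes(THUNK_SIZE)             # the same slot in the bad build
    print(hex(decode_thunk(0x1400010A5, good)))
    print(decode_thunk(0x1400010A5, bad), is_zeroed(bad))
```

For the good mainCRTStartup thunk this gives 0x140003360, matching the target the debugger shows. The zeroed bytes do not decode as a jump at all, which is why the debugger shows them as a run of two-byte `add byte ptr [rax],al` instructions.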
,
Nov 17 2017
,
Nov 17 2017
Comment 1 by no...@chromium.org
, Nov 7 2017
Labels: Type-Bug