link.exe occasionally says "invalid file or disk full" on webrtc bots
Issue description

More details in issue 614967. Here's the gist:

This Windows Clang builder https://build.chromium.org/p/client.webrtc/builders/Win32%20Debug%20%28Clang%29/builds/1748 fails with the following error when trying to build the audio_e2e_harness target in WebRTC:

FAILED: audio_e2e_harness.exe audio_e2e_harness.exe.pdb
E:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-with-manifests environment.x86 True audio_e2e_harness.exe "E:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /OUT:audio_e2e_harness.exe @audio_e2e_harness.exe.rsp" 1 mt.exe rc.exe "obj\webrtc\tools\audio_e2e_harness.audio_e2e_harness.exe.intermediate.manifest" obj\webrtc\tools\audio_e2e_harness.audio_e2e_harness.exe.generated.manifest ..\..\build\win\compatibility.manifest
obj\webrtc\modules\neteq.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x78654F5A

The disk isn't full.

The way to reproduce:
1) Make sure you have the same config as this bot: https://build.chromium.org/p/client.webrtc/builders/Win32%20Debug%20%28Clang%29
2) Check out webrtc. https://webrtc.org/native-code/development/
3) Apply this patch (which we cannot land now because of this issue): https://codereview.webrtc.org/2018553002/
4) Build:
ninja.exe -w dupbuild=err -C out\Debug All -j80
,
May 26 2016
,
May 26 2016
`gclient sync` has been printing "still working on: src" for 2h now. Is that expected?
,
May 26 2016
kjellander@ can better answer that but I guess it's downloading all of chrome and then some :/ if git is still downloading, then I guess it is expected.
,
May 26 2016
It eventually finished. 3) should be "apply https://codereview.webrtc.org/2014183003/" (or "revert $your_url")
,
May 26 2016
I can't repro this on my box. I'm using

C:\src\webrtc\src>echo %GYP_DEFINES%
clang=1 component=shared_library dcheck_always_on=1 ffmpeg_branding=Chrome rtc_use_h264=1 target_arch=ia32

C:\src\webrtc\src>ninja -C out\Debug All
ninja: Entering directory `out\Debug'
[415/415] STAMP obj\All.actions_depends.stamp
,
May 26 2016
https://bugs.chromium.org/p/chromium/issues/detail?id=599186#c19 feels somewhat similar -- are your libraries close to 2GB? If so, adding 'msvs_shard' to that target's gyp file will probably fix this (and you'd run into this with non-clang windows builds soon as well)
,
May 26 2016
The biggest binaries are between 100 and 200 MB in debug builds. The ones that failed are on the order of tens of MB. Is there a chance that this could be goma-related?
,
May 26 2016
No, linking doesn't run on goma.
,
May 26 2016
And no chance that the obj files from goma could be bad?
,
May 26 2016
It's possible in theory, but it seems very unlikely to me. Goma just runs clang-cl under wine and sends back the result. I've never seen goma sending back incorrect obj files before.
,
May 26 2016
I tried to repro this as well, but I'm unable to. I prepared yet another reland of the cl and sent it to all the Win Clang bots. They're all green: https://codereview.webrtc.org/2014973002/ It seems like the issue must somehow be local to that bot or bots.
,
May 26 2016
This seems to be related to EXPECT_DEATH tests in gtest. I landed most of the changes without problems but when it came to the push_resampler_unittest.cc file, things started to break.
,
May 26 2016
Also, in the x64 case, moving the DCHECKs out into a non-templatized function made the build pass.
,
May 26 2016
Do you have a link to a CL with just the EXPECT_DEATH bit that causes the failure, and another CL that makes things go on top of that?
,
May 26 2016
This commit causes failure: https://chromium.googlesource.com/external/webrtc/+/f9d2fe983fe196373850c55acd3dc3824add480e The one that precedes it, should pass compilation: https://chromium.googlesource.com/external/webrtc/+/54e1c6a500e390e543bce7b78fae65eb9bb14ab6
,
May 27 2016
It's worth noting also that we have seen this error happen intermittently with other changes too (not just the CL we're discussing here) and then not happen on a subsequent build. E.g.: https://build.chromium.org/p/client.webrtc/builders/Win32%20Debug%20%28Clang%29/builds/1769
,
May 27 2016
Also seeing similar things (different error code this time) on trybots occasionally. Hard to tell if there's actually a disk space problem or not. Still, we're only seeing this on clang Win bots: https://build.chromium.org/p/tryserver.chromium.win/builders/win_clang/builds/28619/steps/compile%20%28with%20patch%29/logs/stdio

[15403/27062] LINK gl_unittests.exe
FAILED: gl_unittests.exe
E:/b/depot_tools/python276_bin/python.exe gyp-win-tool link-wrapper environment.x64 False link.exe /nologo /OUT:gl_unittests.exe /PDB:gl_unittests.exe.pdb @gl_unittests.exe.rsp
LINK : gl_unittests.exe not found or not built by the last incremental link; performing full link
LINK : fatal error LNK1201: error writing to program database 'E:\b\build\slave\win_clang\build\src\out\Debug_x64\gl_unittests.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege
,
Jun 23 2016
,
Jun 23 2016
Issue 622515 has been merged into this issue.
,
Oct 25 2016
I haven't seen this problem in a long time (probably since clang started writing real codeview info). Has anyone else seen this recently?
,
Oct 25 2016
Nope...
,
Oct 25 2016
Cool. Shout if you see it again :-)
,
Oct 28 2016
I saw this again today: https://build.chromium.org/p/tryserver.webrtc/builders/win_clang_dbg/builds/7719/steps/compile/logs/stdio obj/webrtc/video/video.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x747FF49E (the disk on the bot is not full). WebRTC is currently at chromium revision 04e7c673d985877c240d0c2791c33e9aeeace7f8 (which is ~2 days old).
,
Oct 28 2016
,
Oct 28 2016
Thanks! If someone sees this on non webrtc bots, please mention that too.
,
Nov 14 2016
We're seeing this more and more frequently: https://build.chromium.org/p/client.webrtc/builders/Win64%20Debug/builds/9720/steps/compile%20with%20ninja/logs/stdio https://build.chromium.org/p/client.webrtc/builders/Win64%20Debug/builds/9717/steps/compile%20with%20ninja/logs/stdio https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/3118/steps/compile%20with%20ninja/logs/stdio https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/3109/steps/compile%20with%20ninja/logs/stdio What can we do?
,
Nov 14 2016
https://build.chromium.org/p/client.webrtc/builders/Win64%20Debug/builds/9720/steps/generate_build_files/logs/stdio => that bot doesn't use clang-cl, does it? https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/3118/steps/generate_build_files/logs/stdio => this one doesn't either. It sounds like you're just running into a regular link.exe bug that has nothing to do with clang/win, is that correct? I don't remember ever seeing this on the Chromium bots, so I'd probably start by comparing how your bots are different. Are you on the same MSVS version as Chromium?
,
Nov 14 2016
Re #28: You're right, these are non-Clang bots. I didn't pay attention to this and just assumed it was the same bug as we've seen before, since the error message was exactly the same (except for a different "cannot seek to" address: 0x5045D0D4). Our bots should be identical to Chromium's and we use the exact same Visual Studio toolchain (the 2015 redistributable). All the errors come from obj/webrtc/modules/rtp_rtcp/rtp_rtcp.lib, so I wonder if a recent change could be related to this popping up again. It seems to have started today, Nov 14, and the change log is here: https://chromium.googlesource.com/external/webrtc/+log/master/webrtc/modules/rtp_rtcp brandtr: any idea, since you've landed a few rtp_rtcp CLs today? Did you see the same errors on the trybots?
,
Nov 14 2016
No idea how/if this problem could be related to the rtp_rtcp code, sorry. I did see the same error on the trybots, for example here: https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/3118/steps/compile%20with%20ninja/logs/stdio https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/3116/steps/compile%20with%20ninja/logs/stdio When running the bots a second time, everything worked out though. I also got another temporary error, this time on the clang bot: https://build.chromium.org/p/tryserver.webrtc/builders/win_clang_dbg/builds/8236/steps/compile%20with%20ninja/logs/stdio Don't know if this could be related.
,
Nov 28 2016
Got a similar error today: FAILED: audio_decoder_unittests.exe audio_decoder_unittests.exe.pdb E:/b/depot_tools/python276_bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./audio_decoder_unittests.exe /PDB:./audio_decoder_unittests.exe.pdb @./audio_decoder_unittests.exe.rsp webrtc_common.lib(config.obj) : fatal error LNK1235: corrupt or invalid COFF symbol table https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_clang_dbg/builds/7642/steps/compile/logs/stdio
,
Dec 5 2016
I think that these errors are sometimes caused by .lib files that have grown too large. There is either a 2 GB or 4 GB limit. I would check the size of webrtc_common.lib. A common solution is to use split_static_library as the target type in order to easily split the library into multiple parts. However, webrtc_common.lib looks too small for that to be a realistic explanation, so maybe try a clobber build?
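For anyone who wants to rule out the size theory quickly, here is a rough Python sketch that lists .lib files at or above a threshold. The function names and the out/Debug path in the usage note are made up for illustration, and the 2 GiB default is only the lower of the two limits guessed at above, not a confirmed linker constant.

```python
from pathlib import Path

GIB = 1024 ** 3


def flag_large_libs(sizes, threshold=2 * GIB):
    """Return (name, size) pairs at or above threshold, largest first.

    `sizes` maps a library name to its size in bytes. The default threshold
    is an assumption taken from the "2 GB or 4 GB" guess in this thread.
    """
    flagged = [(name, size) for name, size in sizes.items() if size >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


def collect_lib_sizes(out_dir):
    """Map every .lib file under out_dir to its size in bytes."""
    return {str(path): path.stat().st_size for path in Path(out_dir).rglob("*.lib")}
```

Something like `flag_large_libs(collect_lib_sizes("out/Debug"))` would then return an empty list if no library is anywhere near the limit, which is what the sizes quoted in this thread suggest.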
,
Dec 6 2016
I landed a landmine and it seems to have solved the problems for us this time, so I guess that's what we'll have to do when this resurfaces (until the bug is eventually found and fixed)
,
Apr 24 2017
,
Jul 31 2017
Are you still seeing this?
,
Aug 14 2017
I think I've seen it but I'm not sure. Is it possible to query LogDog logs for free text searches??
,
Aug 14 2017
We just saw the same problem on gpu.lib in this run: https://build.chromium.org/p/chromium.webkit/builders/WebKit%20Win%20x64%20Builder%20%28dbg%29/builds/112004 but it disappeared immediately. Not sure what caused it. [1443/6991] LINK(DLL) gpu.dll gpu.dll.lib gpu.dll.pdb FAILED: gpu.dll gpu.dll.lib gpu.dll.pdb C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /IMPLIB:./gpu.dll.lib /DLL /OUT:./gpu.dll /PDB:./gpu.dll.pdb @./gpu.dll.rsp obj/third_party/angle/translator.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x2446A1D0
,
Aug 14 2017
,
Sep 19 2017
https://build.chromium.org/p/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/58328 just failed with this on the main waterfall. See https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.win%2FWin_x64_Builder__dbg_%2F58328%2F%2B%2Frecipes%2Fsteps%2Fcompile%2F0%2Fstdout for the logs.
,
Sep 19 2017
That bot doesn't use clang, that's not this bug. (I'd guess it's related to the 2017 revert.)
,
Oct 3 2017
Similar to #39 I saw a compile step fail on the main waterfall's 'Win x64 Builder (dbg)' bot today, as part of my sheriff shift: https://build.chromium.org/p/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/58781 The error is: FAILED: metrics_unittests.exe metrics_unittests.exe.pdb C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./metrics_unittests.exe /PDB:./metrics_unittests.exe.pdb @./metrics_unittests.exe.rsp LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\metrics_unittests.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege and AFAIK Win Clang is not yet enabled by default (according to bug 82385 ). Should we rename this bug's title to something like: "Compile fails with fatal error LNK1201: error writing to program database" ?
,
Oct 3 2017
Maybe we should use a different bug. The original error was this:

obj\webrtc\modules\neteq.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x78654F5A

The most recent errors are:

LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\metrics_unittests.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege

LNK1201 (PDB files) is unrelated to LNK1106 (.lib files). LNK1201 may be caused by PDBs being too large so in that sense it is similar to LNK1106 (.lib files being too large). I think LNK1201 might also be caused by random reasons. I'm not sure.
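Since the two error codes keep getting mixed together in this bug, a small Python sketch for tallying them from a saved compile log might help when triaging. It only assumes the "fatal error LNKnnnn" wording that appears in the logs pasted in this thread; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the "fatal error LNK1106" / "fatal error LNK1201" wording seen in
# the logs quoted in this thread.
LNK_ERROR_RE = re.compile(r"fatal error (LNK\d{4})")


def count_link_errors(log_text):
    """Return a dict mapping each LNK error code to how often it occurs."""
    return dict(Counter(LNK_ERROR_RE.findall(log_text)))
```

Running this over the stdio of a failed compile step would show at a glance whether a failure is the .lib flavor (LNK1106) or the PDB flavor (LNK1201).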
,
Oct 12 2017
The same error (or at least very similar) just occurred again: https://build.chromium.org/p/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/59089 Log segment: FAILED: rappor_unittests.exe rappor_unittests.exe.pdb C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./rappor_unittests.exe /PDB:./rappor_unittests.exe.pdb @./rappor_unittests.exe.rsp LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\rappor_unittests.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege
,
Oct 12 2017
Interesting, the "Win x64 Builder" was green immediately after BUT there is another compile failure on the "Webkit Win x64 Builder": https://build.chromium.org/p/chromium.webkit/builders/WebKit%20Win%20x64%20Builder%20%28dbg%29/builds/114069 The only common CL is from the depot-tools-roller: https://chromium.googlesource.com/chromium/src/+/4b7d4b1d8b792ecec516222e2ac1babe4499b7b0 For completeness, the log segment: FAILED: content.dll content.dll.lib content.dll.pdb C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /IMPLIB:./content.dll.lib /DLL /OUT:./content.dll /PDB:./content.dll.pdb @./content.dll.rsp LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\win_layout\src\out\Debug_x64\content.dll.pdb'; check for insufficient disk space, invalid path, or insufficient privilege
,
Oct 12 2017
This happened again on https://uberchromegw.corp.google.com/i/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/59103 FAILED: test_ime_driver.service.exe test_ime_driver.service.exe.pdb C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./test_ime_driver.service.exe /PDB:./test_ime_driver.service.exe.pdb @./test_ime_driver.service.exe.rsp LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\test_ime_driver.service.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege
,
Oct 12 2017
Happened again on https://build.chromium.org/p/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/59107 FAILED: sync_client.exe sync_client.exe.pdb C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./sync_client.exe /PDB:./sync_client.exe.pdb @./sync_client.exe.rsp LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\sync_client.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege
,
Apr 6 2018
Today, on WebRTC trybots we hit a couple of LNK errors:

LNK1106:
FAILED: tools_unittests.exe tools_unittests.exe.pdb
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./tools_unittests.exe /PDB:./tools_unittests.exe.pdb @./tools_unittests.exe.rsp
obj/rtc_base/rtc_base_generic.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x61FA43B5
[559/568] LINK rtc_stats_unittests.exe rtc_stats_unittests.exe.pdb
FAILED: rtc_stats_unittests.exe rtc_stats_unittests.exe.pdb

LNK1107:
FAILED: rtc_stats_unittests.exe rtc_stats_unittests.exe.pdb
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./rtc_stats_unittests.exe /PDB:./rtc_stats_unittests.exe.pdb @./rtc_stats_unittests.exe.rsp
obj/rtc_base/rtc_base_generic.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0xB4254A1
,
Apr 6 2018
Do you have a link to a bot?
,
Apr 6 2018
Yes, I forgot to post them. LNK1106: https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/21797. LNK1107: https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/21795
,
Apr 6 2018
https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.webrtc%2Fwin_x64_dbg%2F21797%2F%2B%2Frecipes%2Fsteps%2Fgenerate_build_files%2F0%2Fstdout shows:

is_clang = false

This strongly suggests this is unrelated to clang :-) Also, https://codereview.webrtc.org/2018553002/ got reverted. So retitling, and removing myself as owner (since it's not a clang thing). We don't see this on the chromium bots as far as I know, so it's some link.exe behavior that's tickled by something webrtc-specific. If it's a big problem, you could give `use_lld = true` a try to swap out the linker...
,
Apr 6 2018
Right, I quickly searched for LNK1106 and didn't notice this bug was related to clang. I am not sure we want to use lld if is_clang is False.
,
Apr 6 2018
A similar (the same?) bug was reported last year in crbug.com/691747. I think the seek amount (78654F5A) in the original report from 2016 is suspicious. That is presumably supposed to be a file offset and that offset is almost 2 GB. That library is normally just a few MB. So, it sounds like that library is corrupt and contains a reference to a record that is far beyond its bounds. Similarly, the most recent occurrence in rtc_base_generic.lib is for an offset of 61FA43B5 which is about 1.5 GB, and that library is just a few MB in size. So, yeah, it sounds like a linker bug that can be triggered by a particular build setup. The original report suggested that it was reproducible for a while - is this one? If it's reproducible then I can file a bug against Microsoft's linker, but the long-term fix is almost certainly going to be switching to lld.
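To make the offset arithmetic above concrete, here is a minimal Python sketch that converts the reported "cannot seek to" values into GiB and compares them against a library size. The 5 MiB size is a stand-in for "a few MB" and is purely an assumption; the helper names are hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def parse_offset(text):
    """Parse a hex offset such as '0x78654F5A' or '78654F5A' into an int."""
    return int(text, 16)


def offset_in_gib(text):
    """Return the seek offset expressed in GiB."""
    return parse_offset(text) / GIB


def is_beyond_file(text, file_size_bytes):
    """True if the seek offset points at or past the end of the file."""
    return parse_offset(text) >= file_size_bytes


# Assumed library size, standing in for "just a few MB" (not measured).
ASSUMED_LIB_SIZE = 5 * MIB

for reported in ("0x78654F5A", "0x61FA43B5"):
    print(reported, f"{offset_in_gib(reported):.2f} GiB",
          "out of bounds" if is_beyond_file(reported, ASSUMED_LIB_SIZE) else "in bounds")
```

Both offsets quoted in this bug land well past the end of a library of that size, which fits the theory that the linker is following a bogus reference rather than running out of disk.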
,
Apr 9 2018
It happened 3 times in 30 builds on Friday and then I cannot see it anymore (70 builds without problems). If it becomes reproducible I'll post some information here.
,
Jan 10
Downgrading P2s that haven't been modified in more than 6 months, which have no component or owner.
,
Jan 11
Available, but no owner or component? Please find a component, as no one will ever find this without one.
Comment 1 by thakis@chromium.org
, May 26 2016