
Issue 615050

Starred by 3 users

Issue metadata

Status: Untriaged
Owner: ----
Cc:
EstimatedDays: ----
NextAction: 2019-07-09
OS: ----
Pri: 3
Type: Bug



link.exe occasionally says "invalid file or disk full" on webrtc bots

Project Member Reported by tommi@chromium.org, May 26 2016

Issue description

More details in issue 614967.  Here's the gist:

This Windows Clang builder https://build.chromium.org/p/client.webrtc/builders/Win32%20Debug%20%28Clang%29/builds/1748 fails with the following error when trying to build the audio_e2e_harness target in WebRTC:

FAILED: audio_e2e_harness.exe audio_e2e_harness.exe.pdb 
E:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-with-manifests environment.x86 True audio_e2e_harness.exe "E:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /OUT:audio_e2e_harness.exe @audio_e2e_harness.exe.rsp" 1 mt.exe rc.exe "obj\webrtc\tools\audio_e2e_harness.audio_e2e_harness.exe.intermediate.manifest" obj\webrtc\tools\audio_e2e_harness.audio_e2e_harness.exe.generated.manifest ..\..\build\win\compatibility.manifest
obj\webrtc\modules\neteq.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x78654F5A


The disk isn't full.

The way to reproduce:

1) Make sure you have the same config as this bot:
  https://build.chromium.org/p/client.webrtc/builders/Win32%20Debug%20%28Clang%29
2) Checkout webrtc.  https://webrtc.org/native-code/development/
3) Apply this patch (which we cannot land now because of this issue):
  https://codereview.webrtc.org/2018553002/
4) Build
  ninja.exe -w dupbuild=err -C out\Debug All -j80
 

Comment 1 by thakis@chromium.org, May 26 2016

Summary: link.exe says "invalid file or disk full" on webrtc clang/win bot with https://codereview.webrtc.org/2018553002/ applied (was: Clang crashes when linking)

Comment 2 by thakis@chromium.org, May 26 2016

Blocking: 82385

Comment 3 by thakis@chromium.org, May 26 2016

`gclient sync` has been printing "still working on: src" for 2h now. Is that expected?

Comment 4 by tommi@chromium.org, May 26 2016

kjellander@ can better answer that, but I guess it's downloading all of Chrome and then some :/ If git is still downloading, then I guess it is expected.

Comment 5 by thakis@chromium.org, May 26 2016

It eventually finished.

3) should be "apply https://codereview.webrtc.org/2014183003/" (or "revert $your_url")

Comment 6 by thakis@chromium.org, May 26 2016

I can't repro this on my box. I'm using

C:\src\webrtc\src>echo %GYP_DEFINES%
clang=1 component=shared_library dcheck_always_on=1 ffmpeg_branding=Chrome rtc_use_h264=1 target_arch=ia32

C:\src\webrtc\src>ninja -C out\Debug All
ninja: Entering directory `out\Debug'
[415/415] STAMP obj\All.actions_depends.stamp

Comment 7 by thakis@chromium.org, May 26 2016

https://bugs.chromium.org/p/chromium/issues/detail?id=599186#c19 feels somewhat similar -- are your libraries close to 2GB? If so, adding 'msvs_shard' to that target's gyp file will probably fix this (and you'd run into this with non-clang windows builds soon as well)

Comment 8 by tommi@chromium.org, May 26 2016

The biggest binaries are between 100 and 200 MB in debug builds. The ones that failed are on the order of tens of MB. Is there a chance that this could be goma related?

Comment 9 by thakis@chromium.org, May 26 2016

No, linking doesn't run on goma.

Comment 10 by tommi@chromium.org, May 26 2016

And no chance that the obj files from goma could be bad?
It's possible in theory, but it seems very unlikely to me. Goma just runs clang-cl under wine and sends back the result. I've never seen goma sending back incorrect obj files before.

Comment 12 by tommi@chromium.org, May 26 2016

I tried to repro this as well, but I'm unable to.
I prepared yet another reland of the cl and sent it to all the Win Clang bots.  They're all green: https://codereview.webrtc.org/2014973002/

It seems like the issue must somehow be local to that bot or bots.

Comment 13 by tommi@chromium.org, May 26 2016

This seems to be related to EXPECT_DEATH tests in gtest.  I landed most of the changes without problems but when it came to the push_resampler_unittest.cc file, things started to break.

Comment 14 by tommi@chromium.org, May 26 2016

Also, in the x64 case, moving DCHECKs out into a non-templatized function made the build pass.
Do you have a link to a CL with just the EXPECT_DEATH bit that causes the failure, and another CL that makes things go on top of that?

Comment 17 by tommi@chromium.org, May 27 2016

It's worth noting also that we have seen this error happen intermittently with other changes too (not just the CL we're discussing here) and then not happen on a subsequent build.  E.g.:

https://build.chromium.org/p/client.webrtc/builders/Win32%20Debug%20%28Clang%29/builds/1769

Comment 18 by tommi@chromium.org, May 27 2016

Also seeing similar things (different error code this time) on trybots occasionally.  Hard to tell if there's actually a disk space problem or not. Still, we're only seeing this on clang Win bots:

https://build.chromium.org/p/tryserver.chromium.win/builders/win_clang/builds/28619/steps/compile%20%28with%20patch%29/logs/stdio

[15403/27062] LINK gl_unittests.exe
FAILED: gl_unittests.exe 
E:/b/depot_tools/python276_bin/python.exe gyp-win-tool link-wrapper environment.x64 False link.exe /nologo /OUT:gl_unittests.exe /PDB:gl_unittests.exe.pdb @gl_unittests.exe.rsp
LINK : gl_unittests.exe not found or not built by the last incremental link; performing full link

LINK : fatal error LNK1201: error writing to program database 'E:\b\build\slave\win_clang\build\src\out\Debug_x64\gl_unittests.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege

Summary: link.exe occasionally says "invalid file or disk full" on clang/win bots with https://codereview.webrtc.org/2018553002/ applied (was: link.exe says "invalid file or disk full" on webrtc clang/win bot with https://codereview.webrtc.org/2018553002/ applied)
Issue 622515 has been merged into this issue.
I haven't seen this problem in a long time (probably since clang started writing real codeview info).

Has anyone else seen this recently?

Comment 22 by kbr@chromium.org, Oct 25 2016

Nope...

Status: WontFix (was: Assigned)
Cool. Shout if you see it again :-)
I saw this again today: https://build.chromium.org/p/tryserver.webrtc/builders/win_clang_dbg/builds/7719/steps/compile/logs/stdio
obj/webrtc/video/video.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x747FF49E
(the disk is not full at the bot).

WebRTC is currently at chromium revision 04e7c673d985877c240d0c2791c33e9aeeace7f8 (which is ~2 days old).
Status: Availablr (was: WontFix)
Status: Available (was: Availablr)
Summary: link.exe occasionally says "invalid file or disk full" on webrtc clang/win bots with https://codereview.webrtc.org/2018553002/ applied (was: link.exe occasionally says "invalid file or disk full" on clang/win bots with https://codereview.webrtc.org/2018553002/ applied)
Thanks! If someone sees this on non webrtc bots, please mention that too.
Cc: brucedaw...@chromium.org
https://build.chromium.org/p/client.webrtc/builders/Win64%20Debug/builds/9720/steps/generate_build_files/logs/stdio => that bot doesn't use clang-cl, does it?

https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/3118/steps/generate_build_files/logs/stdio => this one doesn't either.

It sounds like you're just running into a regular link.exe bug that has nothing to do with clang/win, is that correct?

I don't remember seeing this ever on the Chromium bots, so I'd probably start by comparing how your bots are different. Are you on the same MSVS version as Chromium?
Cc: brandtr@chromium.org
Re #28: You're right, these are non-Clang bots. I didn't pay attention to this and just assumed it was the same bug as we've seen before, since the error message was exactly the same (except for a different "cannot seek to" address: 0x5045D0D4).

Our bots should be identical to Chromium's and we use the exact same Visual Studio toolchain (the 2015 redistributable).

All the errors come from obj/webrtc/modules/rtp_rtcp/rtp_rtcp.lib, so I wonder if a recent change could be related to this popping up again. It seems to have started today, Nov 14, and the change log is here:
https://chromium.googlesource.com/external/webrtc/+log/master/webrtc/modules/rtp_rtcp

brandtr: any idea, since you've landed a few rtp_rtcp CLs today? Did you see the same errors on the trybots?
No idea how/if this problem could be related to the rtp_rtcp code, sorry.

I did see the same error on the trybots, for example here:
https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/3118/steps/compile%20with%20ninja/logs/stdio
https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_dbg/builds/3116/steps/compile%20with%20ninja/logs/stdio
When running the bots a second time, everything worked out though.

I also got another temporary error, this time on the clang bot:
https://build.chromium.org/p/tryserver.webrtc/builders/win_clang_dbg/builds/8236/steps/compile%20with%20ninja/logs/stdio
Don't know if this could be related.
Got a similar error today:

FAILED: audio_decoder_unittests.exe audio_decoder_unittests.exe.pdb 
E:/b/depot_tools/python276_bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./audio_decoder_unittests.exe /PDB:./audio_decoder_unittests.exe.pdb @./audio_decoder_unittests.exe.rsp
webrtc_common.lib(config.obj) : fatal error LNK1235: corrupt or invalid COFF symbol table

https://build.chromium.org/p/tryserver.webrtc/builders/win_x64_clang_dbg/builds/7642/steps/compile/logs/stdio


I think that these errors are sometimes caused by .lib files that have grown too large. There is either a 2 GB or 4 GB limit. I would check the size of webrtc_common.lib. A common solution is to use split_static_library as the target type in order to easily split the library into multiple parts.

However webrtc_common.lib looks too small for that to be a realistic explanation, so maybe try a clobber build?
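
The size check suggested above can be scripted. This is a minimal sketch, assuming a local build output directory. The function name `find_large_libs` and the default threshold are illustrative, and since the comment above is unsure whether the limit is 2 GB or 4 GB, the lower value is used as a conservative default:

```python
import os

# Assumed thresholds: the thread mentions "either a 2 GB or 4 GB limit".
LIMIT_2_GIB = 2 * 1024**3
LIMIT_4_GIB = 4 * 1024**3


def find_large_libs(root, threshold=LIMIT_2_GIB):
    """Return (path, size_in_bytes) pairs for .lib files under `root`
    whose size is at least `threshold`, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".lib"):
                path = os.path.join(dirpath, name)
                size = os.path.getsize(path)
                if size >= threshold:
                    hits.append((path, size))
    # Sort by size so the most suspicious library comes first.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling `find_large_libs(r"out\Debug", threshold=LIMIT_2_GIB // 2)` would list libraries that are at least halfway to the lower limit. An empty result for the failing library points away from the size explanation, which matches the observation above that webrtc_common.lib looks too small.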
I landed a landmine and it seems to have solved the problems for us this time, so I guess that's what we'll have to do when this resurfaces (until the bug is eventually found and fixed)
Cc: mbonadei@chromium.org
Are you still seeing this?
I think I've seen it but I'm not sure. Is it possible to query LogDog logs for free text searches?
We just saw the same problem on gpu.lib in this run: https://build.chromium.org/p/chromium.webkit/builders/WebKit%20Win%20x64%20Builder%20%28dbg%29/builds/112004

but it disappeared immediately. Not sure what caused it.

[1443/6991] LINK(DLL) gpu.dll gpu.dll.lib gpu.dll.pdb
FAILED: gpu.dll gpu.dll.lib gpu.dll.pdb 
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /IMPLIB:./gpu.dll.lib /DLL /OUT:./gpu.dll /PDB:./gpu.dll.pdb @./gpu.dll.rsp
obj/third_party/angle/translator.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x2446A1D0
Cc: cwallez@chromium.org
That bot doesn't use clang, that's not this bug.

(I'd guess it's related to the 2017 revert.)
Similar to #39 I saw a compile step fail on the main waterfall's 'Win x64 Builder (dbg)' bot today, as part of my sheriff shift:
https://build.chromium.org/p/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/58781

The error is:
FAILED: metrics_unittests.exe metrics_unittests.exe.pdb 
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./metrics_unittests.exe /PDB:./metrics_unittests.exe.pdb @./metrics_unittests.exe.rsp
LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\metrics_unittests.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege

and AFAIK Win Clang is not yet enabled by default (according to bug 82385).

Should we rename this bug's title to something like:
"Compile fails with fatal error LNK1201: error writing to program database"
?
Maybe we should use a different bug. The original error was this:

obj\webrtc\modules\neteq.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x78654F5A

The most recent errors are:

LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\metrics_unittests.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege

LNK1201 (PDB files) is unrelated to LNK1106 (.lib files). LNK1201 may be caused by PDBs being too large so in that sense it is similar to LNK1106 (.lib files being too large). I think LNK1201 might also be caused by random reasons. I'm not sure.
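
To keep the two failure families apart when skimming saved bot logs, the linker error codes can be tallied. This is a minimal sketch, assuming the log is available as plain text. The names `tally_link_errors`, `describe`, and `KNOWN_CODES` are illustrative, and the descriptions are paraphrased from the messages quoted in this thread rather than taken from linker documentation:

```python
import re
from collections import Counter

# Matches e.g. "fatal error LNK1106: ..." and captures the code.
LNK_ERROR_RE = re.compile(r"fatal error (LNK\d{4})")

# Descriptions paraphrased from the log excerpts in this thread.
KNOWN_CODES = {
    "LNK1106": "invalid file or disk full (cannot seek in an input file)",
    "LNK1107": "invalid or corrupt file (cannot read an input file)",
    "LNK1201": "error writing to program database (.pdb)",
    "LNK1235": "corrupt or invalid COFF symbol table",
}


def tally_link_errors(log_text):
    """Count how often each LNKxxxx fatal error code appears in a log."""
    return Counter(LNK_ERROR_RE.findall(log_text))


def describe(code):
    """Return a short description for a code, or 'unknown' if not listed."""
    return KNOWN_CODES.get(code, "unknown")
```

Run over a set of failing logs, a tally dominated by LNK1201 would point at PDB writes, while LNK1106 or LNK1107 would point at the input .lib files.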

The same error (or at least very similar) just occurred again: https://build.chromium.org/p/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/59089

Log segment:
FAILED: rappor_unittests.exe rappor_unittests.exe.pdb 
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./rappor_unittests.exe /PDB:./rappor_unittests.exe.pdb @./rappor_unittests.exe.rsp
LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\rappor_unittests.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege

Interesting, the "Win x64 Builder" was green immediately after BUT there is another compile failure on the "Webkit Win x64 Builder":
https://build.chromium.org/p/chromium.webkit/builders/WebKit%20Win%20x64%20Builder%20%28dbg%29/builds/114069

The only common CL is from the depot-tools-roller:
https://chromium.googlesource.com/chromium/src/+/4b7d4b1d8b792ecec516222e2ac1babe4499b7b0

For completeness the Log segment:
FAILED: content.dll content.dll.lib content.dll.pdb 
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /IMPLIB:./content.dll.lib /DLL /OUT:./content.dll /PDB:./content.dll.pdb @./content.dll.rsp
LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\win_layout\src\out\Debug_x64\content.dll.pdb'; check for insufficient disk space, invalid path, or insufficient privilege

Comment 45 by mcnee@chromium.org, Oct 12 2017

This happened again on https://uberchromegw.corp.google.com/i/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/59103

FAILED: test_ime_driver.service.exe test_ime_driver.service.exe.pdb 
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./test_ime_driver.service.exe /PDB:./test_ime_driver.service.exe.pdb @./test_ime_driver.service.exe.rsp
LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\test_ime_driver.service.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege

Happened again on https://build.chromium.org/p/chromium.win/builders/Win%20x64%20Builder%20%28dbg%29/builds/59107

FAILED: sync_client.exe sync_client.exe.pdb 
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./sync_client.exe /PDB:./sync_client.exe.pdb @./sync_client.exe.rsp
LINK : fatal error LNK1201: error writing to program database 'C:\b\c\b\Win_x64_Builder__dbg_\src\out\Debug_x64\sync_client.exe.pdb'; check for insufficient disk space, invalid path, or insufficient privilege

Today, on WebRTC trybots we hit a couple of LNK errors:

LNK1106:
FAILED: tools_unittests.exe tools_unittests.exe.pdb 
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./tools_unittests.exe /PDB:./tools_unittests.exe.pdb @./tools_unittests.exe.rsp
obj/rtc_base/rtc_base_generic.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x61FA43B5
[559/568] LINK rtc_stats_unittests.exe rtc_stats_unittests.exe.pdb
FAILED: rtc_stats_unittests.exe rtc_stats_unittests.exe.pdb

LNK1107:
FAILED: rtc_stats_unittests.exe rtc_stats_unittests.exe.pdb 
C:/b/depot_tools/win_tools-2_7_6_bin/python/bin/python.exe ../../build/toolchain/win/tool_wrapper.py link-wrapper environment.x64 False link.exe /nologo /OUT:./rtc_stats_unittests.exe /PDB:./rtc_stats_unittests.exe.pdb @./rtc_stats_unittests.exe.rsp
obj/rtc_base/rtc_base_generic.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0xB4254A1
Do you have a link to a bot?
Blocking: -82385
Owner: ----
Summary: link.exe occasionally says "invalid file or disk full" on webrtc bots (was: link.exe occasionally says "invalid file or disk full" on webrtc clang/win bots with https://codereview.webrtc.org/2018553002/ applied)
https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.webrtc%2Fwin_x64_dbg%2F21797%2F%2B%2Frecipes%2Fsteps%2Fgenerate_build_files%2F0%2Fstdout

is_clang = false

This strongly suggests this is unrelated to clang :-) Also https://codereview.webrtc.org/2018553002/ got reverted. So retitling, and removing myself as owner (since not a clang thing).

We don't see this on the chromium bots as far as I know, so it's some link.exe behavior that's tickled by something webrtc-specific.

If it's a big problem, you could give `use_lld = true` a try to swap out the linker...
Cc: phoglund@chromium.org
Right, I quickly searched for LNK1106 and hadn't noticed that this bug was related to clang.

I am not sure we want to use lld if is_clang is False.
A similar (the same?) bug was reported last year in crbug.com/691747.

I think the seek amount (78654F5A) in the original report from 2016 is suspicious. That is presumably supposed to be a file offset and that offset is almost 2 GB. That library is normally just a few MB. So, it sounds like that library is corrupt and contains a reference to a record that is far beyond its bounds.

Similarly, the most recent occurrence in rtc_base_generic.lib is for an offset of 61FA43B5 which is about 1.5 GB, and that library is just a few MB in size.
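
The arithmetic behind those two estimates can be checked with a short script. This is a sketch only: the helper names are illustrative, and the 5 MiB library size is a hypothetical stand-in for "a few MB", not a measured value:

```python
import re

# Extracts the library path and the hex offset from an LNK1106 line.
SEEK_RE = re.compile(
    r"^(?P<file>.+?) : fatal error LNK1106: .*cannot seek to 0x(?P<offset>[0-9A-Fa-f]+)"
)


def parse_seek_error(line):
    """Return (library_path, offset_in_bytes), or None if the line is not
    an LNK1106 'cannot seek' error."""
    match = SEEK_RE.search(line)
    if match is None:
        return None
    return match.group("file"), int(match.group("offset"), 16)


def offset_in_gib(offset):
    """Convert a byte offset to GiB (1 GiB = 2**30 bytes)."""
    return offset / 1024**3


def is_out_of_bounds(offset, lib_size_bytes):
    """True if the offset lies at or beyond the end of the library file."""
    return offset >= lib_size_bytes


ASSUMED_LIB_SIZE = 5 * 1024**2  # hypothetical: "a few MB"

for offset in (0x78654F5A, 0x61FA43B5):
    print(hex(offset), round(offset_in_gib(offset), 2),
          is_out_of_bounds(offset, ASSUMED_LIB_SIZE))
```

Both offsets fall far past the end of any library of a few MB, which is what makes a corrupt offset record more plausible than a genuinely oversized file.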

So, yeah, it sounds like a linker bug that can be triggered by a particular build setup. The original report suggested that it was reproducible for a while - is this one? If it's reproducible then I can file a bug against Microsoft's linker, but the long-term fix is almost certainly going to be switching to lld.

It happened 3 times in 30 builds on Friday and then I cannot see it anymore (70 builds without problems).

If it becomes reproducible I'll post some information here.
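
As a rough sanity check on those numbers: if builds failed independently at the rate seen on Friday (3 in 30), a run of 70 clean builds would be very unlikely. Both the independence and the constant rate are assumptions made here for illustration, not something the bots are known to satisfy:

```python
def prob_all_clean(failure_rate, builds):
    """Probability that `builds` consecutive builds all succeed, assuming
    each build fails independently with probability `failure_rate`."""
    return (1.0 - failure_rate) ** builds


observed_rate = 3 / 30  # failures seen on Friday
p = prob_all_clean(observed_rate, 70)
print(f"P(70 clean builds at {observed_rate:.0%} failure rate) = {p:.5f}")
```

Under these assumptions the probability is well below 1%, which suggests the failure rate is not constant and that something about the bot state or the tree at the time triggered the errors.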
Labels: Pri-3
NextAction: 2019-07-09
Downgrading P2s that haven't been modified in more than 6 months, which have no component or owner.
Status: Untriaged (was: Available)
Available, but no owner or component? Please find a component, as no one will ever find this without one.