
Issue 599186 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug

Blocking:
issue 440500
issue 596231




Compiler Failure on Google Chrome Win

Project Member Reported by rdevlin....@chromium.org, Mar 30 2016

Issue description

Full log: https://build.chromium.org/p/chromium.chrome/builders/Google%20Chrome%20Win/builds/5653/steps/compile/logs/stdio

Snippet below for posterity.

....
[20654/22094] CXX obj\chrome\browser\ui\webui\browser_ui_2.chrome_web_ui_controller_factory.obj
FAILED: C:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-with-manifests environment.x86 True chrome_child.dll "C:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:chrome_child.dll.lib /DLL /OUT:chrome_child.dll @chrome_child.dll.rsp" 2 mt.exe rc.exe "obj\chrome\chrome_child_dll.chrome_child.dll.intermediate.manifest" obj\chrome\chrome_child_dll.chrome_child.dll.generated.manifest
obj\third_party\WebKit\Source\core\webcore_dom.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x62BBDB03

Final: Total time = 8.359s

Traceback (most recent call last):
  File "gyp-win-tool", line 315, in <module>
    sys.exit(main(sys.argv[1:]))
  File "gyp-win-tool", line 29, in main
    exit_code = executor.Dispatch(args)
  File "gyp-win-tool", line 71, in Dispatch
    return getattr(self, method)(*args[1:])
  File "gyp-win-tool", line 171, in ExecLinkWithManifests
    subprocess.check_call(ldcmd + add_to_ld)
  File "C:\b\depot_tools\python276_bin\lib\subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'C:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:chrome_child.dll.lib /DLL /OUT:chrome_child.dll @chrome_child.dll.rsp chrome_child.dll.manifest.res' returned non-zero exit status 1107

ninja: build stopped: subcommand failed.
 

Comment 1 by thakis@chromium.org, Mar 30 2016

(Smells like another 2015 thing to me, which is why I asked rdevlin to mark this blocking for 440500)

Comment 2 by thakis@chromium.org, Mar 30 2016

Since it happened twice in a row (https://build.chromium.org/p/chromium.chrome/builders/Google%20Chrome%20Win) it's probably some code change and not a 2015 flake
Has anyone reproduced the failure locally? The fact that it happened on the reland this morning means it's not a 2015 flake, but does mean that it is a compiler/linker bug. I need to get on this quickly to figure out what is triggering it.

I'll start trying to reproduce it immediately.

Comment 4 by thakis@chromium.org, Mar 30 2016

grt's currently building but hasn't repro'd yet.

My money's on https://codereview.chromium.org/1814423002 being the cause (stuff gets too big) -- if it's that change, 2013 might have the same problem. (Still a linker bug, I suppose, but maybe not a regression)
Could it really be a toolchain flake? I.e. a (lib) file got corrupted and breaks the builds. I've seen this once before and doing a clobber fixed the issue. Should we trigger a clobber on this bot while you're investigating?

Comment 6 by thakis@chromium.org, Mar 30 2016

I'd like to try reverting wez's dcheck change if https://build.chromium.org/p/chromium.chrome/builders/Google%20Chrome%20Win/builds/5656 doesn't cycle green. If that doesn't help either, we can try clobbering.

…actually, I think the official builds already do clobber builds on each build. So clobbering won't help here.
Yeah, it seems that this builder always clobbers. Too bad.

Comment 8 by grt@chromium.org, Mar 30 2016

My local build without the DCHECK change will be done building shortly. I'll revert if that fixes it.

Comment 9 by grt@chromium.org, Mar 30 2016

Yes, building without r384011 is good.
Cc: w...@chromium.org
This went away after we reverted https://codereview.chromium.org/1850473002/

I don't know if you want to keep this open to track the VS bug. I'm guessing this would've failed with 2013 too.

Comment 11 by w...@chromium.org, Mar 30 2016

Cc: brucedaw...@chromium.org skobes@chromium.org shinyak@chromium.org
 Issue 598955  has been merged into this issue.

Comment 12 by w...@chromium.org, Mar 30 2016

Labels: OS-Windows
Thanks for reverting & for the extra context, Nico.

As noted offline, not able to repro this locally; Bruce, could this be a link concurrency issue?

Comment 13 by w...@chromium.org, Mar 30 2016

Blocking: 596231
I can reproduce it locally:

obj\third_party\WebKit\Source\core\webcore_dom.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x62884508

I suspect that the investigation could take a little while - sorry. The first step will be to investigate it without branding=Chrome or use_goma=1, so that Microsoft can reproduce it.
Oh, wait. I see the likely root cause. webcore_dom.lib is greater than 2 GiB. Dang.

2,214,885,906 webcore_dom.lib

That's 0x84047A12 bytes. I'm not sure what the size used to be, but probably less than 2 GiB.
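
The arithmetic above can be checked with a short sketch. It assumes the limit in question is the largest value a signed 32-bit offset can hold (2 GiB minus one byte); the thread only says "greater than 2 GiB", so the exact boundary is an assumption. The helper name is hypothetical.

```python
# Sketch only: the size comes from the thread; the signed 32-bit
# boundary is an assumption about where the linker limit sits.
TWO_GIB = 2**31  # 2,147,483,648 bytes


def exceeds_two_gib(size_bytes: int) -> bool:
    """Return True if the size cannot be held in a signed 32-bit offset."""
    return size_bytes > TWO_GIB - 1


failing_size = 2_214_885_906  # webcore_dom.lib in the failing build

print(hex(failing_size))              # 0x84047a12
print(exceeds_two_gib(failing_size))  # True
print(failing_size - TWO_GIB)         # 67402258 bytes over 2 GiB
```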
Should we use the msvs_shard gyp flag on webcore_dom then?
> Should we use the msvs_shard gyp flag on webcore_dom then?

Uh, yes?

I just did a test Release build of chrome_child.dll with GYP_DEFINES=<nothing> and it is 145,550,010 bytes. I'm sure that goma is the problem as it means that all of the debug information gets replicated in every .obj file.

It would be interesting to see how big webcore_dom.lib is in an official build without wez's patch. Probably close to the limit.

Comment 18 by w...@chromium.org, Mar 30 2016

Status: Started (was: Assigned)
The 5653 break was at 92333c0e5ca6b7768e5bcf6d41ff8fc733ebbfc1 so I tested from there - that's where I got the 2.214 GB webcore_dom.lib.

Immediately prior to wez's CL (at f6d2732a2b81694b3d7ee47affaa6a22ee405cf9, whereas wez's is bac26c8a840909a679a5a74557fa6f4f60ae9e07) the size of webcore_dom.lib was:

2,080,573,500 webcore_dom.lib

So, the growth from wez's change is only modest, it just happens to push the library over the edge.
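
A quick sketch of why a modest increase was enough, using only the two sizes quoted in this thread. The variable names are illustrative.

```python
# Sizes are the two measurements quoted in the thread.
TWO_GIB = 2**31

size_before = 2_080_573_500  # webcore_dom.lib just before wez's CL
size_after = 2_214_885_906   # webcore_dom.lib at the failing revision

growth = size_after - size_before
growth_pct = 100.0 * growth / size_before
headroom_before = TWO_GIB - size_before

print(f"growth: {growth:,} bytes ({growth_pct:.1f}%)")
print(f"headroom before the CL: {headroom_before:,} bytes")
print("growth exceeds headroom:", growth > headroom_before)
```

The library grew by roughly 6.5%, but it had only about 67 million bytes of room left, so the growth was about twice the remaining headroom.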

The next ones to watch for are these three:

1,089,473,494 cc.lib
1,129,512,300 skia_library.lib
1,323,174,742 webcore_generated.lib

None of them are close enough to make me feel like doing anything about them, especially since gn's use of source sets makes this problem go away. However I will do some more test builds to see if anything else is very close.
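
For anyone repeating this survey, a minimal sketch of a scan for large .lib files follows. The function names, the default threshold of 1 GiB, and the `out/Release` path are assumptions for illustration, not part of any Chromium tooling.

```python
import os

TWO_GIB = 2**31


def large_libs(root, threshold=TWO_GIB // 2):
    """Return (size, path) pairs for .lib files at or above threshold, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".lib"):
                path = os.path.join(dirpath, name)
                size = os.path.getsize(path)
                if size >= threshold:
                    found.append((size, path))
    return sorted(found, reverse=True)


def headroom(size_bytes):
    """Bytes remaining before the file reaches 2 GiB."""
    return TWO_GIB - size_bytes


if __name__ == "__main__":
    # "out/Release" is a hypothetical build directory.
    for size, path in large_libs("out/Release"):
        print(f"{size:>15,} {path} ({headroom(size):,} bytes of headroom)")
```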

crrev.com/1842203003 should resolve this.

Project Member

Comment 20 by bugdroid1@chromium.org, Mar 31 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/529f6f50e463a2469cda86b6363035114264bcbe

commit 529f6f50e463a2469cda86b6363035114264bcbe
Author: brucedawson <brucedawson@chromium.org>
Date: Thu Mar 31 01:10:12 2016

Shard webcore_dom to avoid MSVS 2 GB limitation

A change to temporarily enable DCHECK on official builds caused
webcore_dom to expand beyond 2 GB which caused these errors:

  obj\third_party\WebKit\Source\core\webcore_dom.lib : fatal error
  LNK1107: invalid or corrupt file: cannot read at 0x62884508

msvs_shard is a way of breaking up a library to avoid this limitation.

webcore_dom was already at about 2.08 billion bytes, so very close to
the 2 GiB limit, and enabling DCHECK's on official builds pushed it over.

There are no other .lib files which appear to be in danger of hitting
the limit, although I did not check x64 builds.

This change also fixes a repeated spelling error which confused me.

BUG= 599186 

Review URL: https://codereview.chromium.org/1842203003

Cr-Commit-Position: refs/heads/master@{#384177}

[modify] https://crrev.com/529f6f50e463a2469cda86b6363035114264bcbe/third_party/WebKit/Source/core/core.gyp

Status: Fixed (was: Started)
I did full 32-bit and 64-bit builds with the settings that were failing on the bot. The 64-bit results were virtually identical to the 32-bit results. The same three .lib files were the only ones that were larger than 1 GB, and their sizes were almost identical.

So, this problem is fixed and will not come back anytime soon.

Closing as fixed.
Does gn do big library sharding implicitly? Or will this be broken again once we switch to gn?

Comment 23 by w...@chromium.org, Mar 31 2016

GN has source_set(), which avoids building static libraries at all, IIUC.
Correct - source sets mean that in gn builds we generally avoid the step of copying object files to a .lib file and then linking with that. Therefore we avoid the limitations of object files, such as these size limits. It's as if the default setting in gn was msvs_shard : infinity, but for all platforms.

source sets may have other problems, but they neatly solve this one.
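
To illustrate the idea of sharding discussed above, here is a conceptual sketch that splits one target's source list into several smaller lists, each of which would become its own smaller library. This is not gyp's actual implementation; the round-robin assignment, the function name, and the file list are assumptions chosen for illustration.

```python
def shard_sources(sources, shard_count):
    """Split a list of source files round-robin into shard_count lists."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    shards = [[] for _ in range(shard_count)]
    for index, source in enumerate(sources):
        shards[index % shard_count].append(source)
    return shards


# Illustrative file names only.
sources = ["Attr.cpp", "Document.cpp", "Element.cpp", "Node.cpp", "Text.cpp"]

print(shard_sources(sources, 2))
# One shard per file: no multi-object library is ever built, which is
# loosely what the "msvs_shard : infinity" comparison describes.
print(shard_sources(sources, len(sources)))
```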
