Compiler Failure on Google Chrome Win
Issue description
Full log: https://build.chromium.org/p/chromium.chrome/builders/Google%20Chrome%20Win/builds/5653/steps/compile/logs/stdio
Snippet below for posterity.
....
[20654/22094] CXX obj\chrome\browser\ui\webui\browser_ui_2.chrome_web_ui_controller_factory.obj
FAILED: C:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-with-manifests environment.x86 True chrome_child.dll "C:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:chrome_child.dll.lib /DLL /OUT:chrome_child.dll @chrome_child.dll.rsp" 2 mt.exe rc.exe "obj\chrome\chrome_child_dll.chrome_child.dll.intermediate.manifest" obj\chrome\chrome_child_dll.chrome_child.dll.generated.manifest
obj\third_party\WebKit\Source\core\webcore_dom.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x62BBDB03
Final: Total time = 8.359s
Traceback (most recent call last):
  File "gyp-win-tool", line 315, in <module>
    sys.exit(main(sys.argv[1:]))
  File "gyp-win-tool", line 29, in main
    exit_code = executor.Dispatch(args)
  File "gyp-win-tool", line 71, in Dispatch
    return getattr(self, method)(*args[1:])
  File "gyp-win-tool", line 171, in ExecLinkWithManifests
    subprocess.check_call(ldcmd + add_to_ld)
  File "C:\b\depot_tools\python276_bin\lib\subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'C:\b\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:chrome_child.dll.lib /DLL /OUT:chrome_child.dll @chrome_child.dll.rsp chrome_child.dll.manifest.res' returned non-zero exit status 1107
ninja: build stopped: subcommand failed.
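For triage, the offending library and offset can be pulled out of a log like the one above with a small script. This is only a sketch: the regular expression is an assumption based on the single LNK1107 line quoted here, and `find_lnk1107` is a hypothetical helper, not part of gyp-win-tool or the bot tooling.

```python
import re

# Assumed shape of the linker's LNK1107 line, taken from the log snippet above.
LNK1107 = re.compile(
    r"(?P<lib>\S+\.lib) : fatal error LNK1107: .*cannot read at "
    r"(?P<offset>0x[0-9A-Fa-f]+)"
)


def find_lnk1107(log_text):
    """Return a list of (library path, byte offset) for each LNK1107 error."""
    return [
        (match.group("lib"), int(match.group("offset"), 16))
        for match in LNK1107.finditer(log_text)
    ]


# The error line from the log, used here as sample input.
sample = (
    r"obj\third_party\WebKit\Source\core\webcore_dom.lib : fatal error "
    r"LNK1107: invalid or corrupt file: cannot read at 0x62BBDB03"
)

for lib, offset in find_lnk1107(sample):
    print(lib, hex(offset))
```

Knowing which .lib the linker rejected makes it possible to go straight to that file and inspect it, for example to check its size on disk.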
,
Mar 30 2016
Since it happened twice in a row (https://build.chromium.org/p/chromium.chrome/builders/Google%20Chrome%20Win) it's probably some code change and not a 2015 flake
,
Mar 30 2016
Has anyone reproduced the failure locally? The fact that it happened on the reland this morning means it's not a 2015 flake, but does mean that it is a compiler/linker bug. I need to get on this quickly to figure out what is triggering it. I'll start trying to reproduce it immediately.
,
Mar 30 2016
grt's currently building but hasn't repro'd yet. My money's on https://codereview.chromium.org/1814423002 being the cause (stuff gets too big) -- if it's that change, 2013 might have the same problem. (Still a linker bug, I suppose, but maybe not a regression)
,
Mar 30 2016
Could it really be a toolchain flake? I.e. a (lib) file got corrupted and breaks the builds. I've seen this once before and doing a clobber fixed the issue. Should we trigger a clobber on this bot while you're investigating?
,
Mar 30 2016
I'd like to try reverting wez's dcheck change if https://build.chromium.org/p/chromium.chrome/builders/Google%20Chrome%20Win/builds/5656 doesn't cycle green. If that doesn't help either, we can try clobbering. …actually, I think the official builds already do clobber builds on each build. So clobbering won't help here.
,
Mar 30 2016
Yeah, it seems that this builder always clobbers, too bad.
,
Mar 30 2016
My local build without the DCHECK change will be done building shortly. I'll revert if that fixes it.
,
Mar 30 2016
Yes, building without r384011 is good.
,
Mar 30 2016
This went away after we reverted https://codereview.chromium.org/1850473002/
I don't know if you want to keep this open to track the VS bug. I'm guessing this would've failed with 2013 too.
,
Mar 30 2016
Issue 598955 has been merged into this issue.
,
Mar 30 2016
Thanks for reverting & for the extra context, Nico. As noted offline, not able to repro this locally; Bruce, could this be a link concurrency issue?
,
Mar 30 2016
I can reproduce it locally:
obj\third_party\WebKit\Source\core\webcore_dom.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x62884508
I suspect that the investigation could take a little while - sorry. The first step will be to investigate it without branding=Chrome or use_goma=1, so that Microsoft can reproduce it.
,
Mar 30 2016
Oh, wait. I see the likely root cause. webcore_dom.lib is greater than 2 GiB. Dang.
2,214,885,906 webcore_dom.lib
That's 0x84047A12 bytes. I'm not sure what the size used to be, but probably less than 2 GiB.
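A quick way to sanity-check the numbers in this comment is sketched below. It assumes the relevant threshold is exactly 2 GiB (2**31 bytes), which is what the comment implies but is not confirmed by any linker documentation quoted in this thread; `exceeds_limit` is a hypothetical helper.

```python
# Assumption: the linker trips once a .lib reaches 2 GiB (2**31 bytes).
LIMIT_BYTES = 2**31


def exceeds_limit(size_bytes, limit=LIMIT_BYTES):
    """Return True if a file of size_bytes is at or beyond the assumed limit."""
    return size_bytes >= limit


# Size of webcore_dom.lib reported in the comment above.
webcore_dom_size = 2_214_885_906

print(hex(webcore_dom_size))            # hexadecimal form of the size
print(exceeds_limit(webcore_dom_size))  # is it over the assumed limit?
print(webcore_dom_size - LIMIT_BYTES)   # how far over, in bytes
```

The hexadecimal form has its top bit set, so any code that stored this size in a signed 32-bit integer would see a negative value. Whether that is what actually happens inside link.exe is a guess, not something established here.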
,
Mar 30 2016
Should we use the msvs_shard gyp flag on webcore_dom then ?
,
Mar 30 2016
> Should we use the msvs_shard gyp flag on webcore_dom then ?
Uh, yes? I just did a test Release build of chrome_child.dll with GYP_DEFINES=<nothing> and it is 145,550,010 bytes. I'm sure that goma is the problem as it means that all of the debug information gets replicated in every .obj file. It would be interesting to see how big webcore_dom.lib is in an official build without wez's patch. Probably close to the limit.
,
Mar 30 2016
The 5653 break was at 92333c0e5ca6b7768e5bcf6d41ff8fc733ebbfc1 so I tested from there - that's where I got the 2.214 GB webcore_dom.lib. Immediately prior to wez's CL (at f6d2732a2b81694b3d7ee47affaa6a22ee405cf9, whereas wez's is bac26c8a840909a679a5a74557fa6f4f60ae9e07) the size of webcore_dom.lib was:
2,080,573,500 webcore_dom.lib
So, the growth from wez's change is only modest, it just happens to push the library over the edge. The next ones to watch for are these three:
1,089,473,494 cc.lib
1,129,512,300 skia_library.lib
1,323,174,742 webcore_generated.lib
None of them are close enough to make me feel like doing anything about them, especially since gn's use of source sets makes this problem go away. However I will do some more test builds to see if anything else is very close.
crrev.com/1842203003 should resolve this.
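The arithmetic behind "only modest" growth and "not close enough" can be worked through as below. This is a sketch using only the sizes quoted in this comment; the 2**31-byte limit is an assumption carried over from the 2 GiB observation, and `headroom` is a hypothetical helper.

```python
# Assumption: the practical ceiling for a .lib is 2 GiB (2**31 bytes).
LIMIT_BYTES = 2**31

# Sizes quoted in the comment above, in bytes.
webcore_dom_before = 2_080_573_500  # immediately prior to wez's CL
webcore_dom_after = 2_214_885_906   # at the failing revision
other_libs = {
    "cc.lib": 1_089_473_494,
    "skia_library.lib": 1_129_512_300,
    "webcore_generated.lib": 1_323_174_742,
}


def headroom(size_bytes, limit=LIMIT_BYTES):
    """Bytes remaining before the assumed limit (negative if already over)."""
    return limit - size_bytes


growth = webcore_dom_after - webcore_dom_before
print("growth:", growth, "bytes", f"({growth / webcore_dom_before:.1%})")
print("headroom before the CL:", headroom(webcore_dom_before))
print("headroom after the CL:", headroom(webcore_dom_after))
for name, size in sorted(other_libs.items(), key=lambda item: -item[1]):
    print(name, "headroom:", headroom(size))
```

Under that assumption, the library had less headroom left before the change than the change added, while the three next-largest libraries each still have hundreds of millions of bytes to spare.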
,
Mar 31 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/529f6f50e463a2469cda86b6363035114264bcbe
commit 529f6f50e463a2469cda86b6363035114264bcbe
Author: brucedawson <brucedawson@chromium.org>
Date: Thu Mar 31 01:10:12 2016
Shard webcore_dom to avoid MSVS 2 GB limitation
A change to tempoarily enable DCHECK on official builds caused webcore_dom to expand beyond 2 GB which caused these errors:
obj\third_party\WebKit\Source\core\webcore_dom.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x62884508
msvs_shard is a way of breaking up a library to avoid this limitation. webcore_dom was already at about 2.08 billion bytes, so very close to the 2 GiB limit, and enabling DCHECK's on official builds pushed it over. There are no other .lib files which appear to be in danger of hitting the limit, although I did not check x64 builds. This change also fixes a repeated spelling error which confused me.
BUG= 599186
Review URL: https://codereview.chromium.org/1842203003
Cr-Commit-Position: refs/heads/master@{#384177}
[modify] https://crrev.com/529f6f50e463a2469cda86b6363035114264bcbe/third_party/WebKit/Source/core/core.gyp
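The idea behind sharding, as described in the commit message, is to split one target's sources across several smaller libraries so that no single .lib approaches the limit. The sketch below illustrates that idea only: `shard_sources` is a hypothetical function, the round-robin distribution is an assumption, and none of this is gyp's actual implementation or the contents of core.gyp.

```python
def shard_sources(sources, num_shards):
    """Distribute sources round-robin into num_shards lists.

    Hypothetical illustration: each returned list would be compiled into
    its own, smaller static library instead of one large one.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return [sources[i::num_shards] for i in range(num_shards)]


# Made-up file names, purely for illustration.
example_sources = ["a.cpp", "b.cpp", "c.cpp", "d.cpp", "e.cpp", "f.cpp", "g.cpp"]

for index, shard in enumerate(shard_sources(example_sources, 3)):
    print(f"shard {index}: {shard}")
```

Round-robin assignment keeps the shards roughly equal in file count, though not necessarily in byte size, since individual object files can differ a lot.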
,
Mar 31 2016
I did full 32-bit and 64-bit builds with the settings that were failing on the bot. The 64-bit results were virtually identical to the 32-bit results. The same three .lib files were the only ones that were larger than 1 GB, and their sizes were almost identical. So, this problem is fixed and will not come back anytime soon. Closing as fixed.
,
Mar 31 2016
Does gn do big library sharding implicitly? Or will this be broken again once we switch to gn?
,
Mar 31 2016
GN has source_set(), which avoids building static libraries at all, IIUC.
,
Mar 31 2016
Correct - source sets mean that in gn builds we generally avoid the step of copying object files to a .lib file and then linking with that. Therefore we avoid the limitations of .lib files, such as these size limits. It's as if the default setting in gn was msvs_shard : infinity, but for all platforms. Source sets may have other problems, but they neatly solve this one.
Comment 1 by thakis@chromium.org
, Mar 30 2016