New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 619949 link

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: ----

Blocking:
issue 354261
issue 621677



Sign in to add a comment

Win x64 Builder failure on chromium.perf

Project Member Reported by zh...@chromium.org, Jun 14 2016

Issue description

https://build.chromium.org/p/chromium.perf/builders/Win%20x64%20Builder?numbuilds=200

First seen at https://build.chromium.org/p/chromium.perf/builders/Win%20x64%20Builder/builds/9544

[5442/5460] LINK(DLL) chrome.dll chrome.dll.lib
FAILED: chrome.dll chrome.dll.lib 
C:/b/depot_tools/python276_bin/python.exe gyp-win-tool link-wrapper environment.x64 False link.exe /nologo /IMPLIB:./chrome.dll.lib /DLL /OUT:./chrome.dll /PDB:./chrome.dll.pdb @./chrome.dll.rsp
obj/chrome/browser/ui/ui.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x14B19402

[5443/5460] LINK(DLL) chrome_child.dll chrome_child.dll.lib
ninja: build stopped: subcommand failed.


Looks like https://codereview.chromium.org/1517463003 causes it, which modified some ui files. Will go ahead and revert the CL.
 
Project Member

Comment 1 by bugdroid1@chromium.org, Jun 14 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/25722a4a11f04300296119496fcc644a4fb03667

commit 25722a4a11f04300296119496fcc644a4fb03667
Author: zhenw <zhenw@chromium.org>
Date: Tue Jun 14 15:18:47 2016

Revert of Added comprehensive tests for views::Border. (patchset #13 id:260001 of https://codereview.chromium.org/1517463003/ )

Reason for revert:
This CL causes Win x64 Builder failure on chromium.perf.

https://build.chromium.org/p/chromium.perf/builders/Win%20x64%20Builder/builds/9544

[5442/5460] LINK(DLL) chrome.dll chrome.dll.lib
FAILED: chrome.dll chrome.dll.lib
C:/b/depot_tools/python276_bin/python.exe gyp-win-tool link-wrapper environment.x64 False link.exe /nologo /IMPLIB:./chrome.dll.lib /DLL /OUT:./chrome.dll /PDB:./chrome.dll.pdb @./chrome.dll.rsp
obj/chrome/browser/ui/ui.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x14B19402

[5443/5460] LINK(DLL) chrome_child.dll chrome_child.dll.lib
ninja: build stopped: subcommand failed.

BUG= 619949 

Original issue's description:
> Added comprehensive tests for views::Border.
>
> BUG= 568389 
>
> Committed: https://crrev.com/92614dc154ad143a1ef9849ae0fe90c02df09f30
> Cr-Commit-Position: refs/heads/master@{#399650}

TBR=sadrul@chromium.org,danakj@chromium.org,mgiuca@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 568389 

Review-Url: https://codereview.chromium.org/2060983004
Cr-Commit-Position: refs/heads/master@{#399709}

[delete] https://crrev.com/8458e05a54ad72235f9329863ebb87307ace8291/ui/views/border_unittest.cc
[modify] https://crrev.com/25722a4a11f04300296119496fcc644a4fb03667/ui/views/views.gyp

Comment 2 by zh...@chromium.org, Jun 14 2016

Cc: fmea...@chromium.org
Hmm, seems the revert didn't work...

Last known good compile was 9541: https://build.chromium.org/p/chromium.perf/builders/Win%20x64%20Builder/builds/9541

Then we have compile exceptions in 9542 and 9543. We are not entirely sure if they would pass compilation. Then compile fail in 9544.

All three have touched chrome/browser/ui/. The range for 9532-9544 is: 399598-399650

Comment 3 by zh...@chromium.org, Jun 14 2016

Starting return code bisect at:
https://chromeperf.appspot.com/group_report?bug_id=619949
This is a Visual Studio linker limitation, we need to do something like this https://codereview.chromium.org/2046573003/ to fix it.
Cc: brucedaw...@chromium.org
+brucedawson@

looks like our first LNK1107 (lib file > 2GB) with GN, any idea how to fix this ? The sourceset for this target[1] should probably be splitted ?

[1] https://cs.chromium.org/chromium/src/chrome/browser/ui/BUILD.gn?q=chrome/browser/ui/bui&sq=package:chromium&l=32
Project Member

Comment 6 by bugdroid1@chromium.org, Jun 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/25722a4a11f04300296119496fcc644a4fb03667

commit 25722a4a11f04300296119496fcc644a4fb03667
Author: zhenw <zhenw@chromium.org>
Date: Tue Jun 14 15:18:47 2016

Revert of Added comprehensive tests for views::Border. (patchset #13 id:260001 of https://codereview.chromium.org/1517463003/ )

Reason for revert:
This CL causes Win x64 Builder failure on chromium.perf.

https://build.chromium.org/p/chromium.perf/builders/Win%20x64%20Builder/builds/9544

[5442/5460] LINK(DLL) chrome.dll chrome.dll.lib
FAILED: chrome.dll chrome.dll.lib
C:/b/depot_tools/python276_bin/python.exe gyp-win-tool link-wrapper environment.x64 False link.exe /nologo /IMPLIB:./chrome.dll.lib /DLL /OUT:./chrome.dll /PDB:./chrome.dll.pdb @./chrome.dll.rsp
obj/chrome/browser/ui/ui.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x14B19402

[5443/5460] LINK(DLL) chrome_child.dll chrome_child.dll.lib
ninja: build stopped: subcommand failed.

BUG= 619949 

Original issue's description:
> Added comprehensive tests for views::Border.
>
> BUG= 568389 
>
> Committed: https://crrev.com/92614dc154ad143a1ef9849ae0fe90c02df09f30
> Cr-Commit-Position: refs/heads/master@{#399650}

TBR=sadrul@chromium.org,danakj@chromium.org,mgiuca@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= 568389 

Review-Url: https://codereview.chromium.org/2060983004
Cr-Commit-Position: refs/heads/master@{#399709}

[delete] https://crrev.com/8458e05a54ad72235f9329863ebb87307ace8291/ui/views/border_unittest.cc
[modify] https://crrev.com/25722a4a11f04300296119496fcc644a4fb03667/ui/views/views.gyp

Comment 7 by zh...@chromium.org, Jun 15 2016

Cc: picksi@chromium.org
+picksi (next sheriff)

The bisect is still running. Comment 6 is just a repetition of comment 1 (not sure why it happened).

Comment 8 by picksi@chromium.org, Jun 15 2016

Cc: brettw@chromium.org
Components: Infra>Labs
Labels: Infra-Troopers
Is this potentially an infra-structure issue? There is occasional purple and the failure in the logs says:

FAILED: chrome.dll chrome.dll.lib 
C:/b/depot_tools/python276_bin/python.exe gyp-win-tool link-wrapper environment.x64 False link.exe /nologo /IMPLIB:./chrome.dll.lib /DLL /OUT:./chrome.dll /PDB:./chrome.dll.pdb @./chrome.dll.rsp
obj/chrome/browser/ui/ui.lib : fatal error LNK1107: invalid or corrupt file: cannot read at 0x14B62506

Has a file become locked or something? Adding infra>labs and infra-troopers in case it is an infra problem.
No, the problem is that there's a 2GB limitation on the size of the static libraries on Windows (.lib). You can still produce a library bigger than that but the linker won't be able to read it and you'll hit the error 'LNK1107'.

We had this issue several time with Gyp but there was an easy way to fix it (msvs_shard[1]) but this doesn't exist in GN (afaik). We thought that this wouldn't be an issue in GN because the sourcesets (i.e. it sounded like a nice way to implicitly limit the number of files that get compiled in the same static library), but it look like it's not the case (i.e. some source sets are huge).

So the solution here is to split the ui.lib sourceset defined here: https://cs.chromium.org/chromium/src/chrome/browser/ui/BUILD.gn?q=chrome/browser/ui/bui&sq=package:chromium&l=32 , I don't know if there's an easy way to do this.


[1] https://cs.chromium.org/chromium/src/tools/gyp/pylib/gyp/MSVSUtil.py?q=msvs_shard+file:%5Esrc/tools/gyp/&sq=package:chromium&dr=C&l=73

Comment 10 Deleted

Comment 11 Deleted

Comment 12 Deleted

Comment 13 Deleted

(Please disregard my last three deleted comments: they were about the FYI builder rather than the perf.fyi builder)
Wow, I just did a local test and obj/chrome/browser/ui/ui.lib has a size of ~4GB, this is way above the 2GB limit !
You should try to revert https://codereview.chromium.org/2060983002 to fix this, it'll have some performance implication but it'll fix the build.
I'm reverting this now
CL for revert is here:
https://codereview.chromium.org/2071493002/
I will change this target to be a source set in 64-bits.

This gets around the issue easily (in case it comes up for other targets) but causes some slowdown in linking so we prefer static libraries for these.

Reverting the patch will cause some regressions in cycle times for Windows builders which are already on the edge of their SLA. Since this patch has landed we flipped some more Windows bots.
Cc: dpranke@chromium.org
Cc: nednguyen@chromium.org
Owner: brettw@chromium.org
Status: Started (was: Untriaged)
+1 to what brettw wrote in #19 ... reverting this CL will affect the CQ in a bad way; brettw@ has a CL posted that should fix the perf bot and we should land that instead.
For completeness brettw's CL is here: https://codereview.chromium.org/2067143003/
awesome, I'll do more tests on my end to check if more static libraries are about to hit the 2GB lib file limit.

Would it make sense to have an equivalent to the GYP's msvs_shard flag in GN ?
The much easier solution is to make them source sets. The shard thing is kind of a complicated hack. The fact that these are shared libraries is only a linking performance optimization anyway.
yeah, it's just that it's sad to have to sacrifice the linking performance optimizations. So I understand that the short term easy solution is to make them source sets but we should probably try to find a better long term one ? The solution might be to split the ui static library into several distinct libraries ? i.e. rework its build file. (I'm not familiar with this part of the codebase so I don't know if there's a clear way to split this into separate UI components).
I think there's a bunch more work to do here as we try to figure out the best way to speed up link time and keep things below the limits.
Ok, I just want to make sure that we do something about these limitations. This LNK1107 link error caused us a lot of trouble with Gyp as not so many people know the msvs_shard option. A lot of people have wasted a lot of time trying to fix the build (generally by randomly reverting some CLs) and I want to make sure that we don't start hitting this with GN as well.
Project Member

Comment 28 by bugdroid1@chromium.org, Jun 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c840a2adcf6540d189c41e7cb9825536035082a4

commit c840a2adcf6540d189c41e7cb9825536035082a4
Author: brettw <brettw@chromium.org>
Date: Wed Jun 15 16:20:54 2016

Make chrome/browser/ui a source set on 64-bit official builds.

The static library for this target was exceeding 4GB on Windows. This condition matches the one we added chrome/browser.

BUG= 619949 
TBR=picksi1

Review-Url: https://codereview.chromium.org/2067143003
Cr-Commit-Position: refs/heads/master@{#399916}

[modify] https://crrev.com/c840a2adcf6540d189c41e7cb9825536035082a4/chrome/browser/ui/BUILD.gn

@sebmarchand - yup, no question.
The build has failed but for a different reason (timeout, I'm 'fixing' this in https://codereview.chromium.org/2070623002/)

The fact that it timed out make me believe that the original issue (LNK1107) is fixed as it was usually failing after ~40 mins, now we're just timing out while linking the big tests.
It's not super hard to split these targets up manually if we desire.
Apparently the file size limit for .lib files has been raised recently from 2 GiB to 4 GiB. This is proven by the fact that I've seen ui.lib be successfully used at just over 3.9 GiB.

Given how much .lib file sizes can very between build flavors I'm not surprised that it broke 4 GB on some variants.

The second largest .lib file on the build I examined was only 1.5 GiB so if we fix ui.lib we will have a lot of headroom.

Large source sets are proving problematic for build times (see comment #30) so we probably need to break ui.lib into two or three .lib files - the manual equivalent of msvs_shard.

This splitting will be easier after we remove GYP, since currently the file list is slurped out of a .gypi.
Friendly sheriff ping

(According to the Perf Bot Sheriffing guide, Pri-1 build bot bugs should be pinged daily. Please close the issue if it's already been fixed.)
Blocking: 621677 354261
Labels: M-53 Proj-GN-Migration
The address error on the UI target should be fixed now. It looks like currently the bot is just timing out a lot of the time (sometimes it succeeds).
Cc: pgervais@chromium.org
Owner: pgervais@chromium.org
Stealing this ticket, time to disconnect faulty machines.
So, there are several issues that are in play here. brettw@ fixed one of them.

However, another is that there are three machines serving this builder:

- build131-m1, which has 128GB of RAM and seems to currently be happy
- build34-m1, which has 64GB and isn't
- vm326-m1, which has 16GB and is even less so.

If I'm reading the output correctly, all of them have 8 cores.

I've asked pgervais@ to shut off build34-m1 and vm326-m1 temporarily. We'll drop down to even slower cycle times, but it should at least work.

Once we get things working somewhat reliably, it looks like the builder cycle time is still maybe 4x slower than GYP, so we still have work to do there.
Owner: brettw@chromium.org
I disconnected vm326-m1 and build34-m1. Reassigning the ticket to brettw@
Owner: dpranke@chromium.org
I'll take this over for now ... brettw@'s going to work on GN link time stuff, but I'll own the overall issue.
Issue 617982 has been merged into this issue.

Comment 43 by d...@chromium.org, Jun 23 2016

Owner: d...@chromium.org
Talked to Dirk about this in person.. we're going to make sure all the hw matches for this builder.

I'll replace build34-m1 in place and allocate a new machine for vm326-m1.
thanks, dba@!
Project Member

Comment 45 by bugdroid1@chromium.org, Jun 23 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome-golo/chrome-golo.git/+/bfa8014a9b8c8148b3492ac90efc21afa355183a

commit bfa8014a9b8c8148b3492ac90efc21afa355183a
Author: dba <dba@google.com>
Date: Thu Jun 23 22:52:33 2016

Project Member

Comment 46 by bugdroid1@chromium.org, Jun 24 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome-golo/chrome-golo.git/+/8115d7b047d585a8a3a434014be8a62e5b235ab9

commit 8115d7b047d585a8a3a434014be8a62e5b235ab9
Author: Bryce Albritton <dba@google.com>
Date: Fri Jun 24 02:01:18 2016

Comment 47 by d...@chromium.org, Jun 24 2016

Cc: d...@chromium.org
Owner: dpranke@chromium.org
build34-m1 has been redeployed on newer hardware and build13-m1 is ready to take the place for vm326-m1 once someone updates the slaves config.
Project Member

Comment 48 by bugdroid1@chromium.org, Jun 24 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d28aba56dde82051680f9048daf1b1501b9bd89a

commit d28aba56dde82051680f9048daf1b1501b9bd89a
Author: brettw <brettw@chromium.org>
Date: Fri Jun 24 03:04:28 2016

Make GN components static libraries on Win x64

For the static builds, previously there was a static library exception for the windows official builds. They were source sets instead on the theory that some of them could get over 2GB.

There's only one component over 2GB (blink_modules). The rest of the targets that were too large before were all explicit static libraries which have been dealt with already. This change whitelists blink_modules. After that, the largest static library is now blink_platform (1.2GB). The rest are less than 1 GB.

So we can now disable the special case for this configuration.

BUG= 619949 

Review-Url: https://codereview.chromium.org/2098673002
Cr-Commit-Position: refs/heads/master@{#401800}

[modify] https://crrev.com/d28aba56dde82051680f9048daf1b1501b9bd89a/build/config/BUILDCONFIG.gn
[modify] https://crrev.com/d28aba56dde82051680f9048daf1b1501b9bd89a/third_party/WebKit/Source/modules/BUILD.gn

Project Member

Comment 49 by bugdroid1@chromium.org, Jun 24 2016

Project Member

Comment 50 by bugdroid1@chromium.org, Jun 24 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/master-manager.git/+/1a2009cb0519e43151ba0a1f033edf30cd782b9a

commit 1a2009cb0519e43151ba0a1f033edf30cd782b9a
Author: dpranke <dpranke@google.com>
Date: Fri Jun 24 03:13:55 2016

Owner: d...@chromium.org
bounce, bounce ...

The master has been restarted and the new slave added.

dba@, pgervais@, can one of you start up the new builder (build13-m1)?
Just restarted build13-m1.  It's now connected up.
Status: Fixed (was: Started)
Thanks, pschmidt@!

Looks like things should be happy now, so I'm closing this.

Sign in to add a comment