New issue
Advanced search Search tips

Issue 608801 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Sign in to add a comment

Incremental builds not enabled on windows in release builds

Project Member Reported by gab@chromium.org, May 3 2016

Issue description

Version: r390520
OS: Windows

What steps will reproduce the problem?
(0) From a component build
(1) ninja -C out/Release chrome
(2) touch any .cc in base
(3) ninja -C out/Release chrome

What is the expected output?
Rebuilds that .cc and relinks base.dll: 2 targets, done.

What do you see instead?
Rebuilds 172 targets (i.e. relinks all DLLs)

This appears to be because linking base.dll regenerates base.dll.lib with merely a different timestamp, forcing a re-link of every DLL that depends on that LIB :-(.

This affects developer productivity as well as bot cycle time.

This is the fault of the toolchain but Bruce suggested we could workaround this by having an extra link step to reset the .lib to the old one if we detect it only changed by a timestamp.

See attached files for an example of the impact of an incremental link of base.dll with no export changes on base.dll.lib.

Workaround: as a developer you can explicitly only rebuild base.dll in this scenario.
 
base.dll.lib.old
1.8 MB Download
base.dll.lib
1.8 MB Download
Assuming that the changing timestamp inside of base.dll.lib doesn't matter (and it probably doesn't) then this would be a good feature request for VC++ - to not touch the .lib file if nothing has changed. However they probably won't implement that feature, at least not in the near term. We could try adding a post-link step that would see if the .lib file had changed significantly and if not would restore the old version in order to avoid these wasteful relinks.

This is one instance of a general problem of unchanging files being touched.  crbug.com/607776  was another one (now fixed) and there are some other not-yet-understood cases where generated header files are (I believe) generated without changing. I believe that mksnapshot.exe generates code whenever it rebuilds, which means that it may generate new code whenever base.dll changes - a lovely cascade of consequences.

Fixing these could avoid many instances of excessive building and would significantly reduce build times in some cases.

Comment 2 by gab@chromium.org, May 4 2016

Cc: thakis@chromium.org brucedaw...@chromium.org
Owner: chrisha@chromium.org
Pasting Chris' email to us in here since we already have a bug:

"""
@thakis:

Please redirect this email if you think you're the wrong person to be receiving it!

The other day Bruce, Gab and I were talking about an apparent slowdown in the Windows GN component build. Gab was changing a single .cc file in base, launching ninja, and triggering a cascade of relinks. This was confusing Gab because none of the exports of base were being affected, so he was only expecting base.dll contents to change. I asked him to verify the contents of base.lib after each build, and indeed it was varying by a timestamp, causing any user of base.dll to be relinked.

Since MSVS is not deterministic, and likely won't be anytime soon, we discussed possible workarounds. The idea that came up was to cache an existing .lib target before linking a .dll, to compare the newly generated .lib with the old one, and to replace the new .lib with the old one if only the timestamp changed. The thought would be that this would save both developer and trybot time.

Is this a sane proposal? Would it fit reasonably into Ninja's architecture? Would it be best addressed via a wrapper to link.exe, or internally in Ninja somewhere? I'm happy to put this together, but want to know if its a sane idea and whether the Ninja folks would support it.
"""

PS: I don't think this is GN only, pretty sure GYP had the same issue given it's an MSVS toolchain problem.
Is this a regression? Are you sure it's not gn only? (building base_unittests is quick, should be fast to test)

Comment 4 by gab@chromium.org, May 4 2016

As I hinted in my PS at bottom of #2 following Chris' email. I don't think this is GN related nor even a regression. But it's still worthwhile to address. (I happen to notice now merely because I work in base a lot more these days)
"I don't think" isn't "I know" though...

Comment 6 by gab@chromium.org, May 4 2016

Interestingly (Bruce found this in parallel and I confirm), this issue doesn't repro (GYP nor GN) with a direct dependency (e.g. base_unittests or a simple component like startup_metric_utils_browser which both depend on base won't relink if only an internal detail of base changed).

So there is something else at play here..
Yeah, I think that used to work fine.

Does is still repro if you disable nacl? base gets compiled not just for the target but also with an assortment with nacl toolchains, which then make strange things happen.

Comment 8 by gab@chromium.org, May 4 2016

I have "enable_nacl = false" in GN and "disable_nacl=1" in GYP already, so yes.
The dependency that is supposed to trigger rebuilds, when necessary, is the import library - base.dll.lib. My tests show that the VC++ lib tool does *not* update this file when an internal-only change happens to a DLL - I confirmed this in a non-ninja test using VS 2013, so this has been the behavior for a while.

This then causes the relink steps to be skipped (which should cause the count of build steps to drop suddenly) because ninja realizes that base.dll.lib is not dirty.

This is all very surprising to me because I thought I had seen the same behavior which gab@ is describing.

So, gab@, we need more details. You had previously noticed that base.dll.lib was being touched, and that the internal date stamp was changing. This does not match what I am seeing so we need to figure out exactly what causes it to be regenerated.
Cc: scottmg@chromium.org
Labels: Needs-Feedback
(adding scottmg who made this work a while ago; but at the moment it's not clear what if anything is wrong here.  gab, maybe pass `-d explain` to ninja. what's `ninja --version` on your box?)
It's because incremental linking (probably) isn't on in Release, whereas it is in Debug.

With incremental off, the lib is always rewritten, but with it on the timestamp (base.dll.lib in this case) is not updated.

Here's debug component gn:

[move-devtools-netlog-observer]d:\src\cr3\src>touch base\win\enum_variant.cc

[move-devtools-netlog-observer]d:\src\cr3\src>ninja -C out\Debug chrome -v
ninja: Entering directory `out\Debug'
[171->1/172 ~0] ninja -t msvc -e environment.x86 -- "d:\src\depot_tools\win_toolchain\vs_files\95ddda401ec5678f15eeed01d2bee08fcbc5ee97\VC\bin\amd64_x86/cl.exe" /nologo /showIncludes /FC @obj/base/base/enum_variant.obj.rsp /c ../../base/win/enum_variant.cc /Foobj/base/base/enum_variant.obj /Fd"obj/base/base_cc.pdb"
[170->2/172 ~0] d:/src/depot_tools/python276_bin/python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:./base.dll.lib /DLL /OUT:./base.dll /PDB:./base.dll.pdb @./base.dll.rsp
(Just to be clear, that's all the steps that are run in #11, because the .lib timestamp hasn't changed the rest of the 172 don't need to be run.)
Thanks for the explanation Scott. That makes sense. So...

It would be great if ninja could make this work for those cases where link.exe does not. I'll file a connect bug to ask Microsoft to fix this properly but I don't have high hopes.

We used to be able to set incremental_chrome_dll=1 for gyp to get this behaviour in Release builds.

I think it should be fine to change this condition https://code.google.com/p/chromium/codesearch#chromium/src/build/config/win/BUILD.gn&l=325 or add a variable to allow override for devs who like using Release.
I guess what I mean, is that I don't think there's anything for Microsoft to fix. I wouldn't really want non-incremental to keep things around.
Incremental linking is generally a tradeoff between build time and run time - incremental builds run slightly slower. This bug identifies is a case where, if link.exe didn't update the import lib unnecessarily, we would get faster builds with no run-time impact. This is more like the incremental LTCG which greatly improves build-times without affecting performance.

Which is to say, I think that we should be able to get these build-time benefits without the costs of choosing incremental linking. So I will file a bug/feature-request, and we should consider implementing this behavior in ninja. I want to eat my cake and have it too.

Summary: Incremental builds not enabled on windows in release builds (was: Changing internal code (no export changes) in a DLL forces every DLL that depends on its LIB to re-link)
we could have a wrapper around link.exe that tells it to write the implib to a temp file and only moves it to the final place if the implib's contents change.

however, implibs contain timestamps, so i guess that bit would always change, so the bit that compares files would have to mask the timestamp out. (doing that when incremental builds are being done would break incremental linking though, so we'd have to be careful(
Bruce: I think it's fair to say you shouldn't be benchmarking or doing other very-performance-sensitive things in a component build. The differences in allocators, sundry dll boundaries, etc. would likely perturb things as much as any /incremental penalty incurred. (This doesn't really come up in static library builds.)

But anyway, yeah, if it's not too complicated/fragile it seems OK to implement a workaround.
Also, I must add )) to conserve our nation's delicate parenthesis balance.

Comment 20 by gab@chromium.org, May 4 2016

(() :-)

That'd be great (and there I was using Release builds + dcheck_always_on for "faster build times"...).

Can't we do incremental linking in Release builds though?
Labels: Te-NeedsFurtherTriage
I suspect we actually want is for incremental linking in release-component builds - yes? That is when the need is greatest, and when the cost is least important.

Comment 23 by gab@chromium.org, Jun 6 2016

Yes
In recent testing of Windows incremental build times with these args.gn settings:

    is_component_build = true
    is_debug = true
    enable_nacl = false
    is_win_fastlink = true

it was found that the incremental build time for chrome after touching this file:

    touch third_party\WebKit\Source\core\CoreInitializer.cpp

is about 180-200 s. This is because blink_core.dll is not incrementally linked so it takes ~100 s, and it then forces subsequent files to link (because blink_core.dll.lib is updated). The attached incremental.json file can be loaded into chrome://tracing to see this.

crrev.com/2095433002 is an experiment to turn on incremental linking for blink_core.dll by increasing /maxilksize. It completes with no errors but makes no differences to the build times.

If ninja could detect that there were no meaningful changes to blink_core.dll.lib and then restore it then the subsequent links would be skipped, for savings of about 100 s in this scenario.

If /maxilksize of 4 GB was actually supported then these incremental build times would drop still further, and keep more developers working on Windows instead of Linux.

Enabling incremental linking for release-component builds would also help. This is just reiterating that fact.

References:

https://codereview.chromium.org/2095433002/
https://connect.microsoft.com/VisualStudio/feedback/details/2846790

incremental.json
3.9 KB View Download
The way this works elsewhere is that we make linking a restat rule, write the link output to a temp file, and only move it to the final place if the public api changes. That way, ninja doesn't have to learn about binary internals. See e.g. https://cs.chromium.org/chromium/src/build/toolchain/mac/BUILD.gn?rcl=0&l=182

So the .lib command could do something similar.
I think the lib name can't be controlled.
Oops, ignore me, I should have checked first. I was thinking of the .ilk. (/implib should be fine).
(Stop ignoring me again) Any chance we could break up blink_core? It'd be nice to avoid downstream relinks of course too, but 100s still is kind of hopeless, if we can't get /maxilksize:+inf.

It's way, way bigger than all other components, fwiw.

d:\src\cr3\src\out\Debug>dir *.dll /o:s
...
2016-06-17  03:35 PM        34,120,704 views_content_client.dll
2016-06-22  12:41 PM        59,359,744 blink_modules.dll
2016-06-22  12:42 PM        70,433,280 content.dll
2016-06-22  12:44 PM        90,445,312 chrome.dll
2016-06-22  12:40 PM       352,113,152 blink_core.dll
Sure, we can do anything we want :-)
33,854 exports. Wow.

The list of exported names is about 6 MB. That's kind of impressive. Only content.dll is in the same ball park (slightly fewer exports, slightly longer exported names on average).
Perfect, pls make these 5 separate components then, thx. https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/BUILD.gn?rcl=0&l=164

(Looking through core/ a bit it looks like... not really.)

Components: Build
The good news is that the /maxilksize for VC++ 2015 is actually 4 GB, not 2 GB.

The bad news is that I have yet to see an incremental link succeed with /maxilksize set to more than 2 GB. I gave VC++ a linker repro and have heard nothing back. I have left the /maxilksize switch in place because I'd rather have loud failures when we hit this limit, than silent failures to incrementally link when we go beyond it.

If we can get this resolved the blink_core could incrementally link and the world would be a much better place.
Owner: brucedaw...@chromium.org
Grabbing this because I have an approved fix ready to land.
Project Member

Comment 35 by bugdroid1@chromium.org, Sep 14 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7bf1119ce51d5e5b489c780bbacc332eabdd6f07

commit 7bf1119ce51d5e5b489c780bbacc332eabdd6f07
Author: brucedawson <brucedawson@chromium.org>
Date: Wed Sep 14 20:36:45 2016

Enable incremental linking in release component builds

Component builds are, by definition, not optimized for maximum
performance. So incremental linking - which trades a little bit of
performance for much faster build times - is appropriate in release
component builds, as well as in debug builds.

This change enables incremental linking in release component builds.
This also requires turning off /OPT:ICF and /PROFILE in component
builds because they are incompatible with /incremental, but they are
performance related flags so they are also not important for component
builds.

blink_core.dll cannot be incrementally linked in debug x64 component
builds, but it *can* be incrementally linked in release x64 component
builds (and debug/release x86 component builds), and this change
enables incremental linking of blink_core.dll in all supported
configurations.

This drops the chrome rebuild time after touching
third_party\WebKit\Source\core\html\AutoplayExperimentHelper.cpp from
~260 seconds to 9 seconds. The chrome rebuild time after touching
base\win\registry.cc drops from ~200 seconds to less than 2 seconds,
with 21 of 24 link steps skipped entirely.

This also fixes some /INCREMENTAL /OPT:ICF mismatch warnings.

BUG= 608801 , 621236 

Review-Url: https://codereview.chromium.org/2331373006
Cr-Commit-Position: refs/heads/master@{#418660}

[modify] https://crrev.com/7bf1119ce51d5e5b489c780bbacc332eabdd6f07/build/config/compiler/BUILD.gn
[modify] https://crrev.com/7bf1119ce51d5e5b489c780bbacc332eabdd6f07/build/config/win/BUILD.gn
[modify] https://crrev.com/7bf1119ce51d5e5b489c780bbacc332eabdd6f07/third_party/WebKit/Source/core/BUILD.gn

Status: Fixed (was: Assigned)

Sign in to add a comment