New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 668748 link

Starred by 8 users

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug

Blocking:
issue 82385



Sign in to add a comment

Huge compile time regression on Windows due to v8 change (full build up to 25% slower, v8_shell build up to 400% slower)

Project Member Reported by thakis@chromium.org, Nov 25 2016

Issue description

https://build.chromium.org/p/chromium.fyi/stats/ClangToTWin(dbg) (also see screenshot)

That's like a 25% regression!

First slow is https://build.chromium.org/p/chromium.fyi/builders/ClangToTWin%28dbg%29/builds/7139 . clang-side regression range is `svn log -r287772:287838 https://nico@llvm.org/svn/llvm-project` which contains r287774 (the side-table removal) -- rnk, that commit isn't expected to slow down builds, is it?

Other than that, r281797 also sounds like it could make things slower, but some other bots I've looked at didn't see a regression this large.


(We also really need automatic tracking of build times -- me randomly noticing these isn't the best setup)
 
Screen Shot 2016-11-25 at 4.24.21 PM.png
563 KB View Download

Comment 1 by thakis@chromium.org, Nov 28 2016

chromium side #434180:#434276

Comment 2 by r...@chromium.org, Nov 28 2016

I'd expect the default arg expr side table removal (r287774) to speed things up.

Are the other clang bots affected, or is this only Win+dbg? If so, I'd look for debug info related commits, or Chromium side config changes.

Comment 3 by thakis@chromium.org, Nov 28 2016

I couldn't repro locally trying to build base_unittests.

But then I looked at http://chromium-build-stats.appspot.com/ninja_log/2016/11/23/build128-m1/ninja_log.build128-m1.chrome-bot.20161123-111858.5312.gz/trace.html and http://chromium-build-stats.appspot.com/ninja_log/2016/11/24/build128-m1/ninja_log.build128-m1.chrome-bot.20161123-183352.4372.gz/trace.html

The slowdown on the bot is pretty localized in v8 TUs.

faster.png
217 KB View Download
slower.png
264 KB View Download

Comment 4 by thakis@chromium.org, Nov 28 2016

The build does include a v8 roll: https://codereview.chromium.org/2529633002

Comment 5 by r...@chromium.org, Nov 28 2016

Anecdotally, I noticed that building v8 was very slow while reproducing the PCH crasher, so I totally support this diagnosis.

Comment 6 by thakis@chromium.org, Nov 28 2016

I tried building v8_shell with clang 287685 (which is from before the clang regression window), and it took 21m49s on a fast desktop.

C:\src\chrome\src>type out\gn\args.gn
is_clang = true
target_cpu = "x86"


I'm trying with an older chrome revision now (currently at #434676)

Comment 7 by thakis@chromium.org, Nov 28 2016

At #434180, building v8_shell from scratch takes a more reasonable 4m8.7s.

Comment 8 by thakis@chromium.org, Nov 28 2016

At that same revision with just the v8 roll https://codereview.chromium.org/2529633002 merged, building v8_shell takes forever. So it's due to that somehow. I looked through the commits, they all look fairly harmless. I'll bisect over v8 commits.

Comment 9 by thakis@chromium.org, Nov 28 2016

C:\src\chrome\src\v8>git bisect bad
491651792d7818aed04eaeffb9890b5a309b543e is the first bad commit
commit 491651792d7818aed04eaeffb9890b5a309b543e
Author: cbruni <cbruni@chromium.org>
Date:   Wed Nov 23 04:22:49 2016 -0800

    [counters] RuntimeStats: fix wrong bookkeeping when dynamically changing counters

    RuntimeTimerScopes always subtract their own time from the parent timer's
    counter to properly account for the own time. Once a scope is destructed it
    adds it own timer to the current active counter. However, if the current
    counter is changed with CorrectCurrentCounterId we will attribute all the
    subtimers to the previous counter, and add the own time to the new counter.
    This way it is possible to end up with negative times in certain counters but
    the overall would still be correct.

    BUG=

    Committed: https://crrev.com/f6c74d964d9387df4bed3d8c1ded51eb9e8aa6e8
    Review-Url: https://codereview.chromium.org/2511093002
    Cr-Original-Commit-Position: refs/heads/master@{#41142}
    Cr-Commit-Position: refs/heads/master@{#41214}

:040000 040000 c3a4bc825561ced16584253b35bda97d5b4e3d28 c95e8c613190d61927af582a02c60e5acb56ede6 M      src
:040000 040000 cf6f8529c4192d3a89f2f09df525fc56ff1a635e d2de6fbc0ad28086b01e52c846acbdd7298c8f25 M  test
Also repros with cl.exe (target_cpu="x86")
Cc: ishell@chromium.org machenb...@chromium.org cbruni@chromium.org l...@chromium.org
Summary: Huge compile time regression on Windows due to v8 change (full build up to 25% slower, v8_shell build up to 400% slower) (was: Huge compile time regression on ClangToTWin(dbg))
Also in x64 builds. https://codereview.chromium.org/2511093002 killed local build performance at least on Windows, in the default build configuration (repro: clean build of v8_shell before and after that commit). I haven't tested on other platforms yet.

Comment 12 by r...@chromium.org, Nov 28 2016

The change to counters.h in that CL adds dllexport to a class that has 4 * 186 or so members and an inline constructor. That constructor will be large and its getting emitted everywhere. Maybe they can solve the problem if they sink it to the .cc file.
Can we revert the roll until this is fixed?
Components: Infra>Client>V8
I built before and after that change on Linux (without goma), no change in built time there. So this is Windows-specific, likely due to dllexport somehow. Just moving the RunTimeCallStats ctor out of line didn't help much though.

Comment 15 by h...@chromium.org, Nov 29 2016

> Just moving the RunTimeCallStats ctor out of line didn't help much though.

Because of dllexport, the assignment (and move assignment I think) operator will be emitted. Does marking that deleted help?
@cbruni: Can we revert the v8-side change f6c74d96 and see if things improve?

Our win-clang bot shows no sign of regression in the same revision:
https://build.chromium.org/p/client.v8/builders/V8%20Win64%20-%20clang/builds/3681

But our build deps are lagging behind quite a lot atm.
Owner: cbruni@chromium.org
Status: Assigned (was: Untriaged)
It's an isolated CL, so revert should work without a problem, however, we will screw up once more our startup graphs :/
Status: Started (was: Assigned)
Project Member

Comment 20 by bugdroid1@chromium.org, Nov 29 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/v8/v8.git/+/95c0ecee66786108a926645d37bba4d8c8a1f258

commit 95c0ecee66786108a926645d37bba4d8c8a1f258
Author: cbruni <cbruni@chromium.org>
Date: Tue Nov 29 11:57:15 2016

[counters] Avoid V8_EXPORT_PRIVATE to speed up compilation under windows

BUG= chromium:668748 

Review-Url: https://codereview.chromium.org/2534123002
Cr-Commit-Position: refs/heads/master@{#41351}

[modify] https://crrev.com/95c0ecee66786108a926645d37bba4d8c8a1f258/src/counters.h

Roll with fix attempt on the way:
https://codereview.chromium.org/2540723002/
Machenbach: the bot you linked to probably uses goma, so it's less visible there (high sharding + cache)
I just built at 70f9ea0 (after that roll) and I think v8 feels back to normal. My machine was totally bogged down and gasping for air in previous builds, but it seems generally fine now. (I don't have any quantitative numbers though, sorry.)
Can confirm same with a build based on r435038, about 40 mins less spent on target blink_tests compared to a ToT build from earlier today.
Status: Fixed (was: Started)
https://build.chromium.org/p/chromium.fyi/stats/ClangToTWin(dbg) seems to be recovering as well. For relanding, Hans's suggestion in comment 15 may or may not help.
I tried Hans's suggestion locally, and it seems to help. cbruni, with this on top of your patch, builds are still fast locally:


C:\src\chrome\src\v8>git diff
diff --git a/src/counters.cc b/src/counters.cc
index 0468d0f..2eb9d0e 100644
--- a/src/counters.cc
+++ b/src/counters.cc
@@ -303,6 +303,11 @@ void RuntimeCallTimer::Snapshot() {
   Resume(now);
 }

+RuntimeCallStats::RuntimeCallStats() {
+  Reset();
+  in_use_ = false;
+}
+
 // static
 const RuntimeCallStats::CounterId RuntimeCallStats::counters[] = {
 #define CALL_RUNTIME_COUNTER(name) &RuntimeCallStats::name,
diff --git a/src/counters.h b/src/counters.h
index 089f024..767f8ed 100644
--- a/src/counters.h
+++ b/src/counters.h
@@ -854,10 +854,8 @@ class V8_EXPORT_PRIVATE RuntimeCallStats final : public ZoneObject {
   void Print(std::ostream& os);
   V8_NOINLINE void Dump(v8::tracing::TracedValue* value);

-  RuntimeCallStats() {
-    Reset();
-    in_use_ = false;
-  }
+  RuntimeCallStats();
+  void operator=(RuntimeCallStats&) = delete;

   RuntimeCallTimer* current_timer() { return current_timer_.Value(); }
   bool InUse() { return in_use_; }


If you go with this, insert a good comment :-)
Setting up RuntimeCallStats is not performance critical, hence this change should be fine.

Sign in to add a comment