New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 876316 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Aug 31
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 790075

Blocking:
issue 875881



Sign in to add a comment

Android Nexus5X WebView Perf builder is behind TOT build 6 days

Project Member Reported by nednguyen@chromium.org, Aug 21

Issue description

https://ci.chromium.org/buildbot/chromium.perf/Android%20Nexus5X%20WebView%20Perf/2438 is built today (8/21), but the commits in the builds are from 8/15, which is 6 days ago.

Visually, one can see this is very behind by looking at webview n5x column in https://ci.chromium.org/p/chromium/g/chromium.perf/console

This is bad because it means all perf regressions this builder could catch are at least 6, 7 days ago.
 
Blocking: 875881
That's because its parent builder has 1500+ pending builds:
https://ci.chromium.org/buildbot/chromium.perf/Android%20arm64%20Compile%20Perf/

I'd restart the master if I were you. That should clear its queue.
And also configure it to collate pending requests.
#3: how do I do do it? I would reconfigure before restarting the master :-)
Cc: -jbudorick@google.com jbudorick@chromium.org
We intentionally do not collate pending requests, because we may go back and re-run old builds. Manual bisection may also need the old builds to be available.

Can we add additional capacity and let the builders catch up on their own? Follow-up question: did something change to put the builders over capacity?
Looks like "package build" is taking 2 min on 32-bit, but 35 min on arm64.
Cc: g...@chromium.org
Owner: dtu@chromium.org
Status: Assigned (was: Untriaged)
+gbiv: I see a lot of these files in the arm64 package. Looks like the builder is leaking these ThinLTO cache files.

  adding: full-build-linux/android_clang_arm/thinlto-cache/llvmcache-0695C0E95CC29FBEB4E02B6C7FF2593DEE7BB4E7 (deflated 64%)
  adding: full-build-linux/android_clang_arm/thinlto-cache/llvmcache-61B696E990657263ECFF4ABF82764A91ED7B7F15 (deflated 69%)
  adding: full-build-linux/android_clang_arm/thinlto-cache/llvmcache-108E1DABBE735416364AA7A173644F0D3BC956BA (deflated 65%)


@bpastene: can we clobber these files and also get more hardware to catch up?
Cc: p...@chromium.org
+pcc has more context, but my 2c:

- AIUI, we currently only have ThinLTO enabled on arm64 (it's off for arm32 due to a bug), so the difference in build times makes sense.
  - If the issue is "we don't have enough CPU," we'll probably want to ensure there's enough capacity to handle it when arm32 is back on.
    - (Also, I plan on making the actual ThinLTO link step take 2xish longer. So if CPU is the blocker here, we'll definitely need to coordinate before I make that flag flip) 
- if ThinLTO is all that makes arm64 take 2-3x longer than arm32, that's quite surprising to me. On my z840, a fresh thin link of libchrome.so takes nowhere near 20 minutes of wall time.
- we try to limit the number of cache files that we keep around, though it's expected that there will be quite a few of them hanging out in the build directory [1].
  - depending on what these builders are doing, it may be beneficial to treat the cache differently (e.g. if they're not incremental builders, turn the cache off. If they are incremental, and if it's a noticeable slowdown, maybe we could exclude $out/thinlto-cache from the zip files we're making?)

[1] - https://cs.chromium.org/chromium/src/build/config/compiler/BUILD.gn?q=file:build.gn+file:compiler+thinlto&sq=package:chromium&g=0&l=633
> maybe we could exclude $out/thinlto-cache from the zip files we're making?

Yes. We're already doing that for the main thinlto-cache directory because of https://chromium-review.googlesource.com/544530 but it looks like that's not covering android_clang_arm/thinlto-cache for some reason. While at it we should probably exclude android_clang_arm/obj (etc.) in the same way that we're excluding the main obj directory.
A more detailed breakdown:

arm32 takes ~15 minutes, of which:
-  3 min bot_update
-  4 min compile
-  2 min isolate tests
-  3 min package build

arm64 takes ~70 minutes, of which:
-  3 min bot_update
- 28 min compile
-  2 min isolate tests
- 32 min package build

So pcc's change (c13) to exclude the cache from the zip files should save about 30 minutes. But sounds like compile does take significantly longer than expected as well.
Project Member

Comment 15 by bugdroid1@chromium.org, Aug 22

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2

commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2
Author: Peter Collingbourne <pcc@google.com>
Date: Wed Aug 22 05:44:36 2018

zip_build: Apply the path filter recursively when creating a package archive.

Previously we would only apply it to the direct descendants of the
build directory and not the contents of any of its subdirectories. This
meant that, for example, we ended up packaging the 'obj' and
'thinlto-cache' directories from the build directories for toolchains
other than the default toolchain, despite these directories being
filtered out.

We previously had code that handled the 'initial' directory as a
special case where the filter was also being applied to the files
in that directory. It seems like this code is now dead because the
directory is now named 'initialexe' (?), but in any event, since we
now have a generalization of that code, I've removed it.

Bug:  876316 
Change-Id: Ieab27788bf3ca7c7bf970434ac491a053eaa5baf
Reviewed-on: https://chromium-review.googlesource.com/1184302
Reviewed-by: Nico Weber <thakis@chromium.org>
Commit-Queue: Peter Collingbourne <pcc@chromium.org>

[modify] https://crrev.com/dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2/scripts/slave/zip_build.py
[modify] https://crrev.com/dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2/scripts/common/chromium_utils.py

Components: Infra>Labs
Even with the fix in #15, the "Android arm64 Compile Perf" builder is still very slow and cannot keep up. 

vhang@ is that possible to add 11 more slave*-c1 bots for perf builders to keep up with the build load? ("Android Compile Perf" has 10 bots and it build is roughly half of the time of ""Android arm64 Compile Perf"" builder)
Owner: vhang@chromium.org
Assign this bug to Vince to get 11 more slave*-c1 bots for perf builders. Spec would be similar to slave{180-185}-c1
Owner: fried...@google.com
Hi Elliot, if you haven't worked on this bug, maybe pause on this. I maybe trying to switch "Android Nexus5X WebView Perf" to use the LUCI buildbot builder which don't have this capacity problem. If that go well, we can just close this bug 
Project Member

Comment 20 by bugdroid1@chromium.org, Aug 25

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/84db4501034bc6a30956e4cdf96568b56383845b

commit 84db4501034bc6a30956e4cdf96568b56383845b
Author: Nghia Nguyen <nednguyen@google.com>
Date: Sat Aug 25 08:10:58 2018

Convert 'Android Nexus5X WebView Perf' to use 'android_arm64-builder-perf' as its parent builder

Bug:876316,828465
Change-Id: I520af132cddaa14ac85b5f75579e0f327cecb586

Recipe-Manual-Change: build_limited_scripts_slave
Recipe-Manual-Change: release_scripts

TBR=dtu@chromium.org

Change-Id: I520af132cddaa14ac85b5f75579e0f327cecb586
Reviewed-on: https://chromium-review.googlesource.com/1188824
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Ned Nguyen <nednguyen@google.com>

[modify] https://crrev.com/84db4501034bc6a30956e4cdf96568b56383845b/scripts/slave/recipe_modules/chromium_tests/chromium_perf.py

Project Member

Comment 21 by bugdroid1@chromium.org, Aug 25

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/df8681034f0a523a68722bdcf436f0caf377ea8b

commit df8681034f0a523a68722bdcf436f0caf377ea8b
Author: Ned Nguyen <nednguyen@google.com>
Date: Sat Aug 25 10:28:46 2018

Revert "Convert 'Android Nexus5X WebView Perf' to use 'android_arm64-builder-perf' as its parent builder"

This reverts commit 84db4501034bc6a30956e4cdf96568b56383845b.

Reason for revert: failed because 'Android Nexus5X WebView Perf' in chromium project, not chrome project (https://logs.chromium.org/logs/chrome/buildbucket/cr-buildbucket.appspot.com/8937230052059012480/+/steps/trigger/0/stdout)

Original change's description:
> Convert 'Android Nexus5X WebView Perf' to use 'android_arm64-builder-perf' as its parent builder
> 
> Bug:876316,828465
> Change-Id: I520af132cddaa14ac85b5f75579e0f327cecb586
> 
> Recipe-Manual-Change: build_limited_scripts_slave
> Recipe-Manual-Change: release_scripts
> 
> TBR=dtu@chromium.org
> 
> Change-Id: I520af132cddaa14ac85b5f75579e0f327cecb586
> Reviewed-on: https://chromium-review.googlesource.com/1188824
> Reviewed-by: Ned Nguyen <nednguyen@google.com>
> Commit-Queue: Ned Nguyen <nednguyen@google.com>

TBR=nednguyen@google.com,jbudorick@chromium.org

Change-Id: I109cf309a8677e5f0bfd1c64f36f7c4bf587f25f
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  876316 , 828465
Reviewed-on: https://chromium-review.googlesource.com/1188828
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Ned Nguyen <nednguyen@google.com>

[modify] https://crrev.com/df8681034f0a523a68722bdcf436f0caf377ea8b/scripts/slave/recipe_modules/chromium_tests/chromium_perf.py

Project Member

Comment 22 by bugdroid1@chromium.org, Aug 27

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/9a41d66725951f98d7c3a425e418ca8155abf00e

commit 9a41d66725951f98d7c3a425e418ca8155abf00e
Author: Patrik Höglund <phoglund@chromium.org>
Date: Mon Aug 27 10:16:16 2018

Revert "zip_build: Apply the path filter recursively when creating a package archive."

This reverts commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2.

Reason for revert: Speculative revert: looks like this breaks WebRTC build packaging for browser_tests.

Original change's description:
> zip_build: Apply the path filter recursively when creating a package archive.
>
> Previously we would only apply it to the direct descendants of the
> build directory and not the contents of any of its subdirectories. This
> meant that, for example, we ended up packaging the 'obj' and
> 'thinlto-cache' directories from the build directories for toolchains
> other than the default toolchain, despite these directories being
> filtered out.
>
> We previously had code that handled the 'initial' directory as a
> special case where the filter was also being applied to the files
> in that directory. It seems like this code is now dead because the
> directory is now named 'initialexe' (?), but in any event, since we
> now have a generalization of that code, I've removed it.
>
> Bug:  876316 
> Change-Id: Ieab27788bf3ca7c7bf970434ac491a053eaa5baf
> Reviewed-on: https://chromium-review.googlesource.com/1184302
> Reviewed-by: Nico Weber <thakis@chromium.org>
> Commit-Queue: Peter Collingbourne <pcc@chromium.org>

TBR=thakis@chromium.org,dpranke@chromium.org,pcc@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug:  876316 ,  876743 
Change-Id: Ie229755a4836e6e80d896801535b3d5ad8a98650
Reviewed-on: https://chromium-review.googlesource.com/1189962
Commit-Queue: Patrik Höglund <phoglund@chromium.org>
Reviewed-by: Patrik Höglund <phoglund@chromium.org>

[modify] https://crrev.com/9a41d66725951f98d7c3a425e418ca8155abf00e/scripts/slave/zip_build.py
[modify] https://crrev.com/9a41d66725951f98d7c3a425e418ca8155abf00e/scripts/common/chromium_utils.py

Blockedon: 790075
#21 doesn't work due to LUCI builder was in "chrome" project, whereas buildbot builder "Android Nexus5X WebView Perf" is in "chromium". 

I will instead trying to convert "Android Nexus5X WebView Perf" to LUCI instead
*Good news here is our LUCI builder version of ' Android arm64 Compile Perf' is able to catch up with new builds.

See 'android_arm64-builder-perf' in https://ci.chromium.org/p/chrome/g/perf-side-by-side/console
Project Member

Comment 25 by bugdroid1@chromium.org, Aug 28

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/d550bb44e12759d593d991a7a6f916af5d90b414

commit d550bb44e12759d593d991a7a6f916af5d90b414
Author: Peter Collingbourne <pcc@google.com>
Date: Tue Aug 28 16:47:23 2018

Reland "zip_build: Apply the path filter recursively when creating a package archive."

This relands commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2.

The original commit introduced a bug where symlinks that point to
directories were omitted from the archive. The cause was that for
some reason os.walk returns symbolic links that point to directories
in dirs and not files, so we need to handle any symlinks in dirs as
if they appeared in files instead.

Original change's description:
> Revert "zip_build: Apply the path filter recursively when creating a package archive."
>
> This reverts commit dc454099775dcec7ba4c31dd0fb0eb4c9447b9e2.
>
> Reason for revert: Speculative revert: looks like this breaks WebRTC build packaging for browser_tests.
>
> Original change's description:
> > zip_build: Apply the path filter recursively when creating a package archive.
> >
> > Previously we would only apply it to the direct descendants of the
> > build directory and not the contents of any of its subdirectories. This
> > meant that, for example, we ended up packaging the 'obj' and
> > 'thinlto-cache' directories from the build directories for toolchains
> > other than the default toolchain, despite these directories being
> > filtered out.
> >
> > We previously had code that handled the 'initial' directory as a
> > special case where the filter was also being applied to the files
> > in that directory. It seems like this code is now dead because the
> > directory is now named 'initialexe' (?), but in any event, since we
> > now have a generalization of that code, I've removed it.
> >
> > Bug:  876316 
> > Change-Id: Ieab27788bf3ca7c7bf970434ac491a053eaa5baf
> > Reviewed-on: https://chromium-review.googlesource.com/1184302
> > Reviewed-by: Nico Weber <thakis@chromium.org>
> > Commit-Queue: Peter Collingbourne <pcc@chromium.org>
>
> TBR=thakis@chromium.org,dpranke@chromium.org,pcc@chromium.org
>
> # Not skipping CQ checks because original CL landed > 1 day ago.
>
> Bug:  876316 ,  876743 
> Change-Id: Ie229755a4836e6e80d896801535b3d5ad8a98650
> Reviewed-on: https://chromium-review.googlesource.com/1189962
> Commit-Queue: Patrik Höglund <phoglund@chromium.org>
> Reviewed-by: Patrik Höglund <phoglund@chromium.org>

Bug:  876316 ,  876743 
Change-Id: Ia6c329c797374e24d6bcf43d0cadd6517deb27b8
Reviewed-on: https://chromium-review.googlesource.com/1192234
Reviewed-by: Nico Weber <thakis@chromium.org>
Commit-Queue: Peter Collingbourne <pcc@chromium.org>

[modify] https://crrev.com/d550bb44e12759d593d991a7a6f916af5d90b414/scripts/slave/zip_build.py
[modify] https://crrev.com/d550bb44e12759d593d991a7a6f916af5d90b414/scripts/common/chromium_utils.py

Owner: nednguyen@chromium.org
Status: Fixed (was: Assigned)
The LUCI builder "Android Nexus5X WebView Perf " is function normally and not falling behind (e.g: https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/Android%20Nexus5X%20WebView%20Perf/2556), so there is no need to add more hardware for the buildbot builder :-)

(work addressing this bug can be traced in issue 790075)

Sign in to add a comment