New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 864309 link

Starred by 10 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 28
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

autotest CLs cause 700+ packages to be build, resulting in major CQ slowdown

Project Member Reported by akes...@chromium.org, Jul 17

Issue description

As of https://b.corp.google.com/issues/75396373 , ebuilds are being uprevved whenever an ebuild they depend on is changed. This means we rebuild a lot more packages from source than we used to.

For example, a tryjob I launched containing an autotest CL stack https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8940829691880334304 resulted in 763 packages being rebuilt, which extended BuildPackages stage to 1 hour. 

Much worse is the impact on the CQ. Metrics such as http://shortn/_vD9z3Wc2BR indicate a bimodal behavior in the time taken in build_packages on paladins -- during the week, pretty much every CQ run will have at least 1 autotest CL, so builds almost always take 1+ hours (and worse in the tail, and the CQ is tail-sensitive).

In the past, BuildPackages took ~20-30 minutes in the CQ in the common case.

This behavior is probably because there are a number of ebuilds that use the autotest repo as cros working uprevver, and 1 of those is probably very deep in the dependency graph.

Still open questions:
 - Which autotest related ebuild is the super deep on in the dep graph.
 - Which other commonly modified repos might also be triggering uprevs that are deep in the dep graph.

Possible mitigations to investigate:
 1) Change ebuild dependencies so the offending autotest ebuild(s) are not so deep in the graph.

 2) Change those ebuild's uprev behavior (using SUBDIRS_TO_UPREV, or more advanced methods) so that not every autotest ebuild gets uprevved all the time. [Note: this idea has a long history and was problematic at times, but might be more predictable in this always-build-reverse-deps world, see  Issue 655884  Issue 791888 ]

    2a) Split the autotest repo up so that not so many ebuilds are pointing to a single high traffic repo.

 3) Revert the behavior from https://b.corp.google.com/issues/75396373 and achieve the needed uprevs there by some other means.

 4+) Your ideas here.
 
Labels: -Restrict-View-Google
Owner: gmeinke@chromium.org
looks like the over zealous forced reverse deps build logic.  i still think that was all a mistake.  discussed a bit in  issue 765897 .
For now, we don't really have any other solution to the unibuild config ebuilds breaking on the builders (the original motivation for the revdeps change). So, I think for this specific example of a performance regression, we should look at why so many things depend on Autotest.

Is there a convenient tool to dump the ebuild dependency graph, so we can inspect it?
Status: Assigned (was: Untriaged)
`equery depgraph` for forward dependencies, `equery depends --indirect` for reverse dependencies. The latter doesn't really give you a graph, but you could slowly build one by calling it recursively (without --indirect).
Depending on your definition of "forward" and "reverse" you might want to reverse "forward" and "reverse" in that comment...
Owner: yshaul@chromium.org
-> yshaul is taking a crack at investigating this.
Cc: davidri...@chromium.org
Another example was the builds kicked off in the pre-cq build for: https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/1140750

This is an ebuild change which I don't think should even be included or built by many things at this point, so I would expect build_packages to be all from prebuilts.  For some reason https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/emergePackageDetails?cidbBuildId=2758925 took twice as long as the average and rebuilt everything (and did not build virglrenderer).

One thing that you can do is use the build json file and walk back from the last package to the one that finished before it started, and so-on to get the entire critical path for a build.
Timeout building packages: https://luci-milo.appspot.com/buildbot/chromiumos/moblab-generic-vm-paladin/2123

Maybe instead of rebuilding every package to solve the cros config issue, we tag the specific packages that statically link against another dep, and only rebuild those statically linked packages.
 Issue 865258  has been merged into this issue.
Labels: -Pri-2 Pri-1
Owner: gmeinke@chromium.org
Based on repo case in  issue 865258 , I now believe that this is a bug in the revdeps calculation and not actually an issue in with the depgraph.
After a bit more digging and consulting our portage expert, it appears that the calculation of the packages to consider for reverse dependencies calcs are too aggressive and we will remove all packages that are not considered upgrades from that list.


Is there an update to this issue?
Nothing yet, it has gotten pushed way down my priority stack.
Isn't this causing major developer productivity hit because local builds aren't using prebuilts properly? 
Until the higher priority issues are fixed it unfortunately is on the back
burner, without it builds don't properly uprev config changes. Building
locally you can always use --nowithrevdeps.
There was a bug related to the calculation that was fixed after this bug was filed. Can't find the bug reference now. I'm pretty sure that the original request related to Autotest has been fixed. At the moment, we believe that the calculation is 100% correct and the only thing that might still be tracked by this bug is a general optimization strategy.

I'm inclined to just close this.

Can we verify how many packages are rebuilt when an autotest CL is included?
Hey, if you mark this as fixed, please dedup  issue 865258 . I still see ALL packages (except chromeos-chrome) rebuilt from source on clean local rebuild. It's severely affecting our productivity. Full logs attached.

$ ./setup_board --board=betty --force && ./build_packages --board=betty
...
Total: 783 packages (779 new, 4 reinstalls, 1 binary), Size of downloads: 1686257 KiB


build.txt
369 KB View Download
Cc: hidehiko@chromium.org
And please check if the problem reproduces in your local checkout. Maybe it's a problem specific to my checkout. But I heard hidehiko@ also saw this issue.

> $ ./setup_board --board=betty --force && ./build_packages --board=betty

I just ran this and got:
Total: 783 packages (779 new, 4 reinstalls, 1 binary), Size of downloads: 1975785 KiB

Yet, during the simulation a little higher, we see:
Total: 779 packages (779 new, 629 binaries), Size of downloads: 4215301 KiB

The logic that was added for revdeps is only adding a few additional packages: 
INFO    : final reverse dependencies that will be rebuilt: sys-kernel/linux-headers sys-libs/gcc-libs sys-libs/libcxx sys-libs/libcxxabi

Something is completely disabling prebuilts on eng desktops in the Portage final invocation. I'm not sure if it's related to this bug.

If we look at a coral-paladin builder from over the weekend a smallish number of changes, we see something like for a final calculation:
Total: 632 packages (155 upgrades, 1 downgrade, 1 new, 475 reinstalls, 1 binary), Size of downloads: 586098 KiB

Re c#20: Can we spin off a bug for the prebuilts issue and find an owner for it?

Cc: llozano@chromium.org sque@chromium.org bjanakiraman@chromium.org manojgupta@chromium.org
Regarding #20, it is not just eng desktops. CQ is rebuilding almost everything resulting in atleast an extra hour or more spent in build_packages.

e.g. https://uberchromegw.corp.google.com/i/chromeos/builders/bob-paladin/builds/3855 on 08/22

Looking at build_packages stage, Only 1 package (Chrome) is used as prebuilt: 
Total: 785 packages (781 new, 4 reinstalls, 1 binary), Size of downloads: 598482 KiB
I agree that something is going on on the CQ but it's somehow different from what's happening on Eng desktops and not as simple as you state: http://shortn/_FI4EF9C5i4 . We can see that, across all of the data, on weekends when there are very few changes, very few packages are being rebuilt and prebuilts are being used. In contrast, a fresh checkout on an Eng desktop is not using any prebuilts. We need someone on the Build team to investigate both issues.

With regard to the revert suggested in (3) of the original report, note that reverting the changes in https://b.corp.google.com/issues/75396373 would make that bug come back which would arguably be worse: any change to a cros-config related package would work fine locally but then fail on the PreCQ or CQ (depending on the chroot state) because of the differences between local cros_workon's treatment of dependencies (ignored and always rebuilt if workon'd) and the on-CQ uprev logic which only concerns itself with packages that are actually uprev'd (and therefore must rely on depgraph analysis to solve deps problems). Those bugs are very frustrating for those trying to land CL's and for the CQ runs they break: the author is left speculating about what the depgraph on the chroot of the failure looks like.

Also, the current behavior, even in its aggressive form, is also blocking similar "spooky action at a distance" bugs introduced by CL's that would only manifest when the reverse deps are rebuilt later: so-called N+1 breakages.

Owner: vapier@chromium.org
I believe I have a good work-around for this that will still preserve the necessary upgreving and stop building all of the unnecessary packages. CL should be ready by EOD.
Owner: gmeinke@chromium.org
Cc: allenwebb@google.com
Fix has been in CQ since early Friday afternoon, no luck landing it just yet.
Project Member

Comment 30 by bugdroid1@chromium.org, Aug 27

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crosutils/+/0766df471295c1b386b0bb284327ce9a374d98f2

commit 0766df471295c1b386b0bb284327ce9a374d98f2
Author: Gregory Meinke <gmeinke@google.com>
Date: Mon Aug 27 19:14:11 2018

build_packages with fewer revdeps

Cleaning up reverse dependency calculations. Removing saving
the install plan currently calculated in parallel_emerge.py
with a grep of the emerge simulation output. This will only
pickup the changes that are to ebuilds instead of the previously
incorrect reverse dependencies on the binary packages.

Will cleanup the rest of the install plan code in another CL once
this CL is working correctly locally and in the CQ.

BUG= chromium:864309 
TEST=local builds

Change-Id: Id77ad050679fb701d2b455e462fc91ba1c09d0ff
Reviewed-on: https://chromium-review.googlesource.com/1187194
Commit-Queue: Gregory Meinke <gmeinke@chromium.org>
Tested-by: Gregory Meinke <gmeinke@chromium.org>
Trybot-Ready: Gregory Meinke <gmeinke@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>
Reviewed-by: Gregory Meinke <gmeinke@chromium.org>

[modify] https://crrev.com/0766df471295c1b386b0bb284327ce9a374d98f2/build_packages

Status: Fixed (was: Assigned)
this should be resolved now.  if we have new data to show it's not working out, please post.
Project Member

Comment 32 by bugdroid1@chromium.org, Aug 30

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/1756533565918f29ad948d5a5ebd0aec17977caa

commit 1756533565918f29ad948d5a5ebd0aec17977caa
Author: Gregory Meinke <gmeinke@chromium.org>
Date: Thu Aug 30 08:27:57 2018

cbuildbot: remove saving emerge install plan

New reverse dependency calculation does not depend on the
parallel emerge install plan being saved to disk. Remove
from cbuildbot and then will remove from the remaining
shell scripts and parallel emerge.

BUG= chromium:864309 
TEST=local unittests, local builds and tryjobs

Change-Id: Ia01573c368b028d02e4acc522bc053a87e4533a7
Reviewed-on: https://chromium-review.googlesource.com/1190424
Commit-Ready: Gregory Meinke <gmeinke@chromium.org>
Tested-by: Gregory Meinke <gmeinke@chromium.org>
Reviewed-by: Gregory Meinke <gmeinke@chromium.org>
Reviewed-by: C Shapiro <shapiroc@google.com>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/1756533565918f29ad948d5a5ebd0aec17977caa/cbuildbot/stages/build_stages.py
[modify] https://crrev.com/1756533565918f29ad948d5a5ebd0aec17977caa/cbuildbot/commands.py

Project Member

Comment 33 by bugdroid1@chromium.org, Aug 30

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crosutils/+/0206e1b50738a8340864da7a73ee833b65fa3f62

commit 0206e1b50738a8340864da7a73ee833b65fa3f62
Author: Gregory Meinke <gmeinke@chromium.org>
Date: Thu Aug 30 08:27:57 2018

remove install plan from build scripts

Remove the flags for saving the emerge install plan from build scripts.
Saving this incorrect data to an intermediate file is no longer
necessary since the new reverse dependency calculation code is already
submitted.

BUG= chromium:864309 
TEST=local builds and tryjobs
CQ-DEPEND=CL:1190424

Change-Id: I405400bbdef0d8c88b4877eb1fab38dea5af8217
Reviewed-on: https://chromium-review.googlesource.com/1191762
Commit-Ready: Gregory Meinke <gmeinke@chromium.org>
Tested-by: Gregory Meinke <gmeinke@chromium.org>
Reviewed-by: Gregory Meinke <gmeinke@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/0206e1b50738a8340864da7a73ee833b65fa3f62/setup_board
[modify] https://crrev.com/0206e1b50738a8340864da7a73ee833b65fa3f62/update_chroot
[modify] https://crrev.com/0206e1b50738a8340864da7a73ee833b65fa3f62/build_packages

Cc: zentaro@chromium.org
Project Member

Comment 35 by bugdroid1@chromium.org, Aug 31

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/68de241167e09532d634d582115d770a73868052

commit 68de241167e09532d634d582115d770a73868052
Author: Gregory Meinke <gmeinke@chromium.org>
Date: Fri Aug 31 18:20:16 2018

parallel_emerge: remove saving of install plan

Remove the unused code to save the emerge install plan to
disk for the old reverse dependency calculations.

BUG= chromium:864309 
TEST=local builds and tryjobs

Change-Id: Ica8edd78c9bc97c996f17570f5c7042c143678fd
Reviewed-on: https://chromium-review.googlesource.com/1196823
Commit-Ready: Gregory Meinke <gmeinke@chromium.org>
Tested-by: Gregory Meinke <gmeinke@chromium.org>
Reviewed-by: Gregory Meinke <gmeinke@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/68de241167e09532d634d582115d770a73868052/scripts/parallel_emerge.py

Sign in to add a comment