
Issue 609886

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




ccompute images should have embedded checkouts.

Project Member Reported by dgarr...@chromium.org, May 6 2016

Issue description

When we bring up new GCE images, they have to do fresh checkouts. It's very easy for even 10 servers to exhaust our GoB bandwidth quotas.

So... we should check out ChromeOS code (and Chrome code) into our ccompute images so that they always start with a warm cache.
 
This issue has caused a LOT of manual work and disruption during the ungrouping of release builders.
I'll work out the exact directories and sync commands needed.

After they are known, what work is needed to get these files onto the images?
To fetch the internal ChromeOS checkout:

In "/b/cbuild/internal_master"

repo init \
  --repo-url https://chromium.googlesource.com/external/repo \
  --manifest-url https://chrome-internal-review.googlesource.com/chromeos/manifest-internal \
  --manifest-name official.xml \
  --manifest-branch master

repo sync

To fetch the external ChromeOS checkout:

In "/b/cbuild/external_master"

repo init \
  --repo-url https://chromium.googlesource.com/external/repo \
  --manifest-url https://chromium.googlesource.com/chromiumos/manifest \
  --manifest-name default.xml \
  --manifest-branch master

repo sync
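
The two checkouts above differ only in directory, manifest URL, and manifest name, so an image-setup script could generate both command sequences from one table. A minimal Python sketch of that idea follows. The directories, URLs, and flags are the ones quoted above; the helper names (repo_init_command, render_setup_script) and the mkdir/cd wrapping are hypothetical and not taken from any existing script.

```python
import shlex

REPO_URL = "https://chromium.googlesource.com/external/repo"

# (directory, manifest URL, manifest name), as listed in this thread.
CHECKOUTS = [
    ("/b/cbuild/internal_master",
     "https://chrome-internal-review.googlesource.com/chromeos/manifest-internal",
     "official.xml"),
    ("/b/cbuild/external_master",
     "https://chromium.googlesource.com/chromiumos/manifest",
     "default.xml"),
]


def repo_init_command(manifest_url, manifest_name, branch="master"):
    """Build the `repo init` argument list for one checkout."""
    return [
        "repo", "init",
        "--repo-url", REPO_URL,
        "--manifest-url", manifest_url,
        "--manifest-name", manifest_name,
        "--manifest-branch", branch,
    ]


def render_setup_script(checkouts=CHECKOUTS):
    """Hypothetical helper: render the shell lines that create each checkout."""
    lines = []
    for directory, manifest_url, manifest_name in checkouts:
        quoted_dir = shlex.quote(directory)
        init = " ".join(
            shlex.quote(arg)
            for arg in repo_init_command(manifest_url, manifest_name))
        lines.append("mkdir -p %s && cd %s" % (quoted_dir, quoted_dir))
        lines.append(init)
        lines.append("repo sync")
    return "\n".join(lines)
```

Calling print(render_setup_script()) emits one mkdir/cd line, one repo init line, and one repo sync line per checkout.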

Chrome code ends up in "/b/cbuild/internal_master/.cache/distfiles/target/chrome-src-internal". But the gclient config appears to be a bit custom to the build.

https://uberchromegw.corp.google.com/i/chromeos/builders/kip-release/builds/78/steps/SyncChrome/logs/stdio

Comment 6 by d...@chromium.org, May 6 2016

The ccompute-side work would be simple: Add one or more commands to a CrOS-specific section of https://chrome-internal.googlesource.com/infra/infra_internal/+/master/ccompute/images/new_image.py .

Comment 7 by sosa@google.com, May 6 2016

Which quotas did we hit? Did we hit the burst quota or the daily quota? If the former, we could also just stagger roll-outs, which we've traditionally done when bringing up GCE instances. Agreed syncing more intelligently is likely the only thing we can do to fix the latter.
We managed to hit both: the burst quota last night during the initial rollout, and the daily limit at some point in the middle of the night.

I did stagger the initial rollout to 10 machines every 30 minutes, but I was watching the ChromeOS sync, not the Chrome sync. It seems a cold Chrome sync takes about 45 minutes, which threw my schedule off.
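
The arithmetic behind the thrown-off schedule can be sketched as follows. This is a simplified model, not a measurement: it assumes every machine in a batch syncs for the full duration and that batches start exactly on the interval. The function name is hypothetical.

```python
import math


def peak_concurrent_syncs(batch_size, interval_min, sync_min):
    """Peak number of machines syncing at once in a staggered rollout.

    A batch started at time t keeps syncing until t + sync_min, so
    ceil(sync_min / interval_min) batches overlap at the worst moment.
    """
    overlapping_batches = math.ceil(sync_min / interval_min)
    return batch_size * overlapping_batches


# 10 machines every 30 minutes with a 45-minute cold Chrome sync.
print(peak_concurrent_syncs(10, 30, 45))  # 20
```

Under this model, batches stop overlapping only when the interval is at least as long as the slowest sync, so a 45-minute Chrome sync would call for a 45-minute stagger.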
Status: Assigned (was: Untriaged)

Comment 11 by hinoka@google.com, May 10 2016

Also mentioned offline: I think the way to do this is just to check in a script somewhere (anywhere, though it'd be easiest if it was in a public repo) that's like bootstrap_cache.py, which takes a --target-dir, and then download/call that script from ccompute/images/scripts/setup.py.
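
A rough sketch of what such a script's command-line shape might look like is below. Only the --target-dir flag comes from the comment above. The function layout, the injectable run parameter, and the choice of the public manifest URL quoted earlier in this thread are assumptions for illustration; this is not the contents of any real bootstrap_cache.py.

```python
import argparse
import os
import subprocess

# Assumption: the public ChromiumOS manifest quoted earlier in this thread.
MANIFEST_URL = "https://chromium.googlesource.com/chromiumos/manifest"


def main(argv, run=subprocess.check_call):
    """Create a checkout under --target-dir.

    `run` is injectable so the flow can be exercised without network access.
    """
    parser = argparse.ArgumentParser(
        description="Populate a warm repo checkout.")
    parser.add_argument("--target-dir", required=True,
                        help="Directory in which to create the checkout.")
    args = parser.parse_args(argv)

    os.makedirs(args.target_dir, exist_ok=True)
    run(["repo", "init", "--manifest-url", MANIFEST_URL], cwd=args.target_dir)
    run(["repo", "sync"], cwd=args.target_dir)
    return 0
```

A caller such as an image setup script would invoke it as main(["--target-dir", "/some/cache/dir"]), where the directory is whatever well-known location the image uses.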
Labels: -current-issue
Labels: -Pri-2 Pri-1
Status: Started (was: Assigned)
ChromeOS-specific support can be added after this lands:

https://chromereviews.googleplex.com/421367013/
Cc: akes...@chromium.org
The current plan is that we will create images with a repo checkout in a well known location. Work steps:

1) Tweak "setup_cache.py" to fetch ChromeOS to a well known location.
2) Add "--buildbot-warm-cache-path" to cbuildbot.
  * If there is no repo checkout, cbuildbot will attempt to create it by doing
    a copy from that path before doing its initial sync.
  * If there is an existing checkout, that path will be ignored.
  * It's acceptable for the specified path to not exist.
3) Tweak the cbuildbot recipe to pass that path, but only for branches with the
   new arg.
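
The three rules in step 2 can be sketched as a single predicate. This is an illustration of the plan as written, not the cbuildbot implementation; the function name is hypothetical, and treating the presence of a ".repo" directory as the sign of an existing checkout is an assumption.

```python
import os


def should_seed_from_warm_cache(buildroot, warm_cache_path):
    """Decide whether to copy the warm cache into the buildroot."""
    # An existing checkout wins: the warm cache path is ignored.
    # Assumption: a ".repo" directory marks an existing repo checkout.
    if os.path.isdir(os.path.join(buildroot, ".repo")):
        return False
    # It is acceptable for the path to be unset or to not exist.
    if not warm_cache_path or not os.path.isdir(warm_cache_path):
        return False
    # No checkout yet and a cache is available: copy before the first sync.
    return True
```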

Comment 15 by bugdroid1@chromium.org, May 27 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/infra_internal.git/+/8233ca2e269e449882183e76bfb8b1cb594eb3eb

commit 8233ca2e269e449882183e76bfb8b1cb594eb3eb
Author: dnj <dnj@google.com>
Date: Fri May 27 22:28:26 2016


Comment 16 by bugdroid1@chromium.org, Jun 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/0af6ba52f3943e09ee84eb90954ffbf4ddcd086b

commit 0af6ba52f3943e09ee84eb90954ffbf4ddcd086b
Author: Don Garrett <dgarrett@google.com>
Date: Fri May 27 21:53:31 2016

cbuildbot: Add --repo-cache option.

Give the builders an option to give us a warm repo cache to copy in if
we need to create our build root from scratch. This allows us to avoid
overloading GoB when we bring up large numbers of new builders.

BUG= chromium:609886 
TEST=run_tests

Change-Id: Ib79f219554b4a8e0e256f366916c376d716025d0
Reviewed-on: https://chromium-review.googlesource.com/348011
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/0af6ba52f3943e09ee84eb90954ffbf4ddcd086b/scripts/cbuildbot.py
[modify] https://crrev.com/0af6ba52f3943e09ee84eb90954ffbf4ddcd086b/cbuildbot/stages/sync_stages.py


Comment 18 by bugdroid1@chromium.org, Jun 8 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/infra_internal.git/+/8260f1031468e605cfc3ab44eb250e8949bd0166

commit 8260f1031468e605cfc3ab44eb250e8949bd0166
Author: dnj <dnj@google.com>
Date: Wed Jun 08 21:24:36 2016


Comment 19 by bugdroid1@chromium.org, Jun 8 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/infra_internal.git/+/b6a34d061f670a592fc2787a1367a8223cc43c7c

commit b6a34d061f670a592fc2787a1367a8223cc43c7c
Author: dnj <dnj@google.com>
Date: Wed Jun 08 21:58:52 2016


Comment 20 by bugdroid1@chromium.org, Jun 9 2016

Labels: merge-merged-release-R52-8350.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/a2474688d3ae6a16f5dda56cf9616884c6843d06

commit a2474688d3ae6a16f5dda56cf9616884c6843d06
Author: Don Garrett <dgarrett@google.com>
Date: Fri May 27 21:53:31 2016

cbuildbot: Add --repo-cache option.

Give the builders an option to give us a warm repo cache to copy in if
we need to create our build root from scratch. This allows us to avoid
overloading GoB when we bring up large numbers of new builders.

BUG= chromium:609886 
TEST=run_tests

Change-Id: Ib79f219554b4a8e0e256f366916c376d716025d0
Reviewed-on: https://chromium-review.googlesource.com/348011
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>
(cherry picked from commit 0af6ba52f3943e09ee84eb90954ffbf4ddcd086b)
Reviewed-on: https://chromium-review.googlesource.com/351260
Commit-Queue: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Josafat Garcia <josafat@chromium.org>

[modify] https://crrev.com/a2474688d3ae6a16f5dda56cf9616884c6843d06/scripts/cbuildbot.py
[modify] https://crrev.com/a2474688d3ae6a16f5dda56cf9616884c6843d06/cbuildbot/stages/sync_stages.py


Comment 21 by bugdroid1@chromium.org, Jun 9 2016

Labels: merge-merged-stabilize-8350.21.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/04ff68e574a453a5b5ca1db16a08b5e530147189

commit 04ff68e574a453a5b5ca1db16a08b5e530147189
Author: Don Garrett <dgarrett@google.com>
Date: Fri May 27 21:53:31 2016

cbuildbot: Add --repo-cache option.

Give the builders an option to give us a warm repo cache to copy in if
we need to create our build root from scratch. This allows us to avoid
overloading GoB when we bring up large numbers of new builders.

BUG= chromium:609886 
TEST=run_tests

Change-Id: Ib79f219554b4a8e0e256f366916c376d716025d0
Reviewed-on: https://chromium-review.googlesource.com/348011
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>
(cherry picked from commit 0af6ba52f3943e09ee84eb90954ffbf4ddcd086b)
Reviewed-on: https://chromium-review.googlesource.com/351261
Reviewed-by: Josafat Garcia <josafat@chromium.org>
Commit-Queue: Josafat Garcia <josafat@chromium.org>
Tested-by: Josafat Garcia <josafat@chromium.org>

[modify] https://crrev.com/04ff68e574a453a5b5ca1db16a08b5e530147189/scripts/cbuildbot.py
[modify] https://crrev.com/04ff68e574a453a5b5ca1db16a08b5e530147189/cbuildbot/stages/sync_stages.py

This should now be fixed.

I plan to re-create an idle TOT release builder this afternoon and examine the logs of its next build to verify that things are working properly.
I used ccompute to destroy and re-create the image, but the cache was NOT populated in /var/cache/chrome-infra/ccompute-setup/cros-internal.

The currently expected behavior is that the builder will do a full sync on its next build as if it had been clobbered.

So... did I not wait long enough for the default chromeos image to have a populated cache entry?
Did you update the default image to chromeos-trusty-16060900-4018336fab7?

Comment 26 by dnj@google.com, Jun 9 2016

Clarifying #25, the builders will periodically create new images, but the image that's used by "ccompute" is encoded in the "ccompute" code base. That has to be updated to one of the images produced by the builders in order to actually use that image.
Ah, I thought the new images were automatically used. I did NOT update it.

Trying again. ;>
After updating the image used, the cache is there. Waiting for the scheduled build tonight to see if it's used properly.

Comment 29 by bugdroid1@chromium.org, Jun 11 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/0a58f977e98e783f1395a17ccbdaf01bf1f5127a

commit 0a58f977e98e783f1395a17ccbdaf01bf1f5127a
Author: Don Garrett <dgarrett@google.com>
Date: Fri Jun 10 23:15:22 2016

cbuildbot: Fix --repo-cache, again!

The tests for the --repo-cache logic didn't actually copy any files, and
so missed that shutil.copytree fails if the target directory already
exists.

So... fix the tests to really copy stuff (no more mock), and then fix
the logic to only copy the .repo directory, which does not exist at the
target, but is enough to avoid GoB traffic.

BUG= chromium:609886 
TEST=Unittests

Change-Id: I99e830fedb1347984de275e197bd5c4b2c1bd757
Reviewed-on: https://chromium-review.googlesource.com/351650
Trybot-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/0a58f977e98e783f1395a17ccbdaf01bf1f5127a/cbuildbot/stages/sync_stages_unittest.py
[modify] https://crrev.com/0a58f977e98e783f1395a17ccbdaf01bf1f5127a/cbuildbot/stages/sync_stages.py
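
The failure described in this commit message is easy to reproduce in isolation. The sketch below is not the chromite code; it only demonstrates the standard-library behavior using throwaway directories. copy_repo_dir and demo are hypothetical names.

```python
import os
import shutil
import tempfile


def copy_repo_dir(cache_dir, buildroot):
    """Copy only the .repo directory, which must not exist at the target."""
    shutil.copytree(os.path.join(cache_dir, ".repo"),
                    os.path.join(buildroot, ".repo"))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        cache = os.path.join(tmp, "cache")
        buildroot = os.path.join(tmp, "buildroot")
        os.makedirs(os.path.join(cache, ".repo"))
        with open(os.path.join(cache, ".repo", "manifest.xml"), "w") as f:
            f.write("<manifest/>")
        os.makedirs(buildroot)  # The buildroot already exists, but is empty.

        # Copying the whole tree onto an existing directory raises.
        try:
            shutil.copytree(cache, buildroot)
            whole_tree_failed = False
        except OSError:
            whole_tree_failed = True

        # Copying just .repo works because buildroot/.repo does not exist.
        copy_repo_dir(cache, buildroot)
        copied = os.path.isfile(
            os.path.join(buildroot, ".repo", "manifest.xml"))
        return whole_tree_failed, copied
```

On Python 3.8 and later, shutil.copytree also accepts dirs_exist_ok=True, which was not available when this change was written.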


Comment 30 by bugdroid1@chromium.org, Jun 15 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/b24070c2424a849e31067eed81d422d1fd5f8c4d

commit b24070c2424a849e31067eed81d422d1fd5f8c4d
Author: Don Garrett <dgarrett@google.com>
Date: Mon Jun 13 21:51:39 2016

repo-cache: Handle symlinks.

It seems that shutil.copytree doesn't handle symlinks the way we want it
to, which causes yet more problems. Extend the unittests to reproduce
the failure, fix the bug, and use the unittests to validate the fix.

BUG= chromium:609886 
TEST=run_tests

Change-Id: Iec2fe7515013f5e47cadf88a17a93d66cc0701ed
Reviewed-on: https://chromium-review.googlesource.com/352241
Commit-Ready: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/b24070c2424a849e31067eed81d422d1fd5f8c4d/cbuildbot/stages/sync_stages_unittest.py
[modify] https://crrev.com/b24070c2424a849e31067eed81d422d1fd5f8c4d/cbuildbot/stages/sync_stages.py
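
The commit message does not say how the symlink handling was fixed. One plausible reading is that links need to be preserved as links rather than followed. The sketch below shows the difference using only the standard library; the directory layout is made up for the demonstration and is not claimed to match a real .repo directory.

```python
import os
import shutil
import tempfile


def copy_preserving_symlinks(src, dst):
    """Copy a tree, recreating symlinks as symlinks instead of following them."""
    shutil.copytree(src, dst, symlinks=True)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, "manifests"))
        with open(os.path.join(src, "manifests", "default.xml"), "w") as f:
            f.write("<manifest/>")
        # A relative symlink inside the tree (illustrative layout only).
        os.symlink(os.path.join("manifests", "default.xml"),
                   os.path.join(src, "manifest.xml"))

        followed = os.path.join(tmp, "followed")
        preserved = os.path.join(tmp, "preserved")
        shutil.copytree(src, followed)              # default: symlinks=False
        copy_preserving_symlinks(src, preserved)    # symlinks=True

        return (os.path.islink(os.path.join(followed, "manifest.xml")),
                os.path.islink(os.path.join(preserved, "manifest.xml")))
```

With the default settings the link is replaced by a copy of its target; with symlinks=True the destination keeps a link.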

Okay, this finally seems to work. The logs on this sample builder show that the warm cache was used, and the total sync time was about 6 minutes, which is in line with sync time for a builder with an existing checkout.

https://uberchromegw.corp.google.com/i/chromeos/builders/beaglebone-release/builds/110/steps/ManifestVersionedSync/logs/stdio

@@@BUILD_STEP@ManifestVersionedSync@@@
************************************************************
** Start Stage ManifestVersionedSync - Wed, 15 Jun 2016 02:06:46 -0700 (PDT)
** 
** Stage that generates a unique manifest file, and sync's to it.
************************************************************
INFO:root:Using warm cache "/var/cache/chrome-infra/ccompute-setup/cros-internal" to populate buildroot "/b/cbuild/internal_master"
02:06:46: INFO: Using warm cache "/var/cache/chrome-infra/ccompute-setup/cros-internal" to populate buildroot "/b/cbuild/internal_master"
INFO:root:Running cidb query on pid 19378, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb2ee2bf6d0; Select object>
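
Checking a build log for this message could be automated. A small sketch, assuming the log line keeps exactly the wording shown in the excerpt above; the function name is hypothetical.

```python
import re

# Matches the message shown in the log excerpt above.
WARM_CACHE_RE = re.compile(
    r'Using warm cache "(?P<cache>[^"]+)" '
    r'to populate buildroot "(?P<buildroot>[^"]+)"')


def find_warm_cache_use(log_text):
    """Return (cache, buildroot) if the log shows the warm cache was used."""
    match = WARM_CACHE_RE.search(log_text)
    if match is None:
        return None
    return match.group("cache"), match.group("buildroot")
```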

Status: Fixed (was: Started)

Comment 33 by bugdroid1@chromium.org, Jun 15 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/infra_internal.git/+/7c2c9d94f188e8b804426952ff5a4984f8bb85d6

commit 7c2c9d94f188e8b804426952ff5a4984f8bb85d6
Author: dnj <dnj@google.com>
Date: Wed Jun 15 21:03:51 2016


Comment 34 by bugdroid1@chromium.org, Jun 17 2016


Comment 35 by bugdroid1@chromium.org, Jun 17 2016

Closing... please feel free to reopen if it's not fixed.
Status: Verified (was: Fixed)

Comment 38 by bugdroid1@chromium.org, Jul 8 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/29ccc32daffd412426e4a7392e82d0bd8eb74b26

commit 29ccc32daffd412426e4a7392e82d0bd8eb74b26
Author: Don Garrett <dgarrett@google.com>
Date: Fri Jun 10 23:15:22 2016

cbuildbot: Fix --repo-cache, again!

The tests for the --repo-cache logic didn't actually copy any files, and
so missed that shutil.copytree fails if the target directory already
exists.

So... fix the tests to really copy stuff (no more mock), and then fix
the logic to only copy the .repo directory, which does not exist at the
target, but is enough to avoid GoB traffic.

BUG= chromium:609886 
TEST=Unittests

Change-Id: I99e830fedb1347984de275e197bd5c4b2c1bd757
Previous-Reviewed-on: https://chromium-review.googlesource.com/351650
(cherry picked from commit 4ce156fd77485fc6e52432a552174f254580688a)
Reviewed-on: https://chromium-review.googlesource.com/359004
Reviewed-by: Don Garrett <dgarrett@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/29ccc32daffd412426e4a7392e82d0bd8eb74b26/cbuildbot/stages/sync_stages_unittest.py
[modify] https://crrev.com/29ccc32daffd412426e4a7392e82d0bd8eb74b26/cbuildbot/stages/sync_stages.py


Comment 39 by bugdroid1@chromium.org, Jul 8 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/a18878074e99d8b258308356fac9205d3e9e8e30

commit a18878074e99d8b258308356fac9205d3e9e8e30
Author: Don Garrett <dgarrett@google.com>
Date: Mon Jun 13 21:51:39 2016

repo-cache: Handle symlinks.

It seems that shutil.copytree doesn't handle symlinks the way we want it
to, which causes yet more problems. Extend the unittests to reproduce
the failure, fix the bug, and use the unittests to validate the fix.

BUG= chromium:609886 
TEST=run_tests

Change-Id: Iec2fe7515013f5e47cadf88a17a93d66cc0701ed
Previous-Reviewed-on: https://chromium-review.googlesource.com/352241
(cherry picked from commit f1578cacc1ca7587f9f3bbc2ffccd872aa7d8412)
Reviewed-on: https://chromium-review.googlesource.com/359089
Reviewed-by: Don Garrett <dgarrett@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>
Tested-by: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/a18878074e99d8b258308356fac9205d3e9e8e30/cbuildbot/stages/sync_stages_unittest.py
[modify] https://crrev.com/a18878074e99d8b258308356fac9205d3e9e8e30/cbuildbot/stages/sync_stages.py
