Seed cold repo cache from preload
Issue description

The CrOS GCE builder image now has /preload/chromeos seeded with a snapshot of the source tree at build time. We should use this when the named cache is empty. It's awkward (and possibly undesirable) to do this in the recipe itself; perhaps it would be better to do in bot_config.py?
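Concretely, the seeding step could look something like the sketch below. This is only an illustration of the idea, not actual bot_config.py code: the function name, the cache-directory argument, and the exact no-op conditions are all assumptions.

```python
import os
import shutil

# Path baked into the GCE builder image at build time (per the issue
# description); everything else here is a hypothetical example.
PRELOAD_DIR = '/preload/chromeos'

def seed_cache_from_preload(cache_dir, preload_dir=PRELOAD_DIR):
    """Copy the build-time source snapshot into an empty named cache.

    Does nothing if the preload is absent or the cache already has
    contents (a warm cache must be left alone).
    """
    if not os.path.isdir(preload_dir):
        return False
    if os.path.isdir(cache_dir) and os.listdir(cache_dir):
        return False  # cache is warm; leave it alone
    shutil.copytree(preload_dir, cache_dir, dirs_exist_ok=True)
    return True
```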
Dec 10
Just a few notes on some of the reasoning behind the current implementation. We couldn't automatically stage the checkout into the cache, as that could directly impact Swarming; instead we stage to the /preload directory and instruct the repo library to check for its existence on initial execution. Ideally this step would take place while the bot updates via the Swarming implementation, i.e. before the bot ever begins handling tasks.

The copy of the checkout still takes between 6 and 15 minutes (the underlying throughput varies wildly). That is still a 75% gain, but ideally we could eliminate some of the variation and overhead. The use of overlayfs with a bind mount may aid in speeding up this process, but we cannot utilize the checkout as rw; corruption of the checkout/cache is common, and we'd lose 40 minutes in the event of corruption. I did not invest any time in overlayfs, so options here are wide open. I did test some alternatives to the standard copy, but none provided speed improvements that would justify a change in implementation. -- Mike
Dec 10
The source (and SDK?) will be synced each run in an OverlayFS RW layer, which is thrown away every time, right? Why don't we just use this cache dir as the base layer in the OverlayFS (i.e. no need to copy it)? The only question then is: what happens when, inevitably, a bot is missing the cache and we re-pull the entire repo from GOB during the sync every run?
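For reference, the overlayfs arrangement being floated here would look roughly like the sketch below: the preloaded checkout is the read-only lower layer, writes go to a throwaway upper layer. All directory paths and the helper name are hypothetical; the real mount would be done by bot setup, not shown here.

```python
# Build the mount(8) invocation for an overlayfs whose read-only lower
# layer is the preloaded checkout. Writes land in `upper`, the preload
# is never modified, and `merged` is the rw view a task would use
# (and that gets thrown away each run).
def overlay_mount_cmd(lower, upper, work, merged):
    return [
        'mount', '-t', 'overlay', 'overlay',
        '-o', 'lowerdir=%s,upperdir=%s,workdir=%s' % (lower, upper, work),
        merged,
    ]

# Example invocation with made-up upper/work/merged paths:
cmd = overlay_mount_cmd('/preload/chromeos', '/tmp/upper', '/tmp/work',
                        '/b/checkout')
```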
Dec 10
We should already be using the /preload directory if it's present: https://crrev.com/c/1338510. There are more things we could do to improve performance, but this is probably all we should try until we are ready to start using overlayfs, and doing that will involve significantly more management planning.
Dec 10
Sorry Mike, race condition in posting there; same question. Another meta-question: how much effort do we want to put into all of this? We are re-creating a bespoke Docker here. Swarming bots already have Docker; do we want to look into letting it manage all the overlayFS/GC junk for us this cycle?
Dec 10
Don: This is for the recipes implementation.
Dec 10
Hey Alec, a few comments.

Regarding comment #3: we hope to avoid ever having a bot missing the cache, since it is part of image creation; the image fails to create if the checkout fails, so we only have to fear our own process corrupting or removing the cache. Today we do not have the GOB quota to handle the bots checking out directly, so we either preload it or generate a nightly checkout and store it in GS.

Regarding comment #5: I would go so far as to say the current solution is far from perfect. It is a solution that fit within the current structure without introducing too many variables. That said, I'm not sure where Docker fits into the discussion or how Swarming's implementation potentially solves the problem. There is a wide range of solutions to ease this process, some of which could involve Docker, but without knowing how the recipe implementation plans to proceed, I'm afraid I can't offer much in terms of a Docker solution. I do feel this is a problem that should be kept relatively simple; faster is better, obviously, but complexity can quickly outweigh those gains. -- Mike
Comment 1 by jclinton@chromium.org, Dec 10