ChromeOS Swarming builds have no access to system packages

Issue description

The ChromeOS swarming builders are getting closer to production use, but need the same packages and credentials as the buildbot ChromeOS GCE builders before that can happen. These builders have hostnames of the form: swarm-cros-0

In this sample build: https://luci-milo.appspot.com/p/chromeos/builds/b8963355179379851840 we can see that neither the ts_mon package nor the sqlalchemy package is installed. During initial testing this wasn't important, but it is becoming more critical now.
,
Nov 10 2017
Actually, after logging in to a server and poking around, something else is going on that I don't understand. cbuildbot is complaining that it can't import sqlalchemy, but I can import it if I run python by hand. Could this be related to a python environment set up by swarming?
,
Nov 10 2017
When run through swarming, cbuildbot was unable to set up ts_mon or import sqlalchemy. I assumed that meant they weren't present, and so thought it was a puppet bug, since I'd just reinstanced the builders. Actually, they exist as system packages, and the build works if run by hand, but fails when run via swarming. I believe we are being sandboxed more effectively than we used to be.
,
Nov 10 2017
Note, I also made a shift to the recipe right before this started, so it could equally well be to blame. CL:*502499 should let me do builds with either recipe to see if that's the issue.
,
Nov 11 2017
The section at https://docs.google.com/document/d/1dGXRvz1QJh-tNWppG5xElckG25ag0rQw7JJ5F8zOKyc/edit#heading=h.f8j28w1nevn covers this bug. Basically, any python script that has dependencies besides the stdlib must declare them explicitly and must be executed using vpython instead of python.
It seems that chromite is executed via python? https://cs.chromium.org/chromium/src/third_party/chromite/scripts/wrapper.py?q=wrapper.py+python&dr=C&l=1
See how chromium declares its deps: https://chromium.googlesource.com/chromium/src/+/master/.vpython
vpython docs: go/vpython
,
Nov 11 2017
A) Is this intentionally not enforced on buildbot?
B) When did this start being enforced for the swarming builders? Was there no announcement?
C) This is a HUGE change (easily months of work). We invoke a large number of external scripts, all of which would need conversion, some of which have conflicting package requirements, and many of which run in a wide variety of environments, some of them without depot tools, CIPD, or network access (on a test DUT, for example). We've looked at vpython in the past, and decided not to use it for these reasons.
,
Nov 11 2017
,
Nov 11 2017
Nodir, can we turn this off for ChromeOS? Maybe they can install packages on bots for now?
,
Nov 11 2017
FWIU this is happening because $PATH/python for LUCI build process tree is not system python, but
/b/swarming/w/ir/cipd_bin_packages/bin/python
which is installed from CIPD package infra/python/cpython/${platform}
which is built by recipe
https://cs.chromium.org/chromium/infra/recipes/recipe_modules/third_party_packages
which took a lot of Dan's time to implement.
I will need to think a bit how to solve this problem without introducing too much tech debt for us.
,
Nov 11 2017
,
Nov 11 2017
to be clear, this is configured in a global config https://chrome-internal.googlesource.com/infradata/config/+/master/configs/cr-buildbucket/swarming_task_template.json so I will need to think how to disable it only for CrOS without introducing too much code and increasing API surface
,
Nov 13 2017
Thanks! I will say that Dan was encouraging us to use vpython, but was aware that we weren't, and why.
,
Nov 13 2017
Would it be sufficient for the recipe to set the PATH when invoking cbuildbot_launch? All of the scripts under our control are run directly or indirectly by it, and there is no reason not to use vpython for recipe purposes.
,
Nov 13 2017
,
Nov 13 2017
A couple things:
* We're actively working to deprecate installation of python packages via puppet directives. This functionality will go away, hopefully mid-year next year. The reason for this is so that tasks can be expressed hermetically/reproducibly, and so we can make the underlying system configuration homogeneous.
* vpython is the supported solution to the problem of preparing groups of python dependencies for scripts.
* vpython is currently in the place where it knows how to stand-in for "python" in $PATH, which would allow scripts invoking 'python' to transparently get vpython instead. I'd be happy to work with you to make sure this is W.A.I.
That would be the supported path; symlinking vpython -> python and then ensuring that there's a vpython spec that your scripts can use. I think that we can add this as a feature of kitchen (and then make this the only behavior eventually). I'll work on doing this today.
If you want to venture down the unsupported path (which will likely break/degrade without notice), recipes do allow you to override environment variables (e.g. PATH) when running steps:
    with api.context.env(PATH="override entire $PATH variable"):
        api.step(...) # will override $PATH
OR
    with api.context.env_prefixes(PATH=["path to prepend to $PATH"]):
        api.step(...)
Which would be fine as a temporary stepping stone towards actually fixing task hermeticity, but if it breaks it would be on you to find a workaround (or to switch to vpython).
Buildbot never enforced any kind of sandboxing, but swarmbucket has always attempted to do so (to varying degrees of effectiveness), in order to make sure that LUCI builds are actually maintainable vs. buildbot. Kitchen/recipes have sandboxed python execution on swarming for quite a while now, and fixing task hermeticity is one of the todos when switching from buildbot recipes -> swarming recipes.
,
Nov 13 2017
,
Nov 14 2017
,
Nov 14 2017
As an alternative, can you simply check in the python packages into a repo somewhere and DEPS them in (and then include them in the PYTHONPATH)? That is a perfectly reasonable alternative to using vpython everywhere.
,
Nov 14 2017
dgarret, yeah, it might be that iannucci wants to pursue uniform usage of vpython everywhere. Do I understand correctly that if
* $PATH/python is actually vpython
* the swarming task process has env variables that specify the list of deps that CrOS needs
then this problem is solved?
,
Nov 14 2017
We're already about 6 months behind the initial time estimates for shutting down a waterfall, and it's time to prove if it really works for us or not. I don't want to be blocked on anything new. I started to put on my engineer hat and point out why #18,#19 might/might not work (I don't know much about DEPS, so hard to be sure), or suggest different solutions (like an LXC container). I'm totally willing to accept vpython cleanup as future work, and even set a bounding date for completion. But I don't want to block swarming builds for ChromeOS on it. And I don't think it's reasonable to accept new requirements like this with no warning. akeshet and dpranke have a meeting later today, I'm hoping they talk about how to communicate this kind of change so it doesn't come as a surprise after the fact. I suspect there will be plenty of similar issues in the future, such as the fact that we use sudo, or the way credentials are distributed to builders.
,
Nov 14 2017
(aside: if your python packages are pure-python, then yes, you can also vendor/DEPS them in)
,
Nov 14 2017
Like I said: you can completely override PATH in the recipes. But it's definitely a break-glass situation.
,
Nov 14 2017
Can we get support for it for a limited time frame? IE: Until we have a cleaner solution in place? We already modify the PATH to use a pinned version of depot_tools (I think dnj@ suggested that originally). https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromite/api.py?rcl=c44ab2763b7843f7c9fbf55f2ef9a7a365ce0a4d&l=326
,
Nov 14 2017
To be clear: we should do whatever we need to do to not get blocked on moving to vpython. There is a longer-term conversation about vpython, containers, other sorts of hermeticity, etc., but that's not this. Whether or not you call this a "break-glass" solution is up to you ...
,
Nov 15 2017
I think somehow I've confused things here. Let me try to restate my understanding of the situation and the options.

My understanding of the situation is that, more-or-less, the chromeos recipes depend on a few python packages (e.g., ts_mon, sqlalchemy). That is fine. There are (at least) two copies of python on the system, the system python, and the hermetic copy that swarming uses by default. That is also fine. The CrOS folks (may?) have asked to get the needed packages installed onto their bots via Puppet; however, that would by default only make them available to the system python, and not the swarming python.

In #c15, iannucci@ suggests that we can modify the recipe to override the PATH env var to explicitly pick up the system python for these scripts, instead of the swarming python. Alternatively, we can override the PYTHONPATH env var to still use the swarming python, but pick up the system packages.

In #c18, I attempted, in a confusing way, to suggest that a third option would be to DEPS in the needed python packages, and just pick them up in your scripts so that you don't need to worry about picking up the system packages at all. However, we would only suggest you do that if they were pure-python; dealing with binary packages that way would be too much of a hassle.

So, any of those three options are fine and we'll support them, more-or-less-happily :). If possible, the DEPS version is probably best, since that way you don't need to worry about puppet configs, and you have more control over things. But, if that's not possible, or if you'd prefer to use the system packages now to unblock things, and look into moving to DEPS later, that's fine, too.

Hopefully that clarifies things; if not let me know. dgarrett@, I'll bounce this to you for now?
,
Nov 15 2017
PYTHONPATH won't work with vpython. This is intended.
Given the files
    a/__init__.py
    b/b.py
this works:
    PYTHONPATH=. python b/b.py
this fails:
    PYTHONPATH=. vpython b/b.py
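A minimal sketch of the layout described above, for anyone who wants to reproduce the difference locally. It assumes that b/b.py imports the package a, which the comment does not state explicitly. vpython itself is not invoked here: the `-E` interpreter flag (which makes CPython ignore PYTHON* environment variables) is used only as a stand-in for "an interpreter that does not honour PYTHONPATH", and is not a claim about how vpython implements this.

```python
import os
import subprocess
import sys
import tempfile


def run_layout(extra_flags):
    """Create a/__init__.py and b/b.py, then run b/b.py with PYTHONPATH=."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "a"))
        os.makedirs(os.path.join(root, "b"))
        open(os.path.join(root, "a", "__init__.py"), "w").close()
        # Assumption: b.py depends on the sibling package "a".
        with open(os.path.join(root, "b", "b.py"), "w") as f:
            f.write("import a\n")
        env = dict(os.environ, PYTHONPATH=".")
        cmd = [sys.executable, *extra_flags, os.path.join("b", "b.py")]
        result = subprocess.run(cmd, cwd=root, env=env, capture_output=True)
        return result.returncode


# Plain interpreter: PYTHONPATH=. puts the temp root on sys.path, so "a" imports.
honours_pythonpath = run_layout([])
# Stand-in for an interpreter that ignores PYTHONPATH: only b/ is on sys.path.
ignores_pythonpath = run_layout(["-E"])
```

With the plain interpreter the script exits with status 0; with PYTHONPATH ignored, the import of a fails and the exit status is non-zero.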
,
Nov 15 2017
How exactly do DEPS work? What ENVs or file locations do they use? We play some funny games that could easily break.
,
Nov 15 2017
The DEPS approach with PYTHONPATH won't work, see c#26. The options I see:
1) use system python
2) iannucci finishes his work (almost done) to make $PATH/python the vpython, and add a .vpython file to the chromeos repos that specifies the list of deps
,
Nov 15 2017
To use 2, I really need to understand how DEPS work, and where they would be stored. We play some very funny games with both ENVs and files. This includes deleting things rather freely from some locations, creating root owned files, modifying paths with a mix of loopback and bind mounted paths, and filtering non-whitelisted ENVs both when using sudo, and when entering the chroot. My fears are that we wipe the DEPs during some cleanup step, or modify the path at which they might be found, causing confusion. There are a number of points during which they might not be able to be recreated. A recent example was during ebuild emerges, which restrict file writes to their working directories to make sure that compiles don't modify the host system.
,
Nov 16 2017
Let's get unblocked, do #1, use system python. (other options have long term promise, but clearly require investigation). Bouncing back to nodir@. Sounds like in #9 you may have had some idea how to accomplish this (recipe change?).
,
Nov 16 2017
,
Nov 16 2017
I started this CL to produce a new recipe for ChromeOS swarming builds, but it's blocked on a couple of steps because I don't see how to use config_lib.Paths very well. https://chromium-review.googlesource.com/c/chromium/tools/build/+/770496
,
Nov 16 2017
yes, dgarret is on the right path, https://chromium-review.googlesource.com/c/chromium/tools/build/+/770496 is what i had in mind
,
Nov 16 2017
I think we've achieved a consensus and dgarret@ started the actual work https://chromium-review.googlesource.com/c/chromium/tools/build/+/770496
,
Nov 17 2017
The following revision refers to this bug: https://chromium.googlesource.com/infra/luci/recipes-py/+/ee90ad121c9e96935792ee2276fecc5139ddccc0

commit ee90ad121c9e96935792ee2276fecc5139ddccc0
Author: Don Garrett <dgarrett@google.com>
Date: Fri Nov 17 21:52:33 2017

Create file/api.symlink.

Add a new method for creating symlinks inside recipes.

BUG= chromium:783517
Change-Id: I43bf6263fe2a6f1a270216d610032803e138af01
Reviewed-on: https://chromium-review.googlesource.com/776077
Commit-Queue: Don Garrett <dgarrett@chromium.org>
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Reviewed-by: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/ee90ad121c9e96935792ee2276fecc5139ddccc0/recipe_modules/file/resources/fileutil.py
[modify] https://crrev.com/ee90ad121c9e96935792ee2276fecc5139ddccc0/README.recipes.md
[add] https://crrev.com/ee90ad121c9e96935792ee2276fecc5139ddccc0/recipe_modules/file/examples/symlink.py
[modify] https://crrev.com/ee90ad121c9e96935792ee2276fecc5139ddccc0/recipe_modules/file/api.py
[add] https://crrev.com/ee90ad121c9e96935792ee2276fecc5139ddccc0/recipe_modules/file/examples/symlink.expected/basic.json
,
Dec 5 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/2790c36b4926e9f230fc27300ff90124d54d6f58

commit 2790c36b4926e9f230fc27300ff90124d54d6f58
Author: Don Garrett <dgarrett@google.com>
Date: Tue Dec 05 02:27:20 2017

Create cros/swarming recipe.

Create a new recipe for ChromeOS swarming builds. Recipe ignores per-waterfall configuration, and instead uses only buildbucket properies to build the cbuildbot_launch command line. It also creates a symlink to /usr/bin/python in a new directory which is inserted into the beginning of PATH as a way to run cbuildbot without vpython (for now).

BUG= chromium:783517
Change-Id: I87e71db2f24e436872635f8da16c413e08f78f95
Reviewed-on: https://chromium-review.googlesource.com/770496
Reviewed-by: Nodir Turakulov <nodir@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>

[add] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/swarming.py
[add] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/swarming.expected/swarming_builder.json
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipe_modules/chromite/api.py
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/cbuildbot.expected/chromiumos_paladin.json
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/cbuildbot_tryjob.py
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/README.recipes.md
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipe_modules/chromite/__init__.py
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/cbuildbot.expected/chromiumos_paladin_manifest_failure.json
,
Dec 5 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/manifest-internal/+/63f7c0d0d113e974723497ef54fe8ac5603042be commit 63f7c0d0d113e974723497ef54fe8ac5603042be Author: Don Garrett <dgarrett@google.com> Date: Tue Dec 05 02:32:04 2017
,
Dec 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/0b611752f13ea4c8339ed336151fd1e10cfc9808

commit 0b611752f13ea4c8339ed336151fd1e10cfc9808
Author: Don Garrett <dgarrett@google.com>
Date: Wed Dec 06 21:37:55 2017

cros/swarming recipe: Symlink for python2.

We previously added a special symlink to let ChromeOS swarming builds use the system python binary, but most of our scripts actually use python2. So... add a python2 to the bypass.

BUG= 783517
Change-Id: I6a69666496cfaa9a515fe85c3ced17c8aa3cd54e
Reviewed-on: https://chromium-review.googlesource.com/811429
Reviewed-by: Nodir Turakulov <nodir@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/0b611752f13ea4c8339ed336151fd1e10cfc9808/scripts/slave/README.recipes.md
[modify] https://crrev.com/0b611752f13ea4c8339ed336151fd1e10cfc9808/scripts/slave/recipes/cros/swarming.expected/swarming_builder.json
[modify] https://crrev.com/0b611752f13ea4c8339ed336151fd1e10cfc9808/scripts/slave/recipe_modules/chromite/api.py
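The bypass described in the two recipe commits (symlinks named python and python2 in a fresh directory that is put at the front of PATH) can be sketched outside the recipe engine in plain Python. This is only an illustration of the technique, not the recipe's code: the helper names are hypothetical, and `sys.executable` stands in for /usr/bin/python so that the sketch runs on any machine.

```python
import os
import shutil
import sys
import tempfile


def make_python_bypass(bypass_dir, target, names=("python", "python2")):
    """Create symlinks called `names` inside bypass_dir, all pointing at target."""
    os.makedirs(bypass_dir, exist_ok=True)
    for name in names:
        link = os.path.join(bypass_dir, name)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link from an earlier run
        os.symlink(target, link)
    return bypass_dir


def prepend_to_path(directory, path):
    """Return `path` with `directory` inserted as the first entry."""
    return directory + os.pathsep + path if path else directory


workdir = tempfile.mkdtemp()
# Stand-in for /usr/bin/python; any executable interpreter path works here.
bypass = make_python_bypass(os.path.join(workdir, "python_bin"), sys.executable)
new_path = prepend_to_path(bypass, os.environ.get("PATH", ""))

# A lookup against the modified PATH now finds the symlinks first.
resolved_python = shutil.which("python", path=new_path)
resolved_python2 = shutil.which("python2", path=new_path)
```

Because PATH lookup stops at the first match, this only works while nothing else is prepended ahead of the bypass directory afterwards.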
,
Dec 11 2017
The latest updates still aren't working, and I'm not yet sure why: https://luci-milo.appspot.com/p/chromeos/builds/b8960845419119264192

VPYTHON_VIRTUALENV_ROOT: /b/swarming/w/ir/cache/vpython
/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch: could not import chromite module: /b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch.py: No module named contrib
Traceback (most recent call last):
  File "/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch", line 169, in <module>
    DoMain()
  File "/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch", line 165, in DoMain
    commandline.ScriptWrapperMain(FindTarget)
  File "/b/swarming/w/ir/kitchen-workdir/chromite/lib/commandline.py", line 890, in ScriptWrapperMain
    target = find_target_func(target)
  File "/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch", line 140, in FindTarget
    module = cros_import.ImportModule(target)
  File "/b/swarming/w/ir/kitchen-workdir/chromite/lib/cros_import.py", line 44, in ImportModule
    module = __import__(target)
  File "/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch.py", line 23, in <module>
    from chromite.cbuildbot.stages import sync_stages
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/stages/sync_stages.py", line 23, in <module>
    from chromite.cbuildbot import lkgm_manager
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/lkgm_manager.py", line 19, in <module>
    from chromite.cbuildbot import manifest_version
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/manifest_version.py", line 19, in <module>
    from chromite.cbuildbot import build_status
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/build_status.py", line 13, in <module>
    from chromite.cbuildbot import relevant_changes
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/relevant_changes.py", line 12, in <module>
    from chromite.cbuildbot.stages import artifact_stages
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/stages/artifact_stages.py", line 17, in <module>
    from chromite.cbuildbot import commands
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/commands.py", line 30, in <module>
    from chromite.lib import gob_util
  File "/b/swarming/w/ir/kitchen-workdir/chromite/lib/gob_util.py", line 33, in <module>
    from oauth2client.contrib import gce
ImportError: No module named contrib
step returned non-zero exit code: 1
,
Dec 11 2017
,
Dec 11 2017
Comparing https://github.com/google/oauth2client/tree/v1.5.2/oauth2client and https://github.com/google/oauth2client/tree/v2.0.0/oauth2client implies that the calling code expects at least v2 (current version is v4.1.2) but the version available is v1.5.2 or lower.
,
Dec 11 2017
Available from where? "chromite/third_party/oauth2client" is part of the chromite checkout, and should have been available for import via the path manipulation done in chromite/scripts/wrapper.py. I don't yet understand why/how that would be broken.
,
Dec 11 2017
Most likely import "from oauth2client import gce" failed with ImportError due to some dependency of https://cs.chromium.org/chromium/src/third_party/chromite/third_party/oauth2client/gce.py being missing, and gob_util.py then fell back to 'contrib' and failed there too. If I have to guess, my guess is 'six' is missing.
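The hypothesis in this comment is that a try/except import fallback hides the first failure: only the error from the last attempt reaches the traceback. A small sketch of a fallback that keeps every error is below. It does not reproduce the actual code in gob_util.py, which is not quoted in this thread; the helper name and the module names in the demo are hypothetical.

```python
import importlib


def import_first_available(candidates):
    """Try module names in order; if all fail, report every error, not just the last."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Keep the reason for each failed attempt so the root cause
            # (for example a missing transitive dependency) stays visible.
            errors.append("%s: %s" % (name, exc))
    raise ImportError("; ".join(errors))


# Demo with hypothetical module names that do not exist.
try:
    import_first_available(["missing_pkg_one.gce", "missing_pkg_two.contrib.gce"])
except ImportError as exc:
    combined_message = str(exc)

# When an earlier candidate fails but a later one exists, the later one is returned.
fallback_module = import_first_available(["missing_pkg_one.gce", "json"])
```

With a message like this, a failure such as "No module named contrib" would appear next to whatever error stopped the first import, which is the piece of information missing from the traceback above.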
,
Dec 12 2017
For debugging purposes, I can log into the builder, clone chromite, and run "cbuildbot_launch --help" without error. That means that everything necessary exists on the builder, when not inside the recipe environment.
,
Dec 13 2017
Although the recipe inserts system path into $PATH, I see that recipe engine still prepends a dir with non-system python:

PATH: /b/swarming/w/ir/cipd_bin_packages:/b/swarming/w/ir/cipd_bin_packages/bin:/b/swarming/w/ir/kitchen-workdir/depot_tools:/b/swarming/w/ir/kitchen-workdir/python_bin:/b/swarming/w/ir/cipd_bin_packages:/b/swarming/w/ir/cipd_bin_packages/bin:/b/swarming/cipd_cache/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Note the first two entries. FWIU this intrusive behavior was removed in https://chromium-review.googlesource.com/c/infra/infra/+/809624/4/go/src/infra/tools/kitchen/cook.go but wasn't rolled into production yet. FWIU iannucci intends to do it very soon.
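To check which python wins on a PATH like the one above, it helps to list every match in lookup order rather than only the first. The sketch below is a generic helper, not part of kitchen or the recipe engine; the function names are hypothetical, and the demo builds throwaway directories whose names merely echo two of the entries in the PATH shown.

```python
import os
import stat
import tempfile


def find_all_on_path(name, path):
    """Return every executable called `name` on `path`, in lookup order."""
    hits = []
    seen = set()
    for entry in path.split(os.pathsep):
        if not entry or entry in seen:
            continue  # skip empty and repeated entries
        seen.add(entry)
        candidate = os.path.join(entry, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits


def make_fake_executable(directory, name):
    """Create a tiny executable file so the demo does not depend on the host."""
    os.makedirs(directory, exist_ok=True)
    target = os.path.join(directory, name)
    with open(target, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(target, os.stat(target).st_mode | stat.S_IXUSR)
    return target


root = tempfile.mkdtemp()
cipd_python = make_fake_executable(os.path.join(root, "cipd_bin_packages"), "python")
bypass_python = make_fake_executable(os.path.join(root, "python_bin"), "python")

# Same shape as the reported PATH: the CIPD dir comes first and is repeated later.
search_path = os.pathsep.join([
    os.path.dirname(cipd_python),
    os.path.dirname(bypass_python),
    os.path.dirname(cipd_python),
])
hits = find_all_on_path("python", search_path)
```

The first element of `hits` is the interpreter a bare `python` invocation would get, so in this layout the bypass directory is shadowed by the entry in front of it.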
,
Dec 13 2017
,
Dec 13 2017
Robbie's change has hit production, and swarming builds again work for ChromeOS. https://luci-milo.appspot.com/p/chromeos/builds/b8960289718183795968
,
Dec 13 2017
Thanks!
,
Dec 13 2017
Awesome, great to hear.
Comment 1 by dgarr...@chromium.org
, Nov 10 2017