
Issue 783517

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 784597




ChromeOS Swarming builds have no access to system packages

Project Member Reported by dgarr...@chromium.org, Nov 10 2017

Issue description

The ChromeOS swarming builders are getting closer to production use, but need the same packages and credentials as buildbot ChromeOS GCE builders before that can happen.

These builders have hostnames of the form: swarm-cros-0

In this sample build:
  https://luci-milo.appspot.com/p/chromeos/builds/b8963355179379851840

We can see that neither the ts_mon package nor the sqlalchemy package is installed. This wasn't important during initial testing, but it is becoming critical now.

 
Those builders were working, but broke when I reinstanced them.

I'm presuming that there has been some regression around Puppet configuration. Since they've been around for a long time without being reinstanced, this could have been months ago.

I'm looking into the Puppet configs myself, but am deeply ignorant about how they work.
Actually, after logging in to a server and poking around, something else is going on that I don't understand.

cbuildbot is complaining that it can't import sqlalchemy, but I can do it, if I run python by hand.

Could this be related to a python environment setup by swarming?
Cc: no...@chromium.org mar...@chromium.org hinoka@chromium.org
Summary: ChromeOS Swarming builds have no access to system packages (was: ChromeOS Swarming Bots need puppet love.)
When run through swarming, cbuildbot was unable to set up ts_mon or import sqlalchemy.

I assumed that meant they weren't present, and so thought it was a puppet bug, since I'd just reinstanced the builders.

Actually, they exist as system packages, and the build works if run by hand, but fails when run via swarming. I believe we are being sandboxed more effectively than we used to be.

Labels: -Pri-3 Pri-1
Note, I also changed the recipe right before this started, so that could equally well be to blame.

CL:*502499 should let me do builds with either recipe to see if that's the issue.

Comment 5 by no...@chromium.org, Nov 11 2017

https://docs.google.com/document/d/1dGXRvz1QJh-tNWppG5xElckG25ag0rQw7JJ5F8zOKyc/edit#heading=h.f8j28w1nevn
That section covers this bug.

Basically, any Python script that has dependencies besides the stdlib must declare them explicitly and must be executed using vpython instead of python. It seems that chromite is executed via python?
https://cs.chromium.org/chromium/src/third_party/chromite/scripts/wrapper.py?q=wrapper.py+python&dr=C&l=1

see how chromium declares its deps:
https://chromium.googlesource.com/chromium/src/+/master/.vpython


vpython docs: go/vpython
A) Is this intentionally not enforced on buildbot?
B) When did this start being enforced for the swarming builders? Was there no announcement?
C) This is a HUGE change (easily months of work)

We invoke a large number of external scripts, all of which would need conversion. Some have conflicting package requirements, and many run in a wide variety of environments, some of them without depot tools, CIPD, or network access (on a test DUT, for example).

We've looked at vpython in the past, and decided not to use it for these reasons.


Cc: akes...@chromium.org ayatane@chromium.org

Comment 8 by estaab@chromium.org, Nov 11 2017

Components: -Infra Infra>Platform
Nodir, can we turn this off for ChromeOS? Maybe they can install packages on bots for now?

Comment 9 by no...@chromium.org, Nov 11 2017

Cc: d...@chromium.org
FWIU this is happening because $PATH/python for LUCI build process tree is not system python, but
/b/swarming/w/ir/cipd_bin_packages/bin/python
which is installed from CIPD package infra/python/cpython/${platform}
which is built by recipe
https://cs.chromium.org/chromium/infra/recipes/recipe_modules/third_party_packages
which took a lot of Dan's time to implement.

I will need to think a bit how to solve this problem without introducing too much tech debt for us.


Comment 10 by no...@chromium.org, Nov 11 2017

Owner: no...@chromium.org
Status: Assigned (was: Untriaged)

Comment 11 by no...@chromium.org, Nov 11 2017

to be clear, this is configured in a global config https://chrome-internal.googlesource.com/infradata/config/+/master/configs/cr-buildbucket/swarming_task_template.json so I will need to think how to disable it only for CrOS without introducing too much code and increasing API surface
Thanks!

I will say that Dan was encouraging us to use vpython, but was aware that we weren't, and why.

Would it be sufficient for the recipe to set the PATH when invoking cbuildbot_launch?

All of the scripts under our control are run directly or indirectly by it, and there is no reason not to use vpython for recipe purposes.

Comment 14 by no...@chromium.org, Nov 13 2017

Cc: iannucci@chromium.org
A couple things:
  * We're actively working to deprecate installation of python packages via puppet directives. This functionality will go away, hopefully by the middle of next year. The reason for this is so that tasks can be expressed hermetically/reproducibly, and so we can make the underlying system configuration homogeneous.
  * vpython is the supported solution to the problem of preparing groups of python dependencies for scripts.
  * vpython now knows how to stand in for "python" in $PATH, which would allow scripts invoking 'python' to transparently get vpython instead. I'd be happy to work with you to make sure this is W.A.I.

That would be the supported path; symlinking vpython -> python and then ensuring that there's a vpython spec that your scripts can use. I think that we can add this as a feature of kitchen (and then make this the only behavior eventually). I'll work on doing this today.

If you want to venture down the unsupported path (which will likely break/degrade without notice), recipes do allow you to override environment variables (e.g. PATH) when running steps:

  with api.context.env(PATH="override entire $PATH variable"):
    api.step(...)  # will override $PATH

  OR
  
  with api.context.env_prefixes(PATH=["path to prepend to $PATH"]):
    api.step(...)

Which would be fine as a temporary stepping stone towards actually fixing task hermeticity, but if it breaks it would be on you to find a workaround (or to switch to vpython).
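
To make the difference between the two forms concrete, here is a minimal plain-Python sketch that does not use the recipe engine at all. The helpers `override_env` and `prefix_env` are hypothetical stand-ins written for illustration; they only imitate the "replace the whole variable" and "prepend to the variable" semantics described above, and are not the recipe API.

```python
import contextlib
import os


def _restore(saved):
    """Put every saved variable back, removing ones that did not exist before."""
    for name, value in saved.items():
        if value is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = value


@contextlib.contextmanager
def override_env(**overrides):
    """Hypothetical stand-in: replace each named variable inside the block."""
    saved = {name: os.environ.get(name) for name in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        _restore(saved)


@contextlib.contextmanager
def prefix_env(**prefixes):
    """Hypothetical stand-in: prepend directories to each named path variable."""
    saved = {name: os.environ.get(name) for name in prefixes}
    for name, dirs in prefixes.items():
        parts = list(dirs)
        if saved[name]:
            parts.append(saved[name])  # keep the old value after the new dirs
        os.environ[name] = os.pathsep.join(parts)
    try:
        yield
    finally:
        _restore(saved)
```

With the prepend form the original entries stay reachable after the new ones, while the override form discards them entirely, which is why the override is the riskier of the two.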

Buildbot never enforced any kind of sandboxing, but swarmbucket has always attempted to do so (to varying degrees of effectiveness), in order to make sure that LUCI builds are actually maintainable vs. buildbot. Kitchen/recipes have sandboxed python execution on swarming for quite a while now, and fixing task hermeticity is one of the todos when switching from buildbot recipes -> swarming recipes.
Blockedon: 784597
Cc: dpranke@chromium.org
As an alternative, can you simply check in the python packages into a repo somewhere and DEPS them in (and then include them in the PYTHONPATH)? That is a perfectly reasonable alternative to using vpython everywhere.

Comment 19 by no...@chromium.org, Nov 14 2017

dgarrett, yeah it might be

iannucci wants to pursue uniform usage of vpython everywhere. Do I understand correctly that if

* $PATH/python is actually vpython
* the swarming task process has env variables that specify the list of deps that CrOS needs

then this problem is solved?

We're already about 6 months behind the initial time estimates for shutting down a waterfall, and it's time to prove if it really works for us or not. I don't want to be blocked on anything new.

I started to put on my engineer hat and point out why #18,#19 might/might not work (I don't know much about DEPS, so hard to be sure), or suggest different solutions (like an LXC container). I'm totally willing to accept vpython cleanup as future work, and even set a bounding date for completion.

But I don't want to block swarming builds for ChromeOS on it. And I don't think it's reasonable to accept new requirements like this with no warning.

akeshet and dpranke have a meeting later today, I'm hoping they talk about how to communicate this kind of change so it doesn't come as a surprise after the fact.

I suspect there will be plenty of similar issues in the future, such as the fact that we use sudo, or the way credentials are distributed to builders.

(aside: if your python packages are pure-python, then yes, you can also vendor/DEPS them in)
Like I said: you can completely override PATH in the recipes. But it's definitely a break-glass situation.
Can we get support for it for a limited time frame, i.e., until we have a cleaner solution in place?

We already modify the PATH to use a pinned version of depot_tools (I think dnj@ suggested that originally).

https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/chromite/api.py?rcl=c44ab2763b7843f7c9fbf55f2ef9a7a365ce0a4d&l=326
To be clear: we should do whatever we need to do to not get blocked on moving to vpython. There is a longer-term conversation about vpython, containers, other sorts of hermeticity, etc., but that's not this.

Whether or not you call this a "break-glass" solution is up to you ...
Owner: dgarr...@chromium.org
I think somehow I've confused things here. Let me try to restate my understanding of the situation and the options.

My understanding of the situation is that, more-or-less, the chromeos recipes depend on a few python packages (e.g., ts_mon, sqlalchemy). That is fine.

There are (at least) two copies of python on the system, the system python, and the hermetic copy that swarming uses by default. That is also fine.

The CrOS folks (may?) have asked to get the needed packages installed onto their bots via Puppet; however, that would by default only make them available to the system python, and not the swarming python.

In #c15, iannucci@ suggests that we can modify the recipe to override the PATH env var to explicitly pick up the system python for these scripts, instead of the swarming python. Alternatively, we can override the PYTHONPATH env var to still use the swarming python, but pick up the system packages.

In #c18, I attempted, in a confusing way, to suggest that a third option would be to DEPS in the needed python packages, and just pick them up in your scripts so that you don't need to worry about picking up the system packages at all. However, we would only suggest you do that if they were pure-python; dealing with binary packages that way would be too much of a hassle.

So, any of those three options are fine and we'll support them, more-or-less-happily :). If possible, the DEPS version is probably best, since that way you don't need to worry about puppet configs, and you have more control over things. But, if that's not possible, or if you'd prefer to use the system packages now to unblock things, and look into moving to DEPS later, that's fine, too.

Hopefully that clarifies things; if not let me know. dgarrett@, I'll bounce this to you for now?


Comment 26 by no...@chromium.org, Nov 15 2017

PYTHONPATH won't work with vpython. This is intended.

given
  a/__init__.py
  b/b.py

this works:
  PYTHONPATH=. python b/b.py
this fails:
  PYTHONPATH=. vpython b/b.py
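
The layout above can be reproduced with a small script. This is a sketch under one loud assumption: vpython itself is not invoked. Instead, the interpreter's `-E` flag (which makes CPython ignore `PYTHONPATH`) stands in for "an interpreter that does not honor PYTHONPATH". That is an illustration of the symptom only, not a claim about how vpython implements it. The body of `b/b.py` (`import a`) is also assumed, since the comment does not show it.

```python
import os
import subprocess
import sys
import tempfile


def run_b(extra_flags):
    """Create a/__init__.py and b/b.py, then run b/b.py with PYTHONPATH=. set."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "a"))
        os.makedirs(os.path.join(root, "b"))
        with open(os.path.join(root, "a", "__init__.py"), "w") as f:
            f.write("")
        with open(os.path.join(root, "b", "b.py"), "w") as f:
            # Assumed script body: b.py needs package `a` from the parent dir.
            f.write("import a\nprint('imported a')\n")
        env = dict(os.environ, PYTHONPATH=".")
        cmd = [sys.executable] + list(extra_flags) + [os.path.join("b", "b.py")]
        return subprocess.run(
            cmd, cwd=root, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    print("PYTHONPATH honored:", run_b([]).returncode)
    print("PYTHONPATH ignored (-E):", run_b(["-E"]).returncode)
```

Because the script lives in `b/`, only `b/` is put on `sys.path` automatically; the import of `a` succeeds solely because of `PYTHONPATH=.`, so any interpreter wrapper that drops that variable breaks it.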

How exactly do DEPS work?

What ENVs or file locations do they use? We play some funny games that could easily break.

Comment 28 by no...@chromium.org, Nov 15 2017

DEPS approach with PYTHONPATH won't work, see c#26
The options I see:
1) use system python
2) iannucci finishes his work (almost done) to make $PATH/python be vpython, and we add a .vpython file to the chromeos repos that specifies the list of deps


To use 2, I really need to understand how DEPS work, and where they would be stored.

We play some very funny games with both ENVs and files. This includes deleting things rather freely from some locations, creating root owned files, modifying paths with a mix of loopback and bind mounted paths, and filtering non-whitelisted ENVs both when using sudo, and when entering the chroot.

My fears are that we wipe the DEPs during some cleanup step, or modify the path at which they might be found, causing confusion.

There are a number of points during which they might not be able to be recreated. A recent example was during ebuild emerges, which restrict file writes to their working directories to make sure that compiles don't modify the host system.
Let's get unblocked, do #1, use system python. (other options have long term promise, but clearly require investigation).

Bouncing back to nodir@. Sounds like in #9 you may have had some idea how to accomplish this (recipe change?).
Owner: no...@chromium.org
I started this CL to produce a new recipe for ChromeOS swarming builds, but it's blocked on a couple of steps because I don't see how to use config_lib.Paths very well.

https://chromium-review.googlesource.com/c/chromium/tools/build/+/770496

Comment 33 by no...@chromium.org, Nov 16 2017

yes, dgarrett is on the right path, https://chromium-review.googlesource.com/c/chromium/tools/build/+/770496 is what i had in mind

Comment 34 by no...@chromium.org, Nov 16 2017

Components: -Infra>Platform Infra>Client>ChromeOS
Owner: dgarr...@chromium.org
Status: Started (was: Assigned)
I think we've achieved a consensus and dgarrett@ started the actual work: https://chromium-review.googlesource.com/c/chromium/tools/build/+/770496
Project Member

Comment 36 by bugdroid1@chromium.org, Dec 5 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/2790c36b4926e9f230fc27300ff90124d54d6f58

commit 2790c36b4926e9f230fc27300ff90124d54d6f58
Author: Don Garrett <dgarrett@google.com>
Date: Tue Dec 05 02:27:20 2017

Create cros/swarming recipe.

Create a new recipe for ChromeOS swarming builds. Recipe ignores
per-waterfall configuration, and instead uses only buildbucket
properies to build the cbuildbot_launch command line.

It also creates a symlink to /usr/bin/python in a new directory which
is inserted into the beginning of PATH as a way to run cbuildbot
without vpython (for now).

BUG= chromium:783517 

Change-Id: I87e71db2f24e436872635f8da16c413e08f78f95
Reviewed-on: https://chromium-review.googlesource.com/770496
Reviewed-by: Nodir Turakulov <nodir@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>

[add] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/swarming.py
[add] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/swarming.expected/swarming_builder.json
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipe_modules/chromite/api.py
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/cbuildbot.expected/chromiumos_paladin.json
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/cbuildbot_tryjob.py
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/README.recipes.md
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipe_modules/chromite/__init__.py
[modify] https://crrev.com/2790c36b4926e9f230fc27300ff90124d54d6f58/scripts/slave/recipes/cros/cbuildbot.expected/chromiumos_paladin_manifest_failure.json

Project Member

Comment 37 by bugdroid1@chromium.org, Dec 5 2017

Labels: merge-merged-config
The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/manifest-internal/+/63f7c0d0d113e974723497ef54fe8ac5603042be

commit 63f7c0d0d113e974723497ef54fe8ac5603042be
Author: Don Garrett <dgarrett@google.com>
Date: Tue Dec 05 02:32:04 2017

Project Member

Comment 38 by bugdroid1@chromium.org, Dec 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/0b611752f13ea4c8339ed336151fd1e10cfc9808

commit 0b611752f13ea4c8339ed336151fd1e10cfc9808
Author: Don Garrett <dgarrett@google.com>
Date: Wed Dec 06 21:37:55 2017

cros/swarming recipe: Symlink for python2.

We previously added a special symlink to let ChromeOS swarming builds
use the system python binary, but most of our scripts actually use
python2. So... add a python2 to the bypass.

BUG= 783517 

Change-Id: I6a69666496cfaa9a515fe85c3ced17c8aa3cd54e
Reviewed-on: https://chromium-review.googlesource.com/811429
Reviewed-by: Nodir Turakulov <nodir@chromium.org>
Commit-Queue: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/0b611752f13ea4c8339ed336151fd1e10cfc9808/scripts/slave/README.recipes.md
[modify] https://crrev.com/0b611752f13ea4c8339ed336151fd1e10cfc9808/scripts/slave/recipes/cros/swarming.expected/swarming_builder.json
[modify] https://crrev.com/0b611752f13ea4c8339ed336151fd1e10cfc9808/scripts/slave/recipe_modules/chromite/api.py

The latest updates still aren't working, and I'm not yet sure why:

https://luci-milo.appspot.com/p/chromeos/builds/b8960845419119264192

VPYTHON_VIRTUALENV_ROOT: /b/swarming/w/ir/cache/vpython
/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch: could not import chromite module: /b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch.py: No module named contrib
Traceback (most recent call last):
  File "/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch", line 169, in <module>
    DoMain()
  File "/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch", line 165, in DoMain
    commandline.ScriptWrapperMain(FindTarget)
  File "/b/swarming/w/ir/kitchen-workdir/chromite/lib/commandline.py", line 890, in ScriptWrapperMain
    target = find_target_func(target)
  File "/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch", line 140, in FindTarget
    module = cros_import.ImportModule(target)
  File "/b/swarming/w/ir/kitchen-workdir/chromite/lib/cros_import.py", line 44, in ImportModule
    module = __import__(target)
  File "/b/swarming/w/ir/kitchen-workdir/chromite/scripts/cbuildbot_launch.py", line 23, in <module>
    from chromite.cbuildbot.stages import sync_stages
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/stages/sync_stages.py", line 23, in <module>
    from chromite.cbuildbot import lkgm_manager
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/lkgm_manager.py", line 19, in <module>
    from chromite.cbuildbot import manifest_version
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/manifest_version.py", line 19, in <module>
    from chromite.cbuildbot import build_status
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/build_status.py", line 13, in <module>
    from chromite.cbuildbot import relevant_changes
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/relevant_changes.py", line 12, in <module>
    from chromite.cbuildbot.stages import artifact_stages
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/stages/artifact_stages.py", line 17, in <module>
    from chromite.cbuildbot import commands
  File "/b/swarming/w/ir/kitchen-workdir/chromite/cbuildbot/commands.py", line 30, in <module>
    from chromite.lib import gob_util
  File "/b/swarming/w/ir/kitchen-workdir/chromite/lib/gob_util.py", line 33, in <module>
    from oauth2client.contrib import gce
ImportError: No module named contrib
step returned non-zero exit code: 1
Cc: bpastene@chromium.org tandrii@chromium.org
Issue 792733 has been merged into this issue.
Comparing
https://github.com/google/oauth2client/tree/v1.5.2/oauth2client
and
https://github.com/google/oauth2client/tree/v2.0.0/oauth2client

implies that the calling code expects at least v2 (current version is v4.1.2) but the version available is v1.5.2 or lower. 
Available from where?

"chromite/third_party/oauth2client" is part of the chromite checkout, and should have been available for import via the path manipulation done in chromite/scripts/wrapper.py.

I don't yet understand why/how that would be broken.
Most likely import "from oauth2client import gce" failed with ImportError due to some dependency of https://cs.chromium.org/chromium/src/third_party/chromite/third_party/oauth2client/gce.py being missing, and gob_util.py then fell back to 'contrib' and failed there too.

If I had to guess, I'd say 'six' is missing.
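
If the fallback theory above is right, the traceback shows only the last failure ("No module named contrib") and hides the first one. A minimal sketch of a fallback import that keeps every error is below. The helper name `import_first_available` is hypothetical and this is not chromite's code; it only illustrates how the masked error could be surfaced while debugging.

```python
import importlib


def import_first_available(module_names):
    """Try each module name in order and return the first that imports.

    Every ImportError is collected, so the final error message names all
    failed candidates instead of only the last one.
    """
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append("%s: %s" % (name, exc))
    raise ImportError("; ".join(errors))
```

With this shape, a missing transitive dependency of the first candidate would appear in the combined message rather than being replaced by the error from the second candidate.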
For debugging purposes, I can log into the builder, clone chromite, and run "cbuildbot_launch --help" without error.

That means that everything necessary exists on the builder, when not inside the recipe environment.

Comment 45 by no...@chromium.org, Dec 13 2017

Although the recipe inserts the system path into $PATH, I see that the recipe engine still prepends a dir with a non-system python:

PATH: /b/swarming/w/ir/cipd_bin_packages:/b/swarming/w/ir/cipd_bin_packages/bin:/b/swarming/w/ir/kitchen-workdir/depot_tools:/b/swarming/w/ir/kitchen-workdir/python_bin:/b/swarming/w/ir/cipd_bin_packages:/b/swarming/w/ir/cipd_bin_packages/bin:/b/swarming/cipd_cache/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
 
Note the first two entries. FWIU this intrusive behavior was removed in https://chromium-review.googlesource.com/c/infra/infra/+/809624/4/go/src/infra/tools/kitchen/cook.go but hasn't been rolled into production yet. FWIU iannucci intends to do it very soon.
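
The ordering problem can be checked mechanically. The sketch below splits the PATH value quoted above and lists the entries that are searched before a given directory. It assumes that `/b/swarming/w/ir/kitchen-workdir/python_bin` is the symlink directory added by the recipe; the comment does not say so explicitly. The helper name `entries_before` is hypothetical.

```python
SAMPLE_PATH = (
    "/b/swarming/w/ir/cipd_bin_packages:"
    "/b/swarming/w/ir/cipd_bin_packages/bin:"
    "/b/swarming/w/ir/kitchen-workdir/depot_tools:"
    "/b/swarming/w/ir/kitchen-workdir/python_bin:"
    "/b/swarming/w/ir/cipd_bin_packages:"
    "/b/swarming/w/ir/cipd_bin_packages/bin:"
    "/b/swarming/cipd_cache/bin:"
    "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
)

# Assumed to be the directory holding the python/python2 symlinks.
BYPASS_DIR = "/b/swarming/w/ir/kitchen-workdir/python_bin"


def entries_before(path, target_dir, sep=":"):
    """Return the PATH entries that are searched before target_dir."""
    entries = path.split(sep)
    return entries[:entries.index(target_dir)]


if __name__ == "__main__":
    for entry in entries_before(SAMPLE_PATH, BYPASS_DIR):
        print(entry)
```

For the sample value, three entries come ahead of the assumed bypass directory, including `/b/swarming/w/ir/cipd_bin_packages/bin`, which is where the CIPD-installed python lives according to comment 9.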
Cc: davidri...@chromium.org
Robbie's change has hit production, and swarming builds again work for ChromeOS.

https://luci-milo.appspot.com/p/chromeos/builds/b8960289718183795968
Status: Fixed (was: Started)
Thanks!
Awesome, great to hear.
