export_to_gcloud from BuildPackages failing
Issue description
The export_to_gcloud for parallel_emerge is failing, I believe from gcloud/venv issues.

https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fguado_moblab-paladin%2F5499%2F%2B%2Frecipes%2Fsteps%2FBuildPackages%2F0%2Fstdout

  File "/b/cbuild/internal_master/chromite/bin/export_to_gcloud", line 99, in <module>
    main()
  File "/b/cbuild/internal_master/chromite/bin/export_to_gcloud", line 36, in main
    wrapper.DoMain()
  File "/b/cbuild/internal_master/chromite/scripts/wrapper.py", line 164, in DoMain
    commandline.ScriptWrapperMain(FindTarget)
  File "/b/cbuild/internal_master/chromite/lib/commandline.py", line 816, in ScriptWrapperMain
    target = find_target_func(target)
  File "/b/cbuild/internal_master/chromite/scripts/wrapper.py", line 139, in FindTarget
    module = cros_import.ImportModule(target)
  File "/b/cbuild/internal_master/chromite/lib/cros_import.py", line 43, in ImportModule
    module = __import__(target)
  File "/b/cbuild/internal_master/chromite/scripts/export_to_gcloud.py", line 9, in <module>
    from gcloud import datastore
  File "/home/chrome-bot/.cache/cros_venv/venv-2.7.6-5addca6cf590166d7b70e22a95bea4a0/local/lib/python2.7/site-packages/gcloud/__init__.py", line 19, in <module>
    __version__ = get_distribution('gcloud').version
  File "/b/build/third_party/setuptools-0.6c11/pkg_resources.py", line 311, in get_distribution
    if isinstance(dist,Requirement): dist = get_provider(dist)
  File "/b/build/third_party/setuptools-0.6c11/pkg_resources.py", line 197, in get_provider
    return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
  File "/b/build/third_party/setuptools-0.6c11/pkg_resources.py", line 666, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/b/build/third_party/setuptools-0.6c11/pkg_resources.py", line 565, in resolve
    raise DistributionNotFound(req) # XXX put more info here
pkg_resources.DistributionNotFound: gcloud

15:57:10: WARNING: Unable to export to datastore: return code: 1; command: /b/cbuild/internal_master/chromite/bin/export_to_gcloud /creds/service_accounts/service-account-chromeos-datastore-writer-prod.json /b/cbuild/internal_master/buildbot_archive/guado_moblab-paladin/R59-9418.0.0-rc4/build-events.json --parent_key "('Build', 1422700, 'BuildStage', 41541274L)"
cmd=['/b/cbuild/internal_master/chromite/bin/export_to_gcloud', '/creds/service_accounts/service-account-chromeos-datastore-writer-prod.json', '/b/cbuild/internal_master/buildbot_archive/guado_moblab-paladin/R59-9418.0.0-rc4/build-events.json', '--parent_key', "('Build', 1422700, 'BuildStage', 41541274L)"]
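A quick diagnostic sketch (assuming it is run with the same virtualenv interpreter and inherited PYTHONPATH the builder uses) to confirm which pkg_resources wins the import and whether that copy can see the gcloud distribution:

import pkg_resources

# Expect the virtualenv's copy here, not /b/build/third_party/setuptools-0.6c11.
print(pkg_resources.__file__)
try:
    print(pkg_resources.get_distribution('gcloud'))
except pkg_resources.DistributionNotFound:
    print('gcloud distribution is not visible to this pkg_resources')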
,
Apr 1 2017
>(and the other three instances of build data being exported to datastore) Which?
,
Apr 1 2017
There's one from parallel emerge, one from report stages with upload metadata, and one from TKO parse.
,
Apr 1 2017
(Sorry, three in total, not three other)
,
Apr 1 2017
Weird:
ayatane@cros-beefy361-c2:~$ cd /b/cbuild/internal_master/chromite
ayatane@cros-beefy361-c2:/b/cbuild/internal_master/chromite$ bin/export_to_gcloud
usage: export_to_gcloud [-h]
[--log-level {fatal,critical,error,warning,notice,info,debug}]
[--log_format LOG_FORMAT] [--debug] [--nocolor]
[--project_id PROJECT_ID] [--namespace NAMESPACE]
[--parent_key PARENT_KEY]
service_acct_json entities
export_to_gcloud: error: too few arguments
,
Apr 1 2017
The TKO upload isn't failing; it's just the two calls from the cbuildbot workflow. It looks like something is injecting foreign dependencies into the virtualenv, shadowing the virtualenv's own setuptools.
,
Apr 1 2017
chrome-bot@cros-beefy361-c2:(Linux 14.04):/b/build$ PYTHONPATH=/b/build/third_party/setuptools-0.6c11/ /b/cbuild/internal_master/chromite/bin/export_to_gcloud
Traceback (most recent call last):
  File "/b/cbuild/internal_master/chromite/bin/export_to_gcloud", line 99, in <module>
    main()
  File "/b/cbuild/internal_master/chromite/bin/export_to_gcloud", line 36, in main
    wrapper.DoMain()
  File "/b/cbuild/internal_master/chromite/scripts/wrapper.py", line 164, in DoMain
    commandline.ScriptWrapperMain(FindTarget)
  File "/b/cbuild/internal_master/chromite/lib/commandline.py", line 816, in ScriptWrapperMain
    target = find_target_func(target)
  File "/b/cbuild/internal_master/chromite/scripts/wrapper.py", line 139, in FindTarget
    module = cros_import.ImportModule(target)
  File "/b/cbuild/internal_master/chromite/lib/cros_import.py", line 43, in ImportModule
    module = __import__(target)
  File "/b/cbuild/internal_master/chromite/scripts/export_to_gcloud.py", line 9, in <module>
    from gcloud import datastore
  File "/home/chrome-bot/.cache/cros_venv/venv-2.7.6-5addca6cf590166d7b70e22a95bea4a0/local/lib/python2.7/site-packages/gcloud/__init__.py", line 19, in <module>
    __version__ = get_distribution('gcloud').version
  File "/b/build/third_party/setuptools-0.6c11/pkg_resources.py", line 311, in get_distribution
    if isinstance(dist,Requirement): dist = get_provider(dist)
  File "/b/build/third_party/setuptools-0.6c11/pkg_resources.py", line 197, in get_provider
    return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
  File "/b/build/third_party/setuptools-0.6c11/pkg_resources.py", line 666, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/b/build/third_party/setuptools-0.6c11/pkg_resources.py", line 565, in resolve
    raise DistributionNotFound(req) # XXX put more info here
pkg_resources.DistributionNotFound: gcloud
We should probably not be inheriting PYTHONPATH in the virtualenv.
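A minimal sketch of that idea (not the actual chromite virtualenv_wrapper.py change, just the general shape, assuming a Python entry point running inside the virtualenv):

import os
import sys


def _scrub_pythonpath():
    """Drop any inherited PYTHONPATH entries from this process.

    Removing the variable keeps child processes clean, and stripping the
    matching sys.path entries keeps this process from resolving imports
    (e.g. an old pkg_resources) against them.
    """
    inherited = os.environ.pop('PYTHONPATH', None)
    if not inherited:
        return
    doomed = set(os.path.abspath(p) for p in inherited.split(os.pathsep) if p)
    sys.path[:] = [p for p in sys.path if os.path.abspath(p) not in doomed]


if __name__ == '__main__':
    _scrub_pythonpath()
    # Imports resolved after this point come from the virtualenv, e.g.
    # "from gcloud import datastore" in export_to_gcloud.py.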
,
Apr 1 2017
>We should probably not be inheriting PYTHONPATH in the virtualenv.
1. This statement is true, but we have been inheriting PYTHONPATH in the virtualenv since the beginning of virtualenv deployment.
2. I wonder why we're only seeing this issue now (or did we miss it before?). Nothing has changed on the virtualenv side to affect PYTHONPATH. Did something change in the environment elsewhere?
,
Apr 1 2017
davidriley: Can we surface export_to_gcloud failures somehow?
,
Apr 1 2017
I didn't mean to imply the TKO upload is failing, but I'd like whatever changes you make to verify that it continues to work, so we don't get into a situation where the fix breaks some other usage again. I'm not sure what you mean by surface? We could fail builds, but I do not feel comfortable failing builds when this keeps breaking and is unreliable. We want the data, but it's not critical enough to be worth making our build infrastructure flakier.
,
Apr 1 2017
c#10: No idea, I'm really just a consumer of all of this.
,
Apr 3 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/chromite/+/82b4840db284ab826b52aef9017c133a80590d9e

commit 82b4840db284ab826b52aef9017c133a80590d9e
Author: Allen Li <ayatane@chromium.org>
Date: Mon Apr 03 22:41:51 2017

    Scrub PYTHONPATH from virtualenv scripts

    BUG=chromium:707456
    TEST=Run PYTHONPATH=/usr/lib/python3/dist-packages bin/export_to_gcloud

    Change-Id: I20ab158ead3ca3f6921d84c30ced911e685ffc1f
    Reviewed-on: https://chromium-review.googlesource.com/465550
    Commit-Ready: Allen Li <ayatane@chromium.org>
    Tested-by: Allen Li <ayatane@chromium.org>
    Reviewed-by: David Riley <davidriley@chromium.org>

[modify] https://crrev.com/82b4840db284ab826b52aef9017c133a80590d9e/scripts/virtualenv_wrapper.py
,
Apr 4 2017
Assuming this isn't a flaky error, I checked the latest moblab paladin and this error does not appear any more.
,
Apr 4 2017
Can we add tests to avoid this in the future?
,
Apr 5 2017
Tests in what sense? Ultimately, the only way to ensure that export_to_gcloud doesn't break in all of the exact environments that run it is to actually run it in all of those environments and make sure it doesn't break. If you mean regression tests to keep this particular case from recurring: yes, we can add those. I thought I had already added unit tests for it, although a quick check suggests I remember incorrectly.
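Something like the following regression-test sketch could cover this specific failure (the script path and the use of --help are assumptions, not existing chromite test code): run the wrapped script with a deliberately polluted PYTHONPATH and assert it no longer dies with DistributionNotFound.

import os
import subprocess
import unittest


class PythonPathScrubTest(unittest.TestCase):

    def testExportToGcloudIgnoresInheritedPythonPath(self):
        env = dict(os.environ)
        # Simulate the buildbot environment that shadowed the venv's setuptools.
        env['PYTHONPATH'] = '/b/build/third_party/setuptools-0.6c11'
        proc = subprocess.Popen(
            ['bin/export_to_gcloud', '--help'],
            env=env, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        _, stderr = proc.communicate()
        self.assertNotIn(b'DistributionNotFound', stderr)


if __name__ == '__main__':
    unittest.main()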
,
Apr 5 2017
I do not want to make export_to_gcloud failures fail the build at this time, especially given how unreliable things have been. That being said, I do not want more changes to slip through which break things in undetected ways. I'm open to tests that achieve these two goals across the known usages of virtualenv.
,
Apr 6 2017
I think we should only whitelist specific errors from export_to_gcloud that we know are caused by expected flake. There are a lot of ways for it to fail that are not limited to virtualenv, and we should catch all of them except the small subset we know are bogus. Probably the best approach is to add a flag to export_to_gcloud that makes it exit with 0 for flake errors; then we can treat any non-zero exit as a real issue.
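A hypothetical sketch of that flag (the flag name and the set of "known flake" exceptions are assumptions, not an existing export_to_gcloud option): known-transient failures exit 0, so callers can treat any non-zero exit as a real problem.

import argparse
import sys

# Exception types we are willing to call expected flake (assumed set).
_FLAKE_ERRORS = (IOError, OSError)


def do_export(opts):
    """Placeholder for the real datastore export logic."""


def main(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('--ignore_flake', action='store_true',
                        help='Exit 0 on known-transient errors.')
    opts, _ = parser.parse_known_args(argv)
    try:
        do_export(opts)
    except _FLAKE_ERRORS as e:
        if opts.ignore_flake:
            sys.stderr.write('Ignoring expected flake: %s\n' % e)
            return 0
        raise
    return 0


if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))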
,
Apr 6 2017
No, even then I'm not sure we want to fail the build. In particular, if a change lands that causes all export_to_gcloud calls to fail on some subset of builds, is it worth failing those builds? Since this data isn't critical, I don't think it is. I'd much rather have good tests that ensure a bad change doesn't land in the first place, instead of using the canaries and other builders as guinea pigs. Once we have good tests where we think it's very unlikely for bad changes to slip through, then we can entertain making builds fail based on unsuccessful invocations.
,
Apr 6 2017
Put another way, more succinctly: if export_to_gcloud is the only part of a CQ run that fails, it should not cause developers' changes to get rejected by a failed CQ run.
,
Apr 6 2017
If a developer breaks export_to_gcloud, they need to fix their change. That's the point of the CQ, isn't it? I'm not talking about flake; I'm talking about any random thing that could break export_to_gcloud, like someone adding an innocuous import in a particular file that causes the house of cards to collapse.