Update Tast to fully support SSP
Issue description

The "tast" Autotest server test (a.k.a. tast.py) currently enables server-side packaging (SSP). We use a copy of tast.py matching the version of the system image on the DUT, but we still use a different version of various executable files that run on the shard (or wherever the server test is being run):

- tast, the main Tast executable
- remote_test_runner, used to execute remote test bundles
- /usr/libexec/tast, containing remote test bundles and their data files

I'm looking into making cbuildbot package and upload these files after performing a build so they can be extracted into the SSP container, similar to what we do with Autotest server tests.

As a pointer for myself, I think that the existing Autotest archive is named autotest_server_package.tar.bz2. It looks like it's extracted by install_ssp() in site_utils/lxc/container.py.
,
May 25 2018
Aviv gave me some helpful advice on this today. I'd been working on changes to create and upload a new tast_ssp.tar.bz2 archive containing the Tast host binaries (easy: https://crrev.com/c/1067705) and to install it into SSP containers (hard: https://crrev.com/c/1069116, https://crrev.com/c/1069117). It sounds like a much simpler approach, and one that will automatically work with upcoming lab changes, would be to include the Tast files in the existing autotest_server_package.tar.bz2 file.
,
May 29 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/dc480a443d63d199cdde7cc7fe0a61889b0ea231

commit dc480a443d63d199cdde7cc7fe0a61889b0ea231
Author: Daniel Erat <derat@chromium.org>
Date: Tue May 29 02:13:08 2018

autotest: Make tast.py use host files installed from SSP.

Make the tast.py server test fall back to using Tast host files installed to /usr/local/tast from autotest_server_package.tar.bz when using Server-Side Packaging. This replaces the old logic to use CIPD-installed files from /opt/infra-tools.

BUG= chromium:845289
TEST=passes locally and in tryjobs
CQ-DEPEND=Ied474ce906442b2a065ef7d1959f02179d38c897

Change-Id: I10c5efd176f4b6f5cdc0912a576586e8b1c2e4bb
Reviewed-on: https://chromium-review.googlesource.com/1072719
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/dc480a443d63d199cdde7cc7fe0a61889b0ea231/server/site_tests/tast/control.example
[modify] https://crrev.com/dc480a443d63d199cdde7cc7fe0a61889b0ea231/server/site_tests/tast/control.bvt
[modify] https://crrev.com/dc480a443d63d199cdde7cc7fe0a61889b0ea231/server/site_tests/tast/tast.py
,
May 29 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/chromite/+/626c28c2383c27d7e39ea77c9f756ff629dcb7f6

commit 626c28c2383c27d7e39ea77c9f756ff629dcb7f6
Author: Daniel Erat <derat@chromium.org>
Date: Tue May 29 02:13:08 2018

cbuildbot: Add Tast to autotest_server_package.tar.bz2.

Include the tast and remote_test_runner executables and test bundles and data files in a "tast" directory within the autotest_server_package.tar.bz2 file that's generated by the UploadTestArtifacts and MoblabVMTest stages. These files are needed for Server-Side Packaging in order to run versions of remote tests that match the DUT's system image.

BUG= chromium:845289
TEST=updated unit test; also performed builds and verified that autotest_server_package.tar.bz2 contains a 'tast' directory with the expected files

Change-Id: Ied474ce906442b2a065ef7d1959f02179d38c897
Reviewed-on: https://chromium-review.googlesource.com/1072964
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/626c28c2383c27d7e39ea77c9f756ff629dcb7f6/cbuildbot/commands.py
[modify] https://crrev.com/626c28c2383c27d7e39ea77c9f756ff629dcb7f6/cbuildbot/commands_unittest.py
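For anyone wanting to repeat the manual check mentioned in the TEST= line, here is a minimal sketch of listing a downloaded server package and confirming it has entries under a "tast" directory. The helper name archive_has_member and the local file path are hypothetical; this is not code from chromite or Autotest, just the standard-library tarfile module.

```python
import tarfile


def archive_has_member(archive_path, prefix="tast/"):
    """Return True if any entry in the tarball lives under `prefix`.

    Hypothetical helper for illustration; not part of chromite or Autotest.
    """
    # "r:*" lets tarfile detect the compression (bz2 here) on its own.
    with tarfile.open(archive_path, "r:*") as tar:
        for name in tar.getnames():
            # Some tarballs store members with a leading "./".
            if name.startswith("./"):
                name = name[2:]
            if name.startswith(prefix):
                return True
    return False
```

Calling archive_has_member("autotest_server_package.tar.bz2") on a locally downloaded copy should return True for builds that include the change above, assuming the layout described in the commit message.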
,
May 29 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/2e9bf93b0d6c545e0d554d7efa62e8b1ecf8749a

commit 2e9bf93b0d6c545e0d554d7efa62e8b1ecf8749a
Author: chromite-chromium-autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Tue May 29 04:03:17 2018

Roll src/third_party/chromite/ a29e19d9f..626c28c23 (1 commit)

https://chromium.googlesource.com/chromiumos/chromite.git/+log/a29e19d9f75d..626c28c2383c

$ git log a29e19d9f..626c28c23 --date=short --no-merges --format='%ad %ae %s'
2018-05-24 derat cbuildbot: Add Tast to autotest_server_package.tar.bz2.

Created with:
roll-dep src/third_party/chromite

BUG= chromium:845289

The AutoRoll server is located here: https://chromite-chromium-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

TBR=chrome-os-gardeners@chromium.org

Change-Id: I03b4a50b2e05d646df7c5ffb97ca020931725268
Reviewed-on: https://chromium-review.googlesource.com/1075851
Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#562333}

[modify] https://crrev.com/2e9bf93b0d6c545e0d554d7efa62e8b1ecf8749a/DEPS
,
May 31 2018
The new setup is failing occasionally, and I don't understand why. In the tast.bvt server test job at http://cautotest-prod/afe/#tab_id=view_job&object_id=203998497 (started by https://luci-milo.appspot.com/buildbot/chromeos/cave-release/2245, full logs at http://stainless/browse/chromeos-autotest-results/203998497-chromeos-test/), the Python code died when it couldn't find /usr/local/tast/tast:

05/30 11:04:09.490 WARNI| test:0637| The test failed with the following exception
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 598, in _exec
    _cherry_pick_call(self.initialize, *args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 746, in _cherry_pick_call
    return func(*p_args, **p_dargs)
  File "/usr/local/autotest/server/site_tests/tast/tast.py", line 100, in initialize
    self._tast_path = self._get_path((self._TAST_PATH, self._SSP_TAST_PATH))
  File "/usr/local/autotest/server/site_tests/tast/tast.py", line 143, in _get_path
    raise error.TestFail('None of %s exist' % list(paths))
TestFail: None of ['/usr/bin/tast', '/usr/local/tast/tast'] exist

When I look at the SSP archive at https://storage.cloud.google.com/chromeos-image-archive/cave-release/R69-10736.0.0/autotest_server_package.tar.bz2, I can see that it contains the expected tast/tast file, though. I don't see any obvious problems in ssp_logs/debug/autoserv.DEBUG, and it looks like the SSP archive is being extracted, at least according to that log.

Aviv/Allen/Prathmesh, have any of you seen something like this before? I'm seeing a lot of failures like this, but also a lot of successful runs.
It looks like this is happening consistently on cave-release: http://stainless/search?view=list&first_date=2018-05-29&last_date=2018-05-31&builder_name=cave-release&test=%5Etast%5C.bvt%24&exclude_cts=false&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false but not on eve-release: http://stainless/search?view=list&first_date=2018-05-29&last_date=2018-05-31&builder_name=eve-release&test=%5Etast%5C.bvt%24&exclude_cts=false&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false I have no idea what could be different there.
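For readers who haven't looked at tast.py, the failing lookup appears to be a first-match-wins search over candidate paths. The following is a rough sketch reconstructed only from the traceback above, not the actual tast.py code: get_path, the injectable exists parameter, and the local TestFail class are stand-ins so the snippet runs without Autotest.

```python
import os


class TestFail(Exception):
    """Stand-in for Autotest's error.TestFail, so the sketch is self-contained."""


def get_path(paths, exists=os.path.exists):
    """Return the first path in `paths` that exists, else raise TestFail.

    `exists` is injectable purely so the behavior can be exercised
    without touching the real filesystem (an assumption of this sketch).
    """
    for path in paths:
        if exists(path):
            return path
    raise TestFail('None of %s exist' % list(paths))
```

With candidates like ('/usr/bin/tast', '/usr/local/tast/tast'), a container where the SSP tarball's tast directory never landed under /usr/local would fall through both entries and raise, which matches the error message seen in the log.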
,
May 31 2018
I don't understand what could account for the discrepancy. I see in the ssp setup logs at https://storage.cloud.google.com/chromeos-autotest-results/203998497-chromeos-test/chromeos6-row2-rack18-host10/ssp_logs/debug/autoserv.DEBUG that a tarball was extracted:

05/30 11:02:12.669 DEBUG| utils:0215| Running 'sudo mv /tmp/autotest_server_package.tar.bz2_bDNQ7Q /usr/local/autotest/containers/host/container._591bZ/usr/local/tmpnp9OaB/autotest_server_package.tar.bz2'
05/30 11:02:13.291 DEBUG| utils:0215| Running 'sudo tar -xvf /usr/local/autotest/containers/host/container._591bZ/usr/local/tmpnp9OaB/autotest_server_package.tar.bz2 -C /usr/local/autotest/containers/host/container._591bZ/usr/local'

I don't know where /tmp/autotest_server_package.tar.bz2_bDNQ7Q came from, and it no longer exists on the shard machine in question (cros-full-0010).

Proposal: we can log the md5 sum of this file as we are untarring it, just to give ourselves a breadcrumb for the theory that we are somehow using the wrong version of the tarball...
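A minimal sketch of what that md5 breadcrumb could look like, assuming it is called just before the tar extraction step. The function name log_md5 and its placement are hypothetical; nothing here is taken from the existing container code.

```python
import hashlib
import logging


def log_md5(path, chunk_size=1 << 20):
    """Log and return the md5 hex digest of the file at `path`.

    Hypothetical helper; reads in chunks so a large tarball is not
    loaded into memory all at once.
    """
    md5 = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            md5.update(chunk)
    digest = md5.hexdigest()
    logging.debug('md5 of %s: %s', path, digest)
    return digest
```

The logged digest could then be compared against the md5 of the archive in the image bucket for the same build to confirm or rule out the wrong-tarball theory.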
,
May 31 2018
Oh, I see cros-full-0010 used the container pool:

05/30 11:02:08.399 DEBUG| container_bucket:0297| Retrieved container from pool: container._591bZ

whereas for the eve-release test you mentioned, looking at its ssp setup log at https://storage.cloud.google.com/chromeos-autotest-results/204378578-chromeos-test/chromeos6-row4-rack11-host2/ssp_logs/debug/autoserv.DEBUG I see no such log line.

My theory is that using container_pool is somehow causing the wrong tarball to be installed in the container, or causing the tarball installation to fail.
,
May 31 2018
+jkop I believe we decided to cut our losses with container_pool as the gains were not as much as we hoped, and it will be obsoleted by skylab bots. Can you turn off container pool everywhere?
,
May 31 2018
,
Jun 2 2018
Thanks for fixing issue 848458, Aviv -- I'm not seeing the errors described in #6 anymore. I'm going to keep this open to track removing the now-unneeded code used to generate and install the tast-cmd and tast-remote-tests-cros CIPD packages.
,
Jun 4 2018
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/manifest-internal/+/88df8cfaf5af34e4620ff4adc8e9d3aa60a2f725 commit 88df8cfaf5af34e4620ff4adc8e9d3aa60a2f725 Author: Daniel Erat <derat@chromium.org> Date: Mon Jun 04 22:19:46 2018
,
Jun 6 2018
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/910c207676f396d9a8ba2600cac49b0cf0a45e6c commit 910c207676f396d9a8ba2600cac49b0cf0a45e6c Author: Daniel Erat <derat@chromium.org> Date: Wed Jun 06 02:37:13 2018
,
Jun 8 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/chromite/+/81db3d6b58449255fabc1697c12cd168a589210b

commit 81db3d6b58449255fabc1697c12cd168a589210b
Author: Daniel Erat <derat@chromium.org>
Date: Fri Jun 08 02:56:14 2018

cbuildbot: Stop building Tast infra packages.

Make the EmergeInfraGoBinariesStage, PackageInfraGoBinariesStage, and RegisterInfraGoPackagesStage cbuildbot stages stop building the tast-cmd and tast-remote-tests-cros CIPD packages. These are no longer needed now that Tast-related binaries are distributed using SSP.

BUG= chromium:845289 , chromium:782515
TEST=none
CQ-DEPEND=CL:1086030
CQ-DEPEND=CL:*635275

Change-Id: I773f7e250e2c1e09afc363b94e02fdf987f82195
Reviewed-on: https://chromium-review.googlesource.com/1086160
Commit-Ready: Dan Erat <derat@chromium.org>
Tested-by: Dan Erat <derat@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/81db3d6b58449255fabc1697c12cd168a589210b/cbuildbot/stages/infra_stages.py
,
Jun 8 2018
,
Jun 8 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/d1de178223cdc465037bfe5e480d1499ee3be610

commit d1de178223cdc465037bfe5e480d1499ee3be610
Author: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Fri Jun 08 04:50:46 2018

Roll src/third_party/chromite 531d6f8..81db3d6 (1 commits)

https://chromium.googlesource.com/chromiumos/chromite.git/+log/531d6f8..81db3d6

git log 531d6f8..81db3d6 --date=short --no-merges --format='%ad %ae %s'
2018-06-08 derat@chromium.org cbuildbot: Stop building Tast infra packages.

Created with:
gclient setdep -r src/third_party/chromite@81db3d6

The AutoRoll server is located here: https://chromite-chromium-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

BUG= chromium:845289 , chromium:782515
TBR=chrome-os-gardeners@chromium.org

Change-Id: Ie76c8995b11c83dfb3a61c796437a7a8207d4c13
Reviewed-on: https://chromium-review.googlesource.com/1092046
Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#565533}

[modify] https://crrev.com/d1de178223cdc465037bfe5e480d1499ee3be610/DEPS
Comment 1 by derat@chromium.org
, May 21 2018