Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: ----



Fix moblab to work with the version of lxc provided by portage-stable
Project Member Reported by xixuan@chromium.org, Nov 28
https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin/builds/7989

11/27 21:27:52.766 DEBUG|          base_job:0357| Persistent state global_properties.fast now set to False
11/27 21:27:52.766 DEBUG|          base_job:0357| Persistent state global_properties.max_result_size_KB now set to 20000
11/27 21:27:52.785 DEBUG|          autotemp:0116| Clean was not called for /tmp/_autotmp_aeCTpYssh-master
11/27 21:27:52.809 INFO |    connectionpool:0207| Starting new HTTP connection (1): metadata.google.internal
11/27 21:27:53.064 INFO |            config:0024| Configuration file does not exist, ignoring: /etc/chrome-infra/ts-mon.json
11/27 21:27:53.065 ERROR|            config:0244| ts_mon monitoring is disabled because the endpoint provided is invalid or not supported:
11/27 21:27:53.066 NOTIC|      cros_logging:0038| ts_mon was set up.
11/27 21:27:53.066 DEBUG|          autoserv:0264| Trying to start servod.
11/27 21:27:53.166 WARNI|          autoserv:0272| Starting servod is aborted. The dut's servo_host attribute is not set to localhost.
11/27 21:27:53.166 DEBUG|             utils:0212| Running 'sudo test -e "/mnt/moblab/containers/base_05/container_id.p"'
11/27 21:27:53.180 DEBUG|             utils:0212| Running 'sudo lxc-ls --active'
11/27 21:27:53.197 DEBUG|             utils:0212| Running 'sudo test -e "/mnt/moblab/containers/base_05/rootfs"'
11/27 21:27:53.212 DEBUG|             utils:0212| Running 'cp /usr/local/autotest/results/drone_tmp/attach.7 /usr/local/autotest/results/2-moblab/192.168.231.101/attach.7'
11/27 21:27:53.220 DEBUG|             utils:0212| Running 'sudo test -e "/mnt/moblab/containers/test_2_1511846872_15065"'
11/27 21:27:53.242 DEBUG|             utils:0212| Running 'sudo -n virt-what'
11/27 21:27:53.259 WARNI|             utils:2300| Package virt-what is not installed, default to assume it is not a virtual machine.
11/27 21:27:53.260 DEBUG|             utils:0212| Running 'sudo lxc-clone --lxcpath /mnt/moblab/containers --newpath /mnt/moblab/containers --orig base_05 --new test_2_1511846872_15065  '
11/27 21:27:53.276 DEBUG| container_factory:0102| Creating snapshot clone failed. Attempting without snapshot...
11/27 21:27:53.278 DEBUG|             utils:0212| Running 'sudo lxc-ls --active'
11/27 21:27:53.306 DEBUG|             utils:0212| Running 'sudo test -e "/mnt/moblab/containers/base_05/rootfs"'
11/27 21:27:53.326 DEBUG|             utils:0212| Running 'sudo test -e "/mnt/moblab/containers/base_05/container_id.p"'
11/27 21:27:53.342 INFO |        server_job:0218| FAIL  ----    ----    timestamp=1511846873    localtime=Nov 27 21:27:53       Failed to setup container for test: Command <sudo lxc-clone --lxcpath /mnt/moblab/containers --newpath /mnt/moblab/containers --orig base_05 --new test_2_1511846872_15065  > failed, rc=1, Command returned non-zero exit status
  * Command:
      sudo lxc-clone --lxcpath /mnt/moblab/containers --newpath
      /mnt/moblab/containers --orig base_05 --new test_2_1511846872_15065
  Exit status: 1
  Duration: 0.00917220115662

  stderr:
  sudo: lxc-clone: command not found. Check logs in ssp_logs folder for more details.
11/27 21:27:53.343 DEBUG|             utils:0212| Running 'sudo -n chown -R 246 "/usr/local/autotest/results/2-moblab/192.168.231.101"'
11/27 21:27:53.353 DEBUG|             utils:0212| Running 'sudo -n chgrp -R 246 "/usr/local/autotest/results/2-moblab/192.168.231.101"'
11/27 21:27:53.362 ERROR|         traceback:0013| Traceback (most recent call last):
11/27 21:27:53.362 ERROR|         traceback:0013|   File "/usr/local/autotest/server/autoserv", line 507, in run_autoserv
11/27 21:27:53.363 ERROR|         traceback:0013|     machines)
11/27 21:27:53.363 ERROR|         traceback:0013|   File "/usr/local/autotest/server/autoserv", line 168, in _run_with_ssp
11/27 21:27:53.363 ERROR|         traceback:0013|     dut_name=dut_name)
11/27 21:27:53.363 ERROR|         traceback:0013|   File "/usr/lib64/python2.7/site-packages/chromite/lib/metrics.py", line 483, in wrapper
11/27 21:27:53.364 ERROR|         traceback:0013|     return fn(*args, **kwargs)
11/27 21:27:53.364 ERROR|         traceback:0013|   File "/usr/local/autotest/site_utils/lxc/cleanup_if_fail.py", line 40, in func_cleanup_if_fail
11/27 21:27:53.364 ERROR|         traceback:0013|     return func(*args, **kwargs)
11/27 21:27:53.364 ERROR|         traceback:0013|   File "/usr/local/autotest/site_utils/lxc/container_bucket.py", line 153, in setup_test
11/27 21:27:53.364 ERROR|         traceback:0013|     self.container_path)
11/27 21:27:53.365 ERROR|         traceback:0013|   File "/usr/local/autotest/site_utils/lxc/container_factory.py", line 67, in create_container
11/27 21:27:53.365 ERROR|         traceback:0013|     lxc_path=lxc_path)
11/27 21:27:53.365 ERROR|         traceback:0013|   File "/usr/lib64/python2.7/site-packages/chromite/lib/metrics.py", line 483, in wrapper
11/27 21:27:53.366 ERROR|         traceback:0013|     return fn(*args, **kwargs)
11/27 21:27:53.366 ERROR|         traceback:0013|   File "/usr/local/autotest/site_utils/lxc/container_factory.py", line 100, in _create_from_base
11/27 21:27:53.366 ERROR|         traceback:0013|     cleanup=self._force_cleanup)
11/27 21:27:53.366 ERROR|         traceback:0013|   File "/usr/local/autotest/site_utils/lxc/container.py", line 223, in clone
11/27 21:27:53.367 ERROR|         traceback:0013|     new_container = cls(new_path, new_name, {}, src, snapshot)
11/27 21:27:53.367 ERROR|         traceback:0013|   File "/usr/local/autotest/site_utils/lxc/container.py", line 135, in __init__
11/27 21:27:53.367 ERROR|         traceback:0013|     self.name, snapshot)
11/27 21:27:53.367 ERROR|         traceback:0013|   File "/usr/local/autotest/site_utils/lxc/utils.py", line 88, in 
11/27 21:27:53.368 ERROR|         traceback:0013|     utils.run(cmd)
11/27 21:27:53.368 ERROR|         traceback:0013|   File "/usr/local/autotest/client/common_lib/utils.py", line 738, in run
11/27 21:27:53.369 ERROR|         traceback:0013|     "Command returned non-zero exit status")
11/27 21:27:53.369 ERROR|         traceback:0013| CmdError: Command <sudo lxc-clone --lxcpath /mnt/moblab/containers --newpath /mnt/moblab/containers --orig base_05 --new test_2_1511846872_15065  > failed, rc=1, Command returned non-zero exit status
11/27 21:27:53.369 ERROR|         traceback:0013| * Command:
11/27 21:27:53.370 ERROR|         traceback:0013|     sudo lxc-clone --lxcpath /mnt/moblab/containers --newpath
11/27 21:27:53.370 ERROR|         traceback:0013|     /mnt/moblab/containers --orig base_05 --new test_2_1511846872_15065
11/27 21:27:53.370 ERROR|         traceback:0013| Exit status: 1
11/27 21:27:53.370 ERROR|         traceback:0013| Duration: 0.00917220115662
11/27 21:27:53.371 ERROR|         traceback:0013|
11/27 21:27:53.371 ERROR|         traceback:0013| stderr:
11/27 21:27:53.371 ERROR|         traceback:0013| sudo: lxc-clone: command not found
11/27 21:27:53.378 ERROR|          autoserv:0759| Uncaught SystemExit with code 1
Traceback (most recent call last):
  File "/usr/local/autotest/server/autoserv", line 755, in main
    use_ssp)
  File "/usr/local/autotest/server/autoserv", line 562, in run_autoserv
    sys.exit(exit_code)
SystemExit: 1
11/27 21:27:53.434 DEBUG|   logging_manager:0627| Logging subprocess finished


Suspecting there's a bad CL.

 
Cc: dshi@chromium.org
I can't find a related CL except for this one: https://chromium-review.googlesource.com/c/chromiumos/overlays/portage-stable/+/784271

@dshi, could you verify whether this is caused by a bad CL or is a guado_moblab flake?
Cc: haddowk@chromium.org
Could be. lxc-clone is an old script that was replaced by lxc-copy in lxd. The lxc upgrade might have removed that command completely. The lab is still on lxc 2, so we need to do some testing to see whether lxc-copy works on the lab servers as well.

For moblab, it's possible we can replace lxc-clone with lxc-copy if autotest finds it's running in moblab.

+haddowk
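
A rough sketch of that idea, for illustration only. Instead of checking for moblab, it picks whichever tool is installed. The function name `build_clone_command` and the injectable `which` parameter are hypothetical and not part of autotest. The flags are my understanding of the lxc-copy and lxc-clone man pages and should be checked against the installed version. It also assumes Python 3 for `shutil.which`, while autotest currently runs on Python 2.7.

```python
import shutil


def build_clone_command(orig, new, lxc_path, snapshot=False, which=shutil.which):
    """Return an argv list that clones container `orig` into `new`.

    Prefers lxc-copy and falls back to lxc-clone when only the older
    tool is present. `which` is injectable so the choice can be tested
    without either tool installed.
    """
    # Note: this checks the caller's PATH, which may differ from the
    # PATH that sudo uses when the command actually runs.
    if which("lxc-copy"):
        cmd = ["sudo", "lxc-copy",
               "--lxcpath", lxc_path, "--newpath", lxc_path,
               "--name", orig, "--newname", new]
    elif which("lxc-clone"):
        cmd = ["sudo", "lxc-clone",
               "--lxcpath", lxc_path, "--newpath", lxc_path,
               "--orig", orig, "--new", new]
    else:
        raise RuntimeError("neither lxc-copy nor lxc-clone found in PATH")
    if snapshot:
        # Both tools accept -s to request a snapshot clone.
        cmd.append("-s")
    return cmd
```

The returned list can be passed to `subprocess.check_call` or joined into a string for an existing command runner.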
Owner: chirantan@chromium.org
Assigning to the CL's owner.
Where do I find the logs from comment #1?  I uploaded a CL to replace lxc-clone with lxc-copy: https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/794876

The guado_moblab-paladin-tryjob with that CL failed: https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/paladin/builds/4559

But I can't find any logs that mention anything about lxc-clone or lxc-copy like in comment #1.  The best I've been able to find is:

Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 631, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 837, in _call_test_function
    raise error.UnhandledTestFail(e)
UnhandledTestFail: Unhandled AutoservRunError: command execution error
* Command: 
    /usr/bin/ssh -a -x   -o Protocol=2 -o StrictHostKeyChecking=no -o
    UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o
    ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4
    -l root -p 22 chromeos2-row2-rack8-host11 "export LIBC_FATAL_STDERR_=1; if
    type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\"
    \"server[stack::run_once|run_as_moblab|run] -> ssh_run(su - moblab -c
    '/usr/local/autotest/site_utils/run_suite.py --pool='' --board=cyan
    --build=cyan-release/R62-9901.66.0 --suite_name=dummy_server --retry=True
    --max_retries=1')\";fi; su - moblab -c
    '/usr/local/autotest/site_utils/run_suite.py --pool='' --board=cyan
    --build=cyan-release/R62-9901.66.0 --suite_name=dummy_server --retry=True
    --max_retries=1'"
Exit status: 1
Duration: 489.806571007


That looks to me like the ssh command failed, but it says nothing about why the underlying call to run_suite.py failed.  Where is the log from comment #1 located?
https://uberchromegw.corp.google.com/i/chromeos/builders/guado_moblab-paladin/builds/7989
=> [Test-Logs]: moblab_RunSuite: FAIL: Unhandled AutoservRunError: command execution error
=> https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/159006017-chromeos-test/chromeos2-row1-rack8-host1/
=> download moblab_RunSuite.tgz, extract it
=> moblab_RunSuite/sysinfo/reboot_current/mnt/moblab/results/4-moblab/192.168.231.101/ssp_logs/debug/autoserv.DEBUG
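
Once the tarball is downloaded, the last two steps can be scripted. A minimal sketch, assuming Python 3 and a trusted archive. The function name `find_ssp_errors` and the default search string are hypothetical. It extracts the tarball, looks for every `autoserv.DEBUG` under an `ssp_logs/debug` directory, and returns the lines containing the search string.

```python
import tarfile
from pathlib import Path


def find_ssp_errors(tgz_path, dest_dir, needle="command not found"):
    """Extract a results tarball and scan its SSP autoserv logs.

    Returns a list of (log_path, line_number, line) tuples for every
    line containing `needle` in any ssp_logs/debug/autoserv.DEBUG file.
    Only use this on archives you trust, since extractall() writes
    whatever paths the archive contains.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tgz_path) as tar:  # compression is auto-detected
        tar.extractall(dest)

    hits = []
    for log in sorted(dest.rglob("autoserv.DEBUG")):
        # Keep only the logs that live under .../ssp_logs/debug/.
        if log.parent.name != "debug" or log.parent.parent.name != "ssp_logs":
            continue
        with open(log, errors="replace") as f:
            for lineno, line in enumerate(f, start=1):
                if needle in line:
                    hits.append((str(log), lineno, line.rstrip("\n")))
    return hits
```

For example, `find_ssp_errors("moblab_RunSuite.tgz", "extracted")` would list any "command not found" lines like the lxc-clone failure in the original report.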
Project Member Comment 7 by bugdroid1@chromium.org, Dec 2
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d

commit 7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d
Author: Chirantan Ekbote <chirantan@chromium.org>
Date: Sat Dec 02 06:45:28 2017

project-moblab: Copy app-emulation/lxc and mask newer versions

Copy app-emulation/lxc from portage-stable into the project-moblab
directory and mask newer versions in the moblab overlay because they
break moblab.  This allows us to update the version of lxc in
portage-stable.

BUG=chromium:789062
TEST='cros tryjob --hwtest guado_moblab-paladin-tryjob'

Change-Id: I7cbf4dc445db9e7e3b38b11615b1d2bd8292094f
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/804814
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/profiles/base/package.mask
[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/app-emulation/lxc/files/lxc.initd.2
[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/app-emulation/lxc/files/lxc.initd.3
[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/app-emulation/lxc/metadata.xml
[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/app-emulation/lxc/lxc-1.0.7.ebuild
[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/app-emulation/lxc/files/lxc_at.service
[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/profiles/base/eapi
[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/app-emulation/lxc/files/lxc-1.0.6-bash-completion.patch
[add] https://crrev.com/7aa8d5e7c02525f0544e9ed35e444f34ba0f2c9d/project-moblab/app-emulation/lxc/Manifest

Cc: chirantan@chromium.org
Owner: ----
Status: Available
Summary: Fix moblab to work with the version of lxc provided by portage-stable (was: guado_moblab-paladin failed due to "lxc-clone: command not found")
I've landed a temporary workaround to pin the version used by moblab to 1.0.7.

Changing this bug to be about fixing moblab to work with the new version.