Issue 701553

Issue metadata

Status: Verified
Owner:
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



test_push: repair failing because DUT doesn't have rsync after powerwash

Project Member Reported by pprabhu@chromium.org, Mar 14 2017

Issue description

The failure is here: http://chromeos-shard2-staging.hot.corp.google.com/results/hosts/chromeos4-row10-rack9-host15/577-repair/

Looking at the autoupdate logs:
2017/03/14 14:34:54.187 INFO |      auto_updater:0489| Copying devserver package to device...
2017/03/14 14:34:54.225 DEBUG|    cros_build_lib:0564| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmp_iwJhE/testing_rsa root@100.115.201.101 -- mkdir -p /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.f4V7F9rqc1
Warning: Permanently added '100.115.201.101' (RSA) to the list of known hosts.
2017/03/14 14:34:54.411 DEBUG|    cros_build_lib:0564| RunCommand: rsync --perms --verbose --times --compress --omit-dir-times --exclude .svn --links --rsync-path 'PATH=/usr/local/bin:/usr/local/sbin:$PATH  rsync' --recursive --rsh 'ssh -p 22 -oConnectionAttempts=4 -oUserKnownHostsFile=/dev/null -oProtocol=2 -oConnectTimeout=30 -oServerAliveCountMax=3 -oStrictHostKeyChecking=no -oServerAliveInterval=10 -oNumberOfPasswordPrompts=0 -oIdentitiesOnly=yes -i /tmp/ssh-tmp_iwJhE/testing_rsa' /tmp/cros-update_100.115.201.101_23625/src '[root@100.115.201.101]:/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.f4V7F9rqc1/'
Warning: Permanently added '100.115.201.101' (RSA) to the list of known hosts.
bash: rsync: command not found
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: remote command not found (code 127) at io.c(226) [sender=3.1.0]


It looks like rsync wasn't available on the remote host.


My first guess is that we changed this behaviour here:
https://chromium-review.googlesource.com/c/450886/9/lib/remote_access.py#704

Previously we were probably falling back to scp in this case: with mode=None, self.HasRsync() (which checks whether the remote has rsync) would return False.

I think the right thing to do here is to check self.HasRsync() even when mode=rsync, and fall back to scp as before.

Assigning to ihf@ (his CL), adding deputy to CC because this may be happening in prod as well.
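
For illustration, the proposed fallback could look roughly like the sketch below. This is not the chromite code: `resolve_copy_mode` and its parameters are hypothetical names, and the only behaviour assumed is what this report describes (an explicit rsync request still checks whether the remote has rsync).

```python
def resolve_copy_mode(requested_mode, remote_has_rsync):
    """Pick the transport for a copy. Hypothetical helper, not chromite API.

    requested_mode: None (no preference), 'rsync' or 'scp'.
    remote_has_rsync: result of a check such as self.HasRsync().
    """
    if requested_mode not in (None, 'rsync', 'scp'):
        raise ValueError('unknown copy mode: %r' % (requested_mode,))
    if requested_mode == 'scp':
        return 'scp'
    # mode is None or 'rsync': use rsync only if the remote actually has it,
    # otherwise fall back to scp.
    return 'rsync' if remote_has_rsync else 'scp'
```

With this shape, requesting rsync against a device without rsync resolves to scp instead of failing with "command not found".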
 

Comment 1 by ihf@chromium.org, Mar 14 2017

Can I have a backtrace? I can't access your link.
Updated (permanent) link to logs:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos4-row10-rack9-host15/581-repair/20171403143726/

03/14 14:41:12.336 ERROR|            repair:0449| Repair failed: Powerwash and then re-install the stable build via AU
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 447, in _repair_host
    self.repair(host)
  File "/usr/local/autotest/server/hosts/cros_repair.py", line 349, in repair
    super(PowerWashRepair, self).repair(host)
  File "/usr/local/autotest/server/hosts/cros_repair.py", line 330, in repair
    afe_utils.machine_install_and_update_labels(host, repair=True)
  File "/usr/local/autotest/server/afe_utils.py", line 206, in machine_install_and_update_labels
    *args, **dargs)
  File "/usr/local/autotest/server/hosts/cros_host.py", line 742, in machine_install_by_devserver
    force_update=force_update, full_update=force_full_update)
  File "/usr/local/autotest/client/common_lib/cros/dev_server.py", line 2087, in auto_update
    raise DevServerException(error_msg % (host_name, error_list[0]))
DevServerException: CrOS auto-update failed for host chromeos4-row10-rack9-host15: Could not copy /tmp/cros-update_chromeos4-row10-rack9-host15_18021/src to device.

Comment 4 by ihf@chromium.org, Mar 14 2017

Cc: marc...@chromium.org
Status: Started (was: Assigned)
chromeos-shard2-staging does have rsync. You are saying some lxc base images don't have rsync?

Yes, I did explicitly request rsync for text file transfers, which were previously free to use scp. I can change it so that we fall back to scp again even when rsync is requested.

Comment 5 by ihf@chromium.org, Mar 14 2017

Logs say this happens in TransferDevServerPackage(). I will change CopyToDevice() to fall back.

Comment 6 by xixuan@chromium.org, Mar 14 2017

It's the devserver that kicks off the rsync, so the server that has the problem with rsync is 100.115.219.132.

The output of rsync is:
rsync  version 3.1.0  protocol version 31
Copyright (C) 1996-2013 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
...
...
     --read-batch=FILE       read a batched update from FILE
     --protocol=NUM          force an older protocol version to be used
     --iconv=CONVERT_SPEC    request charset conversion of filenames
     --checksum-seed=NUM     set block/file checksum seed (advanced)
 -4, --ipv4                  prefer IPv4
 -6, --ipv6                  prefer IPv6
     --version               print version number
(-h) --help                  show this help (-h is --help only if used alone)

Use "rsync --daemon --help" to see the daemon-mode command-line options.
Please see the rsync(1) and rsyncd.conf(5) man pages for full documentation.
See http://rsync.samba.org/ for updates, bug reports, and answers
rsync error: syntax or usage error (code 1) at main.c(1556) [client=3.1.0]
Re #4: I'm saying that the DUT doesn't have rsync.

And indeed:
pprabhu@pprabhu:~$ ssh root@chromeos4-row10-rack9-host15.cros
The authenticity of host 'chromeos4-row10-rack9-host15.cros.corp.google.com (100.115.201.101)' can't be established.
RSA key fingerprint is SHA256:EBc8rsAhukHwitqKZ2EO/ucnmRrrCJQUPtQTAsmB9eo.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'chromeos4-row10-rack9-host15.cros.corp.google.com,100.115.201.101' (RSA) to the list of known hosts.
localhost ~ # rsync
-bash: rsync: command not found
localhost ~ # sudo rsync
sudo: rsync: command not found

This happened just after a powerwash test. This is expected since rsync is part of the test image and powerwash blows away stateful (where rsync is installed).

scp is always present, and we should always fall back to it.
Summary: test_push: repair failing because DUT doesn't have rsync after powerwash (was: test_push: repair failing because devserver doesn't have rsync?)
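
As a rough sketch of how a caller might detect this state before choosing a transport: the probe below runs `rsync --version` with the same PATH prefix that appears in the `--rsync-path` option in the log above, and treats any non-zero exit status as "no rsync". `device_has_rsync` and `run_remote` are hypothetical names; `run_remote` is assumed to run a shell command on the DUT and return its exit status.

```python
# Hypothetical probe command; the PATH prefix mirrors the --rsync-path
# option seen in the autoupdate log.
RSYNC_PROBE = 'PATH=/usr/local/bin:/usr/local/sbin:$PATH rsync --version'

# Exit status the shell reports when a command does not exist.
COMMAND_NOT_FOUND = 127


def device_has_rsync(run_remote):
    """Return True only if the probe command exits with status 0.

    run_remote is an assumed callable: it takes a shell command string,
    runs it on the DUT, and returns the exit status as an int.
    """
    return run_remote(RSYNC_PROBE) == 0


def fake_test_image(command):
    # Stands in for a DUT with the test image's stateful partition intact.
    return 0


def fake_powerwashed(command):
    # Stands in for a DUT right after powerwash: rsync is gone.
    return COMMAND_NOT_FOUND
```

Here `device_has_rsync(fake_powerwashed)` is False, which is the case where the copy should go through scp.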

Comment 10 by bugdroid1@chromium.org, Mar 14 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/3398d1f358e6c3810d3026032986075d7fd97d80

commit 3398d1f358e6c3810d3026032986075d7fd97d80
Author: Ilja H. Friedel <ihf@chromium.org>
Date: Tue Mar 14 22:23:02 2017

remote_access: check if rsync is on device.

Use scp as default to copy to device.
Use rsync as default to copy from device.
For all rsync usage check if it exists on device.

BUG= chromium:701553 
TEST=pylint

Change-Id: Ic50c8fc70b32c4a9b1cd0979e630458ba51b6f7d
Reviewed-on: https://chromium-review.googlesource.com/455237
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>

[modify] https://crrev.com/3398d1f358e6c3810d3026032986075d7fd97d80/lib/remote_access.py

Comment 11 by ihf@chromium.org, Mar 14 2017

As for the unit test request: do we have a mock for the state of a DUT? The problem here was that nobody ever called the function with the parameter 'rsync', even though the function advertised support for it. So we effectively hit this with an integration test, but that was too late.

Note that xixuan was also worried about using scp, not about using rsync. Maybe the logic cleanup + comment in my change in #8 makes the situation more obvious?
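
One possible way to cover this path without a real DUT is to fake the device with `unittest.mock`. The sketch below is only an illustration: `copy_to_device` and the `has_rsync`, `rsync` and `scp` methods on the fake device are hypothetical stand-ins, not the actual chromite interfaces.

```python
from unittest import mock


def copy_to_device(device, src, dest, mode=None):
    """Hypothetical copy helper: use rsync only when the device has it."""
    if mode in (None, 'rsync') and device.has_rsync():
        device.rsync(src, dest)
    else:
        device.scp(src, dest)


def test_rsync_request_falls_back_to_scp():
    # The mock stands in for a DUT whose stateful partition was wiped,
    # so the rsync check reports False.
    device = mock.Mock()
    device.has_rsync.return_value = False

    # Explicitly request rsync, the call that nobody exercised before.
    copy_to_device(device, 'src', 'dest', mode='rsync')

    device.scp.assert_called_once_with('src', 'dest')
    device.rsync.assert_not_called()


test_rsync_request_falls_back_to_scp()
```

The point of the test is the explicit `mode='rsync'` argument: it exercises the advertised parameter against a device state that an integration run only reaches after powerwash.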

Comment 12 by ihf@chromium.org, Mar 14 2017

Can you push this change? I need to go into the lab right now and will be away from the computer for a bit.
I will start the push right now.
@xixuan: This change is needed on the devserver.

- test_push doesn't test devserver because we don't actually have a staging devserver at all.
- devservers run cros/master code, so we don't need an update to the cros/prod branch for this to go live.

So, you can just update the devservers without waiting for a test_push.
yep I always forget to tell devserver push/normal push :( 

starting devserver push.

Comment 16 by ihf@chromium.org, Mar 15 2017

Status: Fixed (was: Started)
Thank you!
Status: Verified (was: Fixed)
Pushed. Verified 100.115.219.132 and it has the changes.

Comment 18 by ihf@chromium.org, Mar 15 2017

Yes, looking good now on chromeos4-devserver3/100.115.219.131 as well.
