failures during devserver/DUT interactions are difficult to debug |
||
Issue descriptionFor instance, asuka-release build 658 failed with this error reported in the waterfall: [Auto-Bug]: provision: FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos2-row7-rack6-host5: SSHConnectionError: ssh: connect to host chromeos2-row7-rack6-host5 port 22: Connection timed out, 10 reports https://uberchromegw.corp.google.com/i/chromeos/builders/asuka-release/builds/658 https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/90246492-chromeos-test/chromeos2-row7-rack6-host5/ The autoserv log shows this: 12/12 05:52:30.166 DEBUG| base_utils:0185| Running 'ssh 100.115.245.197 'curl "http://100.115.245.197:8082/get_au_status?build_name=asuka-release/R57-9077.0.0&force_update=True&pid=26854&host_name=chromeos2-row7-rack6-host5&full_update=False"'' 12/12 05:52:31.289 DEBUG| dev_server:1706| Current CrOS auto-update status: Transfer Devserver/Stateful Update Package 12/12 05:52:41.340 DEBUG| base_utils:0185| Running 'ssh 100.115.245.197 'curl "http://100.115.245.197:8082/get_au_status?build_name=asuka-release/R57-9077.0.0&force_update=True&pid=26854&host_name=chromeos2-row7-rack6-host5&full_update=False"'' 12/12 05:52:42.463 DEBUG| dev_server:1706| Current CrOS auto-update status: Transfer Devserver/Stateful Update Package 12/12 05:52:52.515 DEBUG| base_utils:0185| Running 'ssh 100.115.245.197 'curl "http://100.115.245.197:8082/get_au_status?build_name=asuka-release/R57-9077.0.0&force_update=True&pid=26854&host_name=chromeos2-row7-rack6-host5&full_update=False"'' 12/12 05:52:53.589 DEBUG| dev_server:1768| Failed to trigger auto-update process on devserver 12/12 05:52:53.591 DEBUG| base_utils:0185| Running 'ssh 100.115.245.197 'curl "http://100.115.245.197:8082/handler_cleanup?pid=26854&host_name=chromeos2-row7-rack6-host5"'' 12/12 05:52:54.748 DEBUG| base_utils:0185| Running 'ssh 100.115.245.197 'curl "http://100.115.245.197:8082/collect_cros_au_log?pid=26854&host_name=chromeos2-row7-rack6-host5"'' 12/12 05:52:55.870 DEBUG| dev_server:1600| Saving auto-update logs into /usr/local/autotest/results/hosts/chromeos2-row7-rack6-host5/59139905-provision/autoupdate_logs/CrOS_update_chromeos2-row7-rack6-host5_26854.log 12/12 05:52:55.872 DEBUG| dev_server:1867| Exception raised on auto_update attempt #1: Traceback (most recent call last): File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 227, in TriggerAU self._WriteAUStatus(cros_update_progress.FINISHED) File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 542, in __exit__ self.device.Cleanup() File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 665, in Cleanup self.BaseRunCommand(cmd, **kwargs) File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 830, in BaseRunCommand return self.GetAgent().RemoteSh(cmd, **kwargs) File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 340, in RemoteSh raise SSHConnectionError(e.result.error) SSHConnectionError: ssh: connect to host chromeos2-row7-rack6-host5 port 22: Connection timed out 12/12 05:52:55.873 DEBUG| base_utils:0185| Running 'ssh 100.115.245.197 'curl "http://100.115.245.197:8082/kill_au_proc?host_name=chromeos2-row7-rack6-host5"'' 12/12 05:53:17.158 DEBUG| dev_server:1885| AU failed, trying IP instead of hostname: 100.115.230.58 12/12 05:53:17.159 DEBUG| dev_server:1830| Start CrOS auto-update for host chromeos2-row7-rack6-host5 at 2 time(s). 12/12 05:53:17.161 DEBUG| base_utils:0185| Running 'ssh 100.115.245.197 'curl "http://100.115.245.197:8082/cros_au?build_name=asuka-release/R57-9077.0.0&async=True&force_update=True&host_name=100.115.230.58&full_update=False"'' 12/12 05:53:18.313 INFO | dev_server:1650| Received response from devserver for cros_au call: '[true, 29863]' 12/12 05:53:18.315 DEBUG| dev_server:1765| start process 29863 for auto_update in devserver (then it times out again) The "get_au_status" queries report "Transfer Devserver/Stafeful Update Package". But is the package being transferred INTO the devserver or FROM the devserver into the DUT? (I would think the latter, but cannot tell for sure.) Also, is the transfer making any progress, or is it just sitting there waiting for the timeout? It's possible that the DUT was completely dead at the time, in which case there isn't much to do. But couldn't that be clearly reported? Or is it clear enough but I don't get it?
,
Dec 14 2016
Every message here are all 'devserver does something to the DUT' since it's the process of 'devserver triggers auto-update on a DUT'. Before triggering the auto-update, devserver will first stage packages from google storage. It's designed that we should check the log in autoupdate_logs (CrOS_update_chromeos2-row7-rack6-host5_26854.log in this case) to see how is the transfer going. 2016/12/12 05:48:43.676 DEBUG| cros_build_lib:0564| RunCommand: rsync --perms --verbose --times --compress --omit-dir-times --exclude .svn --links --rsync-path 'PATH=/usr/local/bin:/usr/local/sbin:$PATH rsync' --recursive --rsh 'ssh -p 22 -oConnectionAttempts=4 -oUserKnownHostsFile=/dev/null -oProtocol=2 -oConnectTimeout=30 -oServerAliveCountMax=3 -oStrictHostKeyChecking=no -oServerAliveInterval=10 -oNumberOfPasswordPrompts=0 -oIdentitiesOnly=yes -i /tmp/ssh-tmpPf8ogP/testing_rsa' /home/chromeos-test/chromiumos/src/platform/dev/stateful_update '[root@chromeos2-row7-rack6-host5]:/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tsePkrVsoO/' ssh: connect to host chromeos2-row7-rack6-host5 port 22: Connection timed out rsync: connection unexpectedly closed (0 bytes received so far) [sender] rsync error: error in rsync protocol data stream (code 12) at io.c(226) [sender=3.1.0] 2016/12/12 05:50:46.853 DEBUG| retry_util:0137| <class 'chromite.lib.cros_build_lib.RunCommandError'>(return code: 12; command: rsync --perms --verbose --times --compress --omit-dir-times --exclude .svn --links --rsync-path 'PATH=/usr/local/bin:/usr/local/sbin:$PATH rsync' --recursive --rsh 'ssh -p 22 -oConnectionAttempts=4 -oUserKnownHostsFile=/dev/null -oProtocol=2 -oConnectTimeout=30 -oServerAliveCountMax=3 -oStrictHostKeyChecking=no -oServerAliveInterval=10 -oNumberOfPasswordPrompts=0 -oIdentitiesOnly=yes -i /tmp/ssh-tmpPf8ogP/testing_rsa' /home/chromeos-test/chromiumos/src/platform/dev/stateful_update '[root@chromeos2-row7-rack6-host5]:/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tsePkrVsoO/' Could not copy /home/chromeos-test/chromiumos/src/platform/dev/stateful_update to device.)
,
Dec 14 2016
Thank you Xixuan. I had not noticed the autoupdate_logs.tgz file in the bucket. I am not sure that I can suggest a simple solution to this, so please feel free to close. I am leaving it open for the moment in case anybody has further comments.
,
Jan 4 2017
|
||
►
Sign in to add a comment |
||
Comment 1 by autumn@chromium.org
, Dec 13 2016