Project: chromium Issues People Development process History Sign in
New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.
Starred by 2 users
Status: Duplicate
Merged: issue 677572
Owner:
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment
DUT lost ethernet connectivity for 24 minutes
Project Member Reported by semenzato@chromium.org, Jan 13 2017 Back to list
This veyron_speedy CQ build:

https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_speedy-paladin/builds/4083

failed because at some point it couldn't connect to a DUT while provisioning.

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/96015394-chromeos-test/chromeos4-row4-rack12-host15/debug/

autoserv.DEBUG shows that autoserv was able to connect directly to the DUT at least as late as 14:06, and as early as 14:30.  There is also indirect evidence that the devserver could initially connect to the DUT.

01/13 14:05:47.082 DEBUG|          ssh_host:0177| Running (ssh) 'true'
01/13 14:05:47.494 DEBUG|      abstract_ssh:0577| Host chromeos4-row4-rack12-host15 is now up
01/13 14:05:47.496 DEBUG|      abstract_ssh:0339| get_file. source: /tmp/sysinfo/autoserv-SbBOkq/results/default/, dest: /usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435, delete_dest: False,preserve_perm: True, preserve_symlinks:True
01/13 14:05:47.496 DEBUG|      abstract_ssh:0350| Using Rsync.
01/13 14:05:47.497 DEBUG|        base_utils:0185| Running 'rsync --safe-links  --timeout=1800 --rsh='/usr/bin/ssh -a -x   -o ControlPath=/tmp/_autotmp_xTFC04ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos4-row4-rack12-host15:"/tmp/sysinfo/autoserv-SbBOkq/results/default/" "/usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435"'
01/13 14:05:48.522 DEBUG|          base_job:0246| Importing client.steps from state file /usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435/control.autoserv.state
01/13 14:05:48.523 DEBUG|          base_job:0246| Importing client.harness from state file /usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435/control.autoserv.state
01/13 14:05:48.524 DEBUG|          base_job:0243| Value of client.sysinfo is unchanged, skipping import
01/13 14:05:48.524 DEBUG|          base_job:0246| Importing client._record_indent from state file /usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435/control.autoserv.state
01/13 14:05:48.525 DEBUG|          base_job:0243| Value of global_properties.tag is unchanged, skipping import
01/13 14:05:48.525 DEBUG|          base_job:0243| Value of global_properties.test_retry is unchanged, skipping import
01/13 14:05:48.530 DEBUG|          base_job:0392| Persistent state client.* deleted
01/13 14:05:48.534 DEBUG|      abstract_ssh:0339| get_file. source: /tmp/autoserv-Hy3WNv/sysinfo.pickle, dest: /tmp/tmpO_pvuq, delete_dest: False,preserve_perm: True, preserve_symlinks:False
01/13 14:05:48.534 DEBUG|      abstract_ssh:0350| Using Rsync.
01/13 14:05:48.535 DEBUG|        base_utils:0185| Running 'rsync -L  --timeout=1800 --rsh='/usr/bin/ssh -a -x   -o ControlPath=/tmp/_autotmp_xTFC04ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos4-row4-rack12-host15:"/tmp/autoserv-Hy3WNv/sysinfo.pickle" "/tmp/tmpO_pvuq"'
01/13 14:05:48.971 DEBUG|provision_AutoUpda:0059| Start provisioning <remote host: chromeos4-row4-rack12-host15> to veyron_speedy-paladin/R57-9182.0.0-rc1
01/13 14:05:49.011 DEBUG|        dev_server:0623| The host chromeos4-row4-rack12-host15 (100.115.195.172) is in a restricted subnet. Try to locate a devserver inside subnet 100.115.192.0:19.
01/13 14:05:49.017 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/check_health?"''
01/13 14:05:50.292 INFO |        dev_server:1005| Staging artifacts on devserver http://100.115.219.136:8082: build=veyron_speedy-paladin/R57-9182.0.0-rc1, artifacts=['full_payload', 'stateful', 'autotest_packages'], files=, archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1
01/13 14:05:50.293 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/stage?artifacts=full_payload,stateful,autotest_packages&files=&async=True&archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1"''
01/13 14:05:51.445 DEBUG|        dev_server:0947| response for RPC: 'Success'
01/13 14:05:51.446 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/is_staged?artifacts=full_payload,stateful,autotest_packages&files=&archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1"''
01/13 14:05:52.786 DEBUG|        dev_server:0903| whether artifact is staged: 'False'
01/13 14:05:53.334 DEBUG|          monitors:0191| ts_mon: sending 9 metrics to PubSub
01/13 14:05:57.813 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/is_staged?artifacts=full_payload,stateful,autotest_packages&files=&archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1"''
01/13 14:05:58.883 DEBUG|        dev_server:0903| whether artifact is staged: 'False'
01/13 14:06:03.903 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/is_staged?artifacts=full_payload,stateful,autotest_packages&files=&archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1"''
01/13 14:06:05.118 DEBUG|        dev_server:0903| whether artifact is staged: 'False'
01/13 14:06:10.143 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/is_staged?artifacts=full_payload,stateful,autotest_packages&files=&archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1"''
01/13 14:06:11.378 DEBUG|        dev_server:0903| whether artifact is staged: 'False'
01/13 14:06:16.406 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/is_staged?artifacts=full_payload,stateful,autotest_packages&files=&archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1"''
01/13 14:06:17.850 DEBUG|        dev_server:0903| whether artifact is staged: 'True'
01/13 14:06:17.854 INFO |        dev_server:1024| Finished staging artifacts: build=veyron_speedy-paladin/R57-9182.0.0-rc1, artifacts=['full_payload', 'stateful', 'autotest_packages'], files=, archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1
01/13 14:06:17.856 INFO |        dev_server:1343| Requesting contents from devserver http://100.115.219.136:8082 for image veyron_speedy-paladin/R57-9182.0.0-rc1
01/13 14:06:17.857 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/list_image_dir?archive_url=gs://chromeos-image-archive/veyron_speedy-paladin/R57-9182.0.0-rc1"''
01/13 14:06:18.993 INFO |        dev_server:1348| Listing contents of :/home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1
01/13 14:06:18.995 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/test_suites.tar.bz2 Accessed: 2017-01-13 14:03:38.269040 Modified: 2017-01-13 14:03:38.165035 Size: 35438 bytes.
01/13 14:06:18.996 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/staged.timestamp Accessed: 2017-01-13 14:05:54.988455 Modified: 2017-01-13 14:05:54.988455 Size: 0 bytes.
01/13 14:06:18.996 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/control_files.tar Accessed: 2017-01-13 14:03:38.713064 Modified: 2017-01-13 14:03:38.597058 Size: 4341760 bytes.
01/13 14:06:18.997 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/.test_suitesDOTtarDOTbz2 Accessed: 2017-01-13 14:03:41.837234 Modified: 2017-01-13 14:03:38.305042 Size: 21237 bytes.
01/13 14:06:18.998 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/stateful.tgz Accessed: 2017-01-13 14:06:04.572975 Modified: 2017-01-13 14:06:11.017325 Size: 275329318 bytes.
01/13 14:06:18.998 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/update.gz Accessed: 2017-01-13 14:05:41.871744 Modified: 2017-01-13 14:06:00.000727 Size: 868979161 bytes.
01/13 14:06:18.999 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/.autotest_packagesDOTtar Accessed: 2017-01-13 14:04:15.271047 Modified: 2017-01-13 14:04:15.271047 Size: 0 bytes.
01/13 14:06:18.999 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/.statefulDOTtgz Accessed: 2017-01-13 14:06:11.753365 Modified: 2017-01-13 14:06:11.665360 Size: 78 bytes.
01/13 14:06:19.000 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/.control_filesDOTtar Accessed: 2017-01-13 14:03:57.594089 Modified: 2017-01-13 14:03:38.941077 Size: 323644 bytes.
01/13 14:06:19.001 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/autotest Accessed: 2017-01-13 14:03:42.973295 Modified: 2017-01-13 14:04:13.622958 Size: 4096 bytes.
01/13 14:06:19.002 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/.STAR_full_STAR Accessed: 2017-01-13 14:06:02.304852 Modified: 2017-01-13 14:06:02.192846 Size: 75 bytes.
01/13 14:06:19.002 INFO |        dev_server:1348| Name: /home/chromeos-test/images/veyron_speedy-paladin/R57-9182.0.0-rc1/autotest_packages.tar Accessed: 2017-01-13 14:04:13.622958 Modified: 2017-01-13 14:04:10.334780 Size: 1379993600 bytes.
01/13 14:06:19.004 DEBUG|provision_AutoUpda:0106| Installing image
01/13 14:06:19.420 DEBUG|         cros_host:0688| Resolving a devserver for auto-update
01/13 14:06:19.459 DEBUG|        dev_server:0623| The host chromeos4-row4-rack12-host15 (100.115.195.172) is in a restricted subnet. Try to locate a devserver inside subnet 100.115.192.0:19.
01/13 14:06:19.464 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/check_health?"''
01/13 14:06:20.992 DEBUG|         cros_host:0738| Resolved devserver for auto-update: http://100.115.219.136:8082
01/13 14:06:20.993 DEBUG|        dev_server:1847| Start CrOS auto-update for host chromeos4-row4-rack12-host15 at 1 time(s).
01/13 14:06:20.993 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/cros_au?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&async=True&force_update=True&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:06:22.636 INFO |        dev_server:1667| Received response from devserver for cros_au call: '[true, 3439]'
01/13 14:06:22.638 DEBUG|        dev_server:1782| start process 3439 for auto_update in devserver
01/13 14:06:22.638 DEBUG|        dev_server:1688| Check the progress for auto-update process 3439
01/13 14:06:22.639 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:06:23.987 DEBUG|        dev_server:1723| Current CrOS auto-update status: CrOS update is just started.
01/13 14:06:34.022 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:06:35.469 DEBUG|        dev_server:1723| Current CrOS auto-update status: Transfer Devserver/Stateful Update Package
01/13 14:06:45.507 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:06:47.064 DEBUG|        dev_server:1723| Current CrOS auto-update status: transfer rootfs update package
01/13 14:06:53.334 DEBUG|          monitors:0191| ts_mon: sending 13 metrics to PubSub
01/13 14:06:57.114 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:06:58.938 DEBUG|        dev_server:1723| Current CrOS auto-update status: transfer rootfs update package
01/13 14:07:08.989 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:07:10.387 DEBUG|        dev_server:1723| Current CrOS auto-update status: transfer rootfs update package
01/13 14:07:20.438 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:07:21.834 DEBUG|        dev_server:1723| Current CrOS auto-update status: transfer rootfs update package
01/13 14:07:31.885 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:07:33.408 DEBUG|        dev_server:1723| Current CrOS auto-update status: transfer rootfs update package
01/13 14:07:43.459 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:07:44.649 DEBUG|        dev_server:1723| Current CrOS auto-update status: pre-setup rootfs update
01/13 14:07:54.701 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:07:55.862 DEBUG|        dev_server:1723| Current CrOS auto-update status: pre-setup rootfs update


We get stuck at pre-setup rootfs update for 10m, then apparently time out:


01/13 14:17:37.312 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:17:38.290 DEBUG|        dev_server:1723| Current CrOS auto-update status: pre-setup rootfs update
01/13 14:17:48.338 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/get_au_status?build_name=veyron_speedy-paladin/R57-9182.0.0-rc1&force_update=True&pid=3439&host_name=chromeos4-row4-rack12-host15&full_update=False"''
01/13 14:17:49.329 DEBUG|        dev_server:1785| Failed to trigger auto-update process on devserver
01/13 14:17:49.331 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/handler_cleanup?pid=3439&host_name=chromeos4-row4-rack12-host15"''
01/13 14:17:50.504 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.136 'curl "http://100.115.219.136:8082/collect_cros_au_log?pid=3439&host_name=chromeos4-row4-rack12-host15"''
01/13 14:17:51.651 DEBUG|        dev_server:1617| Saving auto-update logs into /usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435/autoupdate_logs/CrOS_update_chromeos4-row4-rack12-host15_3439.log
01/13 14:17:51.652 DEBUG|        dev_server:1884| Exception raised on auto_update attempt #1:
 Traceback (most recent call last):
  File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 227, in TriggerAU
    self._WriteAUStatus(cros_update_progress.FINISHED)
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 561, in __exit__
    self.device.Cleanup()
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 684, in Cleanup
    self.BaseRunCommand(cmd, **kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 849, in BaseRunCommand
    return self.GetAgent().RemoteSh(cmd, **kwargs)
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 343, in RemoteSh
    raise SSHConnectionError(e.result.error)
SSHConnectionError: ssh: connect to host chromeos4-row4-rack12-host15 port 22: Connection timed out

After a couple of timeouts (see full log for details) we try to restart the ssh master, and that fails too:

01/13 14:18:40.840 DEBUG|          monitors:0191| ts_mon: sending 14 metrics to PubSub
01/13 14:18:40.842 DEBUG|   logging_manager:0627| Logging subprocess finished
01/13 14:18:40.855 DEBUG|   logging_manager:0627| Logging subprocess finished
01/13 14:18:40.865 DEBUG|          ssh_host:0177| Running (ssh) 'rm -rf /tmp/sysinfo/autoserv-SbBOkq'
01/13 14:19:40.841 DEBUG|          monitors:0191| ts_mon: sending 15 metrics to PubSub
01/13 14:30:19.790 DEBUG|        base_utils:0280| [stderr] mux_client_request_session: read from master failed: Broken pipe
01/13 14:30:20.119 DEBUG|          ssh_host:0177| Running (ssh) 'rm -rf "/tmp/sysinfo/autoserv-SbBOkq"'
01/13 14:30:20.136 INFO |      abstract_ssh:0774| Master ssh connection to chromeos4-row4-rack12-host15 is down.
01/13 14:30:20.136 DEBUG|      abstract_ssh:0735| Nuking master_ssh_job.
01/13 14:30:20.137 DEBUG|      abstract_ssh:0741| Cleaning master_ssh_tempdir.
01/13 14:30:20.138 INFO |      abstract_ssh:0788| Starting master ssh connection '/usr/bin/ssh -a -x   -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_QO521sssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row4-rack12-host15'
01/13 14:30:20.138 DEBUG|        base_utils:0185| Running '/usr/bin/ssh -a -x   -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_QO521sssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row4-rack12-host15'
01/13 14:30:21.229 DEBUG|          ssh_host:0177| Running (ssh) 'rm -rf "/tmp/autoserv-Hy3WNv"'
01/13 14:30:21.631 DEBUG|          ssh_host:0177| Running (ssh) 'rm -rf "/tmp/autoserv-h2elNl"'
01/13 14:30:22.040 DEBUG|      abstract_ssh:0735| Nuking master_ssh_job.
01/13 14:30:23.047 DEBUG|      abstract_ssh:0741| Cleaning master_ssh_tempdir.
01/13 14:30:23.048 INFO |        server_job:0153| 		FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1484346623	localtime=Jan 13 14:30:23	Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row4-rack12-host15: SSHConnectionError: ssh: connect to host chromeos4-row4-rack12-host15 port 22: Connection timed out

then the logging manager barfs a lot of non-status lines, then we get this:


01/13 14:30:23.342 INFO |      crashcollect:0154| Collecting crash information...
01/13 14:30:23.343 DEBUG|          ssh_host:0177| Running (ssh) 'python -c 'import cPickle, glob, sys;cPickle.dump(glob.glob(sys.argv[1]), sys.stdout, 0)''
01/13 14:30:23.367 INFO |      abstract_ssh:0788| Starting master ssh connection '/usr/bin/ssh -a -x   -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_OSshP3ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row4-rack12-host15'
01/13 14:30:23.367 DEBUG|        base_utils:0185| Running '/usr/bin/ssh -a -x   -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_OSshP3ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row4-rack12-host15'
01/13 14:30:24.529 INFO | site_crashcollect:0212| There are no orphaned crashes; deleting /usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435/crashinfo.chromeos4-row4-rack12-host15
01/13 14:30:24.539 DEBUG|          ssh_host:0177| Running (ssh) 'true'
01/13 14:30:24.544 DEBUG|          monitors:0191| ts_mon: sending 16 metrics to PubSub
01/13 14:30:24.962 INFO |      crashcollect:0206| chromeos4-row4-rack12-host15 already up, collecting crash info
01/13 14:30:24.963 INFO |      crashcollect:0068| Collecting /var/tmp/messages.autotest_start...
01/13 14:30:24.964 DEBUG|          ssh_host:0177| Running (ssh) 'ls -ld /var/tmp/messages.autotest_start | cut -d" " -f5'
01/13 14:30:25.413 DEBUG|        base_utils:0280| [stdout] 453179
01/13 14:30:25.453 DEBUG|      abstract_ssh:0339| get_file. source: /var/tmp/messages.autotest_start, dest: /usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435/crashinfo.chromeos4-row4-rack12-host15/messages.at_start, delete_dest: False,preserve_perm: False, preserve_symlinks:False
01/13 14:30:25.454 DEBUG|          ssh_host:0177| Running (ssh) 'rsync --version'
01/13 14:30:25.921 DEBUG|      abstract_ssh:0350| Using Rsync.
01/13 14:30:25.923 DEBUG|        base_utils:0185| Running 'rsync -L  --timeout=1800 --rsh='/usr/bin/ssh -a -x   -o ControlPath=/tmp/_autotmp_OSshP3ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos4-row4-rack12-host15:"/var/tmp/messages.autotest_start" "/usr/local/autotest/results/hosts/chromeos4-row4-rack12-host15/2337579-provision/20171301140435/crashinfo.chromeos4-row4-rack12-host15/messages.at_start"'
01/13 14:30:26.515 INFO |      crashcollect:0068| Collecting /var/log/messages...
01/13 14:30:26.516 DEBUG|          ssh_host:0177| Running (ssh) 'ls -ld /var/log/messages | cut -d" " -f5'
01/13 14:30:26.928 DEBUG|        base_utils:0280| [stdout] 690582

so the DUT is clearly up again.





 
Owner: dgarr...@chromium.org
As often is the case, the syslog in the bucket ends just before the interesting time interval.

Furthermore, the DUT doesn't ping right now.

Is the DUT bad?

Comment 2 by snanda@chromium.org, Jan 13 2017
console-ramoops indicates unhappiness with eth0 perhaps?

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/96015394-chromeos-test/chromeos4-row4-rack12-host15/crashinfo.chromeos4-row4-rack12-host15/


[ 1074.233132] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:02
[ 1074.233475] usbcore: registered new interface driver smsc95xx
[ 1074.321719] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1074.321961] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1104.529930] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1104.831550] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1104.831928] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1164.942310] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1165.001584] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1165.002480] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1195.188579] usbcore: deregistering interface driver smsc95xx
[ 1195.188701] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[ 1195.188768] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1196.244306] usb 2-1.1: authorized to connect
[ 1196.252798] smsc95xx v1.0.4
[ 1196.312655] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:02
[ 1196.312843] usbcore: registered new interface driver smsc95xx
[ 1196.391713] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1196.392219] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1226.510472] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1226.841554] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1226.841955] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Comment 3 by snanda@chromium.org, Jan 13 2017
We may want to merge into the other ethernet interface not coming up bug you have Luigi.

One thought I was having was to dump some ethtool information in the check_ethernet.hook to help understand what could be going on here. Thoughts? 

https://chromium.googlesource.com/chromiumos/platform/crostestutils/+/refs/heads/stabilize-7834.66.B/recover_duts/hooks/check_ethernet.hook
Cc: haoweiw@chromium.org xixuan@chromium.org johndhong@chromium.org
The lab team is telling me they have sporadic network issues today that they are trying to track down.
Thank you Sameer.  I forgot to check the crashinfo folder.  I find it hard to forgive myself for this :)

That issue is bug 677572.  I am not duplicating because that one happened at reboot, but it could very well be related.

Don: thank you for the info, but these issues seem to be device specific.  The interface is not coming up when it should.  This is not very likely to be due to external network events.
Cc: -johndhong@chromium.org englab-sys-cros@google.com
I suppose it's possibly a bad NIC?
Thank you John, no we think the NIC is fine but the networking software had a glitch.
Comment 8 by snanda@chromium.org, Jan 14 2017
Actually for this particular instance the NIC (or cable) _may_ be at faul since the link is not coming up:

[ 1196.312843] usbcore: registered new interface driver smsc95xx
[ 1196.391713] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1196.392219] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1226.510472] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1226.841554] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1226.841955] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

Owner: dshi@chromium.org
Passing to current deputy to decide if there is any more useful investigation to do.
Owner: semenzato@chromium.org
Summary: DUT lost ethernet connectivity for 24 minutes (was: what was chromeos4-row4-rack12-host15 doing between 14:06 and 14:30?)
Mergedinto: 677572
Status: Duplicate
Merging into a single bug since it seems the same.
Sign in to add a comment