Jobs in Cassandra failing with "All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082'])."
Issue description
Oct 25 2016
The devserver is maintained by Android Labs; cc @jashur @dschimmels, assigning to @jashur.
Oct 25 2016
All devservers are up. We moved the devserver to Crux since Cassandra may have been having power issues.
Oct 25 2016
This issue is resolved.
Comment 1 by ka...@chromium.org, Oct 25 2016
10/25 09:33:15.688 DEBUG| dev_server:0578| The host chromeos1-row1-rack3-host3 (172.27.212.69) is in a restricted subnet. Try to locate a devserver inside subnet 172.27.212.0:22.
10/25 09:33:15.694 DEBUG| base_utils:0185| Running 'ssh 172.27.215.248 'curl "http://172.27.215.248:8082/check_health?"''
10/25 09:34:15.694 ERROR| dev_server:0308| Devserver call failed: "http://172.27.215.248:8082/check_health?", timeout: 60 seconds, Error: Call is timed out.
10/25 09:34:15.698 DEBUG| base_utils:0185| Running 'ssh 172.27.215.252 'curl "http://172.27.215.252:8082/check_health?"''
10/25 09:34:15.916 DEBUG| dev_server:0801| Error occurred with exit_code 255 when executing the ssh call: ssh_exchange_identification: Connection closed by remote host .
10/25 09:34:15.918 WARNI| retry:0181| <class 'autotest_lib.client.common_lib.error.CmdError'>(Command <ssh 172.27.215.252 'curl "http://172.27.215.252:8082/check_health?"'> failed, rc=255, Command returned non-zero exit status
* Command: ssh 172.27.215.252 'curl "http://172.27.215.252:8082/check_health?"'
Exit status: 255
Duration: 0.136443138123
stderr: ssh_exchange_identification: Connection closed by remote host)
10/25 09:34:15.923 WARNI| retry:0148| Retrying in 1.958943 seconds...
10/25 09:34:17.893 DEBUG| base_utils:0185| Running 'ssh 172.27.215.252 'curl "http://172.27.215.252:8082/check_health?"''
10/25 09:35:16.893 ERROR| dev_server:0308| Devserver call failed: "http://172.27.215.252:8082/check_health?", timeout: 60 seconds, Error: Call is timed out.
10/25 09:35:16.897 DEBUG| base_utils:0185| Running 'ssh 172.27.215.249 'curl "http://172.27.215.249:8082/check_health?"''
10/25 09:36:16.898 ERROR| dev_server:0308| Devserver call failed: "http://172.27.215.249:8082/check_health?", timeout: 60 seconds, Error: Call is timed out.
10/25 09:36:16.900 ERROR| dev_server:0624| All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082']).
dut hostname: chromeos1-row1-rack3-host3
10/25 09:36:16.901 WARNI| test:0606| Autotest caught exception when running test:
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 600, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 804, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 461, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 376, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 127, in run_once
    raise error.TestFail(str(e))
TestFail: All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082']).
dut hostname: chromeos1-row1-rack3-host3
10/25 09:36:16.904 DEBUG| logging_manager:0627| Logging subprocess finished
10/25 09:36:16.918 DEBUG| logging_manager:0627| Logging subprocess finished
10/25 09:36:16.928 DEBUG| ssh_host:0180| Running (ssh) 'rm -rf /tmp/sysinfo/autoserv-iVICty'
10/25 09:36:17.235 DEBUG| ssh_host:0180| Running (ssh) 'rm -rf "/tmp/sysinfo/autoserv-iVICty"'
10/25 09:36:17.499 DEBUG| ssh_host:0180| Running (ssh) 'rm -rf "/tmp/autoserv-J73Q5u"'
10/25 09:36:17.780 DEBUG| ssh_host:0180| Running (ssh) 'rm -rf "/tmp/autoserv-AruNPH"'
10/25 09:36:18.057 DEBUG| abstract_ssh:0713| Nuking master_ssh_job.
10/25 09:36:19.064 DEBUG| abstract_ssh:0719| Cleaning master_ssh_tempdir.
10/25 09:36:19.066 INFO | server_job:0153| FAIL provision_AutoUpdate provision_AutoUpdate timestamp=1477413379 localtime=Oct 25 09:36:19 All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082']).
dut hostname: chromeos1-row1-rack3-host3
10/25 09:36:19.067 INFO | server_job:0153| END FAIL provision_AutoUpdate provision_AutoUpdate timestamp=1477413379 localtime=Oct 25 09:36:19
10/25 09:36:19.068 ERROR| control:0071| Traceback (most recent call last):
  File "/usr/local/autotest/results/hosts/chromeos1-row1-rack3-host3/1921995-provision/20162510093157/control.srv", line 47, in provision_machine
    provision.Provision)
  File "/usr/local/autotest/server/cros/provision.py", line 320, in run_special_task_actions
    raise SpecialTaskActionException()
SpecialTaskActionException
10/25 09:36:19.070 INFO | server_job:0153| END FAIL ---- provision timestamp=1477413379 localtime=Oct 25 09:36:19
10/25 09:36:19.071 ERROR| server_job:0788| Exception escaped control file, job aborting:
Traceback (most recent call last):
  File "/usr/local/autotest/server/server_job.py", line 780, in run
    self._execute_code(server_control_file, namespace)
  File "/usr/local/autotest/server/server_job.py", line 1280, in _execute_code
    execfile(code_file, namespace, namespace)
  File "/usr/local/autotest/results/hosts/chromeos1-row1-rack3-host3/1921995-provision/20162510093157/control.srv", line 105, in <module>
    job.parallel_simple(provision_machine, machines)
  File "/usr/local/autotest/server/server_job.py", line 604, in parallel_simple
    return_results=return_results)
  File "/usr/local/autotest/server/subcommand.py", line 93, in parallel_simple
    function(arg)
  File "/usr/local/autotest/results/hosts/chromeos1-row1-rack3-host3/1921995-provision/20162510093157/control.srv", line 96, in provision_machine
    raise Exception('')
Exception
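For triaging a log like the one above, it can help to pull out which devservers failed their health check and why. The following is a minimal sketch, not part of autotest: the function name `summarize_devserver_failures` and the regular expressions are assumptions based only on the "Devserver call failed" line format seen in this log.

```python
import re

# Assumed pattern, derived from the dev_server ERROR lines in this log only.
FAILED_CALL = re.compile(
    r'Devserver call failed: "(?P<url>[^"]+)", '
    r'timeout: (?P<timeout>\d+) seconds, Error: (?P<error>.+?)\s*$'
)
# Extracts the host part of a devserver URL such as http://host:8082/...
HOST = re.compile(r'https?://(?P<host>[^/:]+)')


def summarize_devserver_failures(log_lines):
    """Return {host: [(timeout_seconds, error_text), ...]} for failed calls.

    Hypothetical helper for illustration; it only understands the
    'Devserver call failed' lines shown in the log above.
    """
    failures = {}
    for line in log_lines:
        match = FAILED_CALL.search(line)
        if not match:
            continue  # Not a devserver failure line.
        host = HOST.match(match.group('url')).group('host')
        failures.setdefault(host, []).append(
            (int(match.group('timeout')), match.group('error'))
        )
    return failures


# Sample lines copied from the log in this comment.
SAMPLE_LOG = [
    '10/25 09:34:15.694 ERROR| dev_server:0308| Devserver call failed: '
    '"http://172.27.215.248:8082/check_health?", timeout: 60 seconds, '
    'Error: Call is timed out.',
    '10/25 09:34:17.893 DEBUG| base_utils:0185| Running ssh health check',
    '10/25 09:35:16.893 ERROR| dev_server:0308| Devserver call failed: '
    '"http://172.27.215.252:8082/check_health?", timeout: 60 seconds, '
    'Error: Call is timed out.',
    '10/25 09:36:16.898 ERROR| dev_server:0308| Devserver call failed: '
    '"http://172.27.215.249:8082/check_health?", timeout: 60 seconds, '
    'Error: Call is timed out.',
]

failures = summarize_devserver_failures(SAMPLE_LOG)
for host in sorted(failures):
    print(host, failures[host])
```

On this log, every devserver in the restricted subnet shows the same 60-second timeout, which is consistent with the hosts themselves being unreachable rather than a single devserver process misbehaving.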