
Issue 659235

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Oct 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




Jobs in cassandra failing with All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082']).

Project Member Reported by ka...@chromium.org, Oct 25 2016

Issue description

Comment 1 by ka...@chromium.org, Oct 25 2016

10/25 09:33:15.688 DEBUG|        dev_server:0578| The host chromeos1-row1-rack3-host3 (172.27.212.69) is in a restricted subnet. Try to locate a devserver inside subnet 172.27.212.0:22.
10/25 09:33:15.694 DEBUG|        base_utils:0185| Running 'ssh 172.27.215.248 'curl "http://172.27.215.248:8082/check_health?"''
10/25 09:34:15.694 ERROR|        dev_server:0308| Devserver call failed: "http://172.27.215.248:8082/check_health?", timeout: 60 seconds, Error: Call is timed out.
10/25 09:34:15.698 DEBUG|        base_utils:0185| Running 'ssh 172.27.215.252 'curl "http://172.27.215.252:8082/check_health?"''
10/25 09:34:15.916 DEBUG|        dev_server:0801| Error occurred with exit_code 255 when executing the ssh call: ssh_exchange_identification: Connection closed by remote host
.
10/25 09:34:15.918 WARNI|             retry:0181| <class 'autotest_lib.client.common_lib.error.CmdError'>(Command <ssh 172.27.215.252 'curl "http://172.27.215.252:8082/check_health?"'> failed, rc=255, Command returned non-zero exit status
* Command: 
    ssh 172.27.215.252 'curl "http://172.27.215.252:8082/check_health?"'
Exit status: 255
Duration: 0.136443138123

stderr:
ssh_exchange_identification: Connection closed by remote host)
10/25 09:34:15.923 WARNI|             retry:0148| Retrying in 1.958943 seconds...
10/25 09:34:17.893 DEBUG|        base_utils:0185| Running 'ssh 172.27.215.252 'curl "http://172.27.215.252:8082/check_health?"''
10/25 09:35:16.893 ERROR|        dev_server:0308| Devserver call failed: "http://172.27.215.252:8082/check_health?", timeout: 60 seconds, Error: Call is timed out.
10/25 09:35:16.897 DEBUG|        base_utils:0185| Running 'ssh 172.27.215.249 'curl "http://172.27.215.249:8082/check_health?"''
10/25 09:36:16.898 ERROR|        dev_server:0308| Devserver call failed: "http://172.27.215.249:8082/check_health?", timeout: 60 seconds, Error: Call is timed out.
10/25 09:36:16.900 ERROR|        dev_server:0624| All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082']). dut hostname: chromeos1-row1-rack3-host3
10/25 09:36:16.901 WARNI|              test:0606| Autotest caught exception when running test:
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 600, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 804, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 461, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 376, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 127, in run_once
    raise error.TestFail(str(e))
TestFail: All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082']). dut hostname: chromeos1-row1-rack3-host3
10/25 09:36:16.904 DEBUG|   logging_manager:0627| Logging subprocess finished
10/25 09:36:16.918 DEBUG|   logging_manager:0627| Logging subprocess finished
10/25 09:36:16.928 DEBUG|          ssh_host:0180| Running (ssh) 'rm -rf /tmp/sysinfo/autoserv-iVICty'
10/25 09:36:17.235 DEBUG|          ssh_host:0180| Running (ssh) 'rm -rf "/tmp/sysinfo/autoserv-iVICty"'
10/25 09:36:17.499 DEBUG|          ssh_host:0180| Running (ssh) 'rm -rf "/tmp/autoserv-J73Q5u"'
10/25 09:36:17.780 DEBUG|          ssh_host:0180| Running (ssh) 'rm -rf "/tmp/autoserv-AruNPH"'
10/25 09:36:18.057 DEBUG|      abstract_ssh:0713| Nuking master_ssh_job.
10/25 09:36:19.064 DEBUG|      abstract_ssh:0719| Cleaning master_ssh_tempdir.
10/25 09:36:19.066 INFO |        server_job:0153| 		FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1477413379	localtime=Oct 25 09:36:19	All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082']). dut hostname: chromeos1-row1-rack3-host3
10/25 09:36:19.067 INFO |        server_job:0153| 	END FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1477413379	localtime=Oct 25 09:36:19	
10/25 09:36:19.068 ERROR|           control:0071| 
Traceback (most recent call last):
  File "/usr/local/autotest/results/hosts/chromeos1-row1-rack3-host3/1921995-provision/20162510093157/control.srv", line 47, in provision_machine
    provision.Provision)
  File "/usr/local/autotest/server/cros/provision.py", line 320, in run_special_task_actions
    raise SpecialTaskActionException()
SpecialTaskActionException
10/25 09:36:19.070 INFO |        server_job:0153| END FAIL	----	provision	timestamp=1477413379	localtime=Oct 25 09:36:19	
10/25 09:36:19.071 ERROR|        server_job:0788| Exception escaped control file, job aborting:
Traceback (most recent call last):
  File "/usr/local/autotest/server/server_job.py", line 780, in run
    self._execute_code(server_control_file, namespace)
  File "/usr/local/autotest/server/server_job.py", line 1280, in _execute_code
    execfile(code_file, namespace, namespace)
  File "/usr/local/autotest/results/hosts/chromeos1-row1-rack3-host3/1921995-provision/20162510093157/control.srv", line 105, in <module>
    job.parallel_simple(provision_machine, machines)
  File "/usr/local/autotest/server/server_job.py", line 604, in parallel_simple
    return_results=return_results)
  File "/usr/local/autotest/server/subcommand.py", line 93, in parallel_simple
    function(arg)
  File "/usr/local/autotest/results/hosts/chromeos1-row1-rack3-host3/1921995-provision/20162510093157/control.srv", line 96, in provision_machine
    raise Exception('')
Exception
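
For anyone reading the log above: each candidate devserver is probed in turn with an ssh + curl call to check_health, and provisioning fails only after every candidate has failed its probe. Below is a minimal sketch of that selection loop. It is not the actual autotest dev_server code. The names pick_healthy_devserver, check_health_via_ssh and AllDevserversDownError are hypothetical, and it is written for Python 3 although the lab code in the traceback is Python 2. The 60-second timeout and the command shape are taken from the log lines.

```python
import subprocess
from urllib.parse import urlparse


class AllDevserversDownError(Exception):
    """Hypothetical error: no candidate devserver passed its health check."""


def check_health_via_ssh(devserver_url, timeout=60):
    """Hypothetical probe mirroring the command seen in the log.

    Runs: ssh <host> 'curl "<devserver_url>/check_health?"'
    Returns True only if the command exits 0 within `timeout` seconds.
    """
    host = urlparse(devserver_url).hostname
    cmd = ['ssh', host, 'curl "%s/check_health?"' % devserver_url]
    try:
        subprocess.run(cmd, check=True, capture_output=True, timeout=timeout)
        return True
    except (subprocess.CalledProcessError,
            subprocess.TimeoutExpired,
            OSError):
        # Non-zero exit (e.g. rc=255 from ssh), timeout, or ssh not found.
        return False


def pick_healthy_devserver(devservers, probe=check_health_via_ssh):
    """Return the first devserver whose probe succeeds.

    Raises AllDevserversDownError listing every candidate if none respond.
    """
    down = []
    for url in devservers:
        if probe(url):
            return url
        down.append(url)
    raise AllDevserversDownError(
        'All devservers are currently down: %s' % sorted(down))
```

The probe is passed in as a parameter so the loop can be exercised without touching the lab network, for example pick_healthy_devserver(urls, probe=lambda url: False) to reproduce the "all down" path.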

Comment 2 by ka...@chromium.org, Oct 25 2016

Summary: Jobs in cassandra failing with All devservers are currently down: set(['http://172.27.215.252:8082', 'http://172.27.215.249:8082', 'http://172.27.215.248:8082']). (was: Jobs in cassandra failing with AutoservVerifyError: Python is missing; may be caused by powerwash)

Comment 3 by ka...@chromium.org, Oct 25 2016

Labels: Hotlist-CrOS-DevServerLoad

Comment 4 by ka...@chromium.org, Oct 25 2016

Labels: -Hotlist-CrOS-DevServerLoad LabDevServer

Comment 5 by ka...@chromium.org, Oct 25 2016

Cc: dschimmels@chromium.org

Comment 6 by xixuan@chromium.org, Oct 25 2016

Cc: jashur@chromium.org
Owner: jashur@chromium.org
These devservers are maintained by the Android labs.

cc @jashur @dschimmels; assigning to @jashur.

Comment 7 by jashur@chromium.org, Oct 25 2016

All devservers are up. We moved the devserver to Crux since Cassandra may have been having power issues.

Comment 8 by ka...@chromium.org, Oct 25 2016

Status: Verified (was: Untriaged)
This issue is resolved.
