Issue 755267 link

Issue metadata

Status: Archived
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




Change global_config to enable ssh tunnel for servo by default

Project Member Reported by ka...@chromium.org, Aug 14 2017

Issue description

Not sure when this started - I had recovered my Goobuntu station a few weeks ago and was OOO for the past week.
I tried from other Test team members' machines, and it failed the same way.

The DUT and servo are pingable, and I am able to log in through ssh successfully.

test_that fails:



(cr) ((321e2f8...)) kalin@kalin ~/trunk/src/scripts $ test_that --autotest_dir ~/trunk/src/third_party/autotest/files/ --board=peppy  chromeos1-row1-rack4-host2 platform_ExternalUsbPeripherals.detect.login_unplug_plug
INFO:root:Identity added: /tmp/test_that_results_jWIJfB/testing_rsa (/tmp/test_that_results_jWIJfB/testing_rsa)
12:36:00 INFO | Began logging to /tmp/test_that_results_jWIJfB
Adding labels [u'cros-version:ad_hoc_build', u'board:peppy'] to host chromeos1-row1-rack4-host2
14:36:00 INFO | Fetching suite for job named platform_ExternalUsbPeripherals.detect.login_unplug_plug...
14:36:05 INFO | Scheduling suite for job named platform_ExternalUsbPeripherals.detect.login_unplug_plug...
14:36:05 INFO | ... scheduled 1 job(s).
14:36:06 INFO | autoserv| Results placed in /tmp/test_that_results_jWIJfB/results-1-platform_ExternalUsbPeripherals.detect.login_unplug_plug
14:36:06 INFO | autoserv| Logged pid 12915 to /tmp/test_that_results_jWIJfB/results-1-platform_ExternalUsbPeripherals.detect.login_unplug_plug/.autoserv_execute
14:36:06 INFO | autoserv| Starting new HTTP connection (1): metadata.google.internal
14:36:06 INFO | autoserv| Configuration file does not exist, ignoring: /etc/chrome-infra/ts-mon.json
14:36:06 INFO | autoserv| ts_mon monitoring is disabled because the endpoint provided is invalid or not supported:
14:36:06 INFO | autoserv| ts_mon was set up.
14:36:06 INFO | autoserv| I am PID 12915
14:36:06 INFO | autoserv| Starting master ssh connection '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_toD8OWssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos1-row1-rack4-host2'
14:36:11 INFO | autoserv| get_network_stats: at-start RXbytes 6948907 TXbytes 871208
14:36:11 INFO | autoserv| Not checking if job_repo_url contains autotest packages on ['chromeos1-row1-rack4-host2']
14:36:11 INFO | autoserv| Processing control file
14:36:11 INFO | autoserv| Verifying this condition: host is available via ssh
14:36:11 INFO | autoserv| Starting master ssh connection '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_dJ3Fdvssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos1-row1-rack4-host2-servo'
14:36:12 INFO | autoserv| No failed triggers, skipping repair:  Power cycle the host with RPM
14:36:12 INFO | autoserv| Verifying this condition: servo BOARD setting is correct
14:36:12 INFO | autoserv| Verifying this condition: servo SERIAL setting is correct
14:36:12 INFO | autoserv| Verifying this condition: servod upstart job is running
14:36:12 INFO | autoserv| Verifying this condition: servod service is taking calls
14:36:12 INFO | autoserv| Failed: servod service is taking calls
14:36:12 INFO | autoserv| Traceback (most recent call last):
14:36:12 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/client/common_lib/hosts/repair.py", line 329, in _verify_host
14:36:12 INFO | autoserv| self.verify(host)
14:36:12 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/server/hosts/servo_repair.py", line 211, in verify
14:36:12 INFO | autoserv| host.connect_servo()
14:36:12 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/server/hosts/servo_host.py", line 133, in connect_servo
14:36:12 INFO | autoserv| timeout_sec=self.INITIALIZE_SERVO_TIMEOUT_SECS)
14:36:12 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/client/common_lib/cros/retry.py", line 123, in timeout
14:36:12 INFO | autoserv| default_result = func(*args, **kwargs)
14:36:12 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/server/cros/servo/servo.py", line 218, in initialize_dut
14:36:12 INFO | autoserv| self._server.hwinit()
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1240, in __call__
14:36:12 INFO | autoserv| return self.__send(self.__name, args)
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1599, in __request
14:36:12 INFO | autoserv| verbose=self.__verbose
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1280, in request
14:36:12 INFO | autoserv| return self.single_request(host, handler, request_body, verbose)
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1308, in single_request
14:36:12 INFO | autoserv| self.send_content(h, request_body)
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1456, in send_content
14:36:12 INFO | autoserv| connection.endheaders(request_body)
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/httplib.py", line 1049, in endheaders
14:36:12 INFO | autoserv| self._send_output(message_body)
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/httplib.py", line 893, in _send_output
14:36:12 INFO | autoserv| self.send(msg)
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/httplib.py", line 855, in send
14:36:12 INFO | autoserv| self.connect()
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/httplib.py", line 832, in connect
14:36:12 INFO | autoserv| self.timeout, self.source_address)
14:36:12 INFO | autoserv| File "/usr/lib64/python2.7/socket.py", line 575, in create_connection
14:36:12 INFO | autoserv| raise err
14:36:12 INFO | autoserv| error: [Errno 113] No route to host
14:36:12 INFO | autoserv| Skipping this operation: pwr_button control is normal
14:36:12 INFO | autoserv| Skipping this operation: lid_open control is normal
14:36:12 INFO | autoserv| Attempting this repair action: Start servod with the proper config settings.
14:36:13 INFO | autoserv| Board for DUT is unknown; starting servod assuming a pre-configured board.
14:36:33 INFO | autoserv| Verifying this condition: host is available via ssh
14:36:33 INFO | autoserv| Verifying this condition: servo BOARD setting is correct
14:36:33 INFO | autoserv| Verifying this condition: servo SERIAL setting is correct
14:36:33 INFO | autoserv| Verifying this condition: servod upstart job is running
14:36:34 INFO | autoserv| Verifying this condition: servod service is taking calls
14:36:34 INFO | autoserv| Failed: servod service is taking calls
14:36:34 INFO | autoserv| Traceback (most recent call last):
14:36:34 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/client/common_lib/hosts/repair.py", line 329, in _verify_host
14:36:34 INFO | autoserv| self.verify(host)
14:36:34 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/server/hosts/servo_repair.py", line 211, in verify
14:36:34 INFO | autoserv| host.connect_servo()
14:36:34 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/server/hosts/servo_host.py", line 133, in connect_servo
14:36:34 INFO | autoserv| timeout_sec=self.INITIALIZE_SERVO_TIMEOUT_SECS)
14:36:34 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/client/common_lib/cros/retry.py", line 123, in timeout
14:36:34 INFO | autoserv| default_result = func(*args, **kwargs)
14:36:34 INFO | autoserv| File "/mnt/host/source/src/third_party/autotest/files/server/cros/servo/servo.py", line 218, in initialize_dut
14:36:34 INFO | autoserv| self._server.hwinit()
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1240, in __call__
14:36:34 INFO | autoserv| return self.__send(self.__name, args)
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1599, in __request
14:36:34 INFO | autoserv| verbose=self.__verbose
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1280, in request
14:36:34 INFO | autoserv| return self.single_request(host, handler, request_body, verbose)
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1308, in single_request
14:36:34 INFO | autoserv| self.send_content(h, request_body)
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/xmlrpclib.py", line 1456, in send_content
14:36:34 INFO | autoserv| connection.endheaders(request_body)
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/httplib.py", line 1049, in endheaders
14:36:34 INFO | autoserv| self._send_output(message_body)
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/httplib.py", line 893, in _send_output
14:36:34 INFO | autoserv| self.send(msg)
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/httplib.py", line 855, in send
14:36:34 INFO | autoserv| self.connect()
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/httplib.py", line 832, in connect
14:36:34 INFO | autoserv| self.timeout, self.source_address)
14:36:34 INFO | autoserv| File "/usr/lib64/python2.7/socket.py", line 575, in create_connection
14:36:34 INFO | autoserv| raise err
14:36:34 INFO | autoserv| error: [Errno 113] No route to host

...

14:40:53 INFO | autoserv| File "/home/kalin/trunk/src/third_party/autotest/files/server/autoserv", line 558, in run_autoserv
14:40:53 INFO | autoserv| sys.exit(exit_code)
14:40:53 INFO | autoserv| SystemExit: 1
14:40:53 INFO | autoserv| record_state_duration failed: job_or_task_id=None, hostname=chromeos1-row1-rack4-host2, status=Running
------------------------------------------------------------------------------------------------------------
/tmp/test_that_results_jWIJfB/results-1-platform_ExternalUsbPeripherals.detect.login_unplug_plug [  FAILED  ]
------------------------------------------------------------------------------------------------------------
Total PASS: 0/1 (0%)

14:40:54 ERROR| Autoserv encountered unexpected errors when executing jobs.
14:40:54 INFO | Finished running tests. Results can be found in /tmp/test_that_results_jWIJfB or /tmp/test_that_latest


Full test log is attached: test_log.tar (21.7 KB)

Comment 1 by ka...@chromium.org, Aug 14 2017

I am able to run client-side and server-side (but non-servo) tests.

Comment 2 by nxia@chromium.org, Aug 14 2017

Cc: haoweiw@chromium.org jrbarnette@chromium.org
haoweiw@, is this DUT affected by issue 753120?

Comment 3 by nxia@chromium.org, Aug 14 2017

Cc: pprabhu@chromium.org dgarr...@chromium.org
Cc: -jrbarnette@chromium.org -pprabhu@chromium.org prabhu@chromium.org
Components: -Infra>Client>ChromeOS
Owner: ka...@chromium.org
Status: Assigned (was: Untriaged)
Starting with this:
14:36:12 INFO | autoserv| Verifying this condition: servod service is taking calls
14:36:12 INFO | autoserv| Failed: servod service is taking calls

Also checking this:
    $ servo-stat chromeos1-row1-rack4-host2
    chromeos1-row1-rack4-host2 ...ABDEFGH not running servod BOARD=peppy CHROMEOS_RELEASE_VERSION=9672.0.0

Both of those indicate that servod has failed on the DUT's servo.

There's a reasonable chance that restarting servod on the DUT will
fix the problem, at least for a while.  That can be done manually;
it should also happen if you click 'Repair' for the DUT.

CrOS Infra doesn't maintain either the DUTs or the servos, so whatever
work is done, it needs to come from whoever maintains the 'chromeos1'
lab.

Comment 5 by ka...@chromium.org, Aug 15 2017

I remember seeing servod running on this servo host. Now I can confirm that it is running, without having restarted it or taken any other action:
$ ssh root@chromeos1-row1-rack4-host2-servo.cros
localhost ~ # ps ux | grep servod
root      1036  0.1  7.9  48116 19552 ttyO1    Ssl+ Aug14   1:36 /usr/bin/python2.7 /usr/lib/python-exec/python2.7/servod --host 0.0.0.0 --board peppy --port 9999
root      1103  0.0  5.9  29336 14680 ttyO1    S+   Aug14   0:06 /usr/bin/python2.7 /usr/lib/python-exec/python2.7/servod --host 0.0.0.0 --board peppy --port 9999
root      1104  0.0  6.0  29348 14892 ttyO1    S+   Aug14   0:03 /usr/bin/python2.7 /usr/lib/python-exec/python2.7/servod --host 0.0.0.0 --board peppy --port 9999
root     15087  0.0  0.1   1424   372 pts/1    S+   16:14   0:00 grep --colour=auto servod

localhost ~ # dut-control lid_open
lid_open:yes


Comment 6 by ka...@chromium.org, Aug 15 2017

Cc: jrbarnette@chromium.org
Owner: ----

Comment 7 by ka...@chromium.org, Aug 15 2017

Also, it is not isolated to this specific host. The same is observed when running test_that against other hosts (e.g. chromeos1-row1-rack3-host2) where servod is running.

Comment 8 by dchan@chromium.org, Aug 15 2017

Owner: jrbarnette@chromium.org
Looks like the servo-stat script is not correct and needs to be fixed.

'status servod' returns a different string than expected:

# status servod
servod (9999) start/running, process 15425

The script expects 'servod start/running'.

I am not sure whether test_that uses the status command to determine servod status.
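
The mismatch described in this comment can be illustrated with a small Python 3 sketch. This is not the servo-stat code (which is not shown in this thread); the function name `servod_is_running` and the pattern are hypothetical, and the only inputs assumed are the two status strings quoted above. The idea is simply to tolerate an optional parenthesized field, such as `(9999)`, between the job name and the state.

```python
import re

# Hypothetical pattern, not taken from servo-stat: accept an optional
# "(...)" field between the job name and the "start/running" state.
_RUNNING_RE = re.compile(r"^servod(?: \([^)]*\))? start/running\b")


def servod_is_running(status_output):
    """Return True if the given `status servod` output reports start/running."""
    return bool(_RUNNING_RE.match(status_output.strip()))


# Both forms quoted in this comment are treated as "running".
print(servod_is_running("servod (9999) start/running, process 15425"))
print(servod_is_running("servod start/running"))
```

A check that only matches the literal text 'servod start/running' would reject the first string, which may be why the tool reports "not running servod" for a host where servod is up.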



Comment 9 by dchan@chromium.org, Aug 15 2017

looks like the problem is this

 $ dut-control -s chromeos1-row1-rack4-host2-servo -p 9999 pwr_button
No route to host

but it works fine when run over ssh on the servo host:
$ ssh root@chromeos1-row1-rack4-host2-servo dut-control pwr_button
pwr_button:release


Somehow dut-control can't reach chromeos1-row1-rack4-host2-servo directly.

If we are running test_that, should we have a way to bypass the servod check and just run the tests?
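
One way to confirm the symptom above (ssh works, but a direct connection to the servod port does not) is to probe both TCP ports from the workstation. The following Python 3 sketch is only an illustration: `probe` is a hypothetical helper, not part of autotest or hdctools, and port 9999 is assumed from the dut-control command in this comment.

```python
import socket


def probe(host, port, timeout=5.0):
    """Attempt a TCP connection; return (ok, error_text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as err:
        # Covers refused connections, timeouts, DNS failures and
        # "No route to host" (errno 113 in the traceback above).
        return False, str(err)


# Example use (host name taken from this bug; needs lab network access):
#   for port in (22, 9999):
#       print(port, probe("chromeos1-row1-rack4-host2-servo", port))
```

If port 22 is reachable and port 9999 is not, servod itself may be healthy and the failure lies in the network path between the workstation and the servo host.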

Comment 11 by dchan@chromium.org, Aug 15 2017

this might work
ssh -N -L 9999:localhost:9999 root@chromeos1-row1-rack4-host2-servo.cros.corp.google.com

$ dut-control pwr_button
pwr_button:release
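
For scripting the same port forward, a minimal Python 3 sketch might look like the following. The helper names `tunnel_argv` and `open_tunnel` are hypothetical and are not how autotest sets up its tunnel; the sketch only assembles the ssh command shown above and runs it in the background.

```python
import subprocess


def tunnel_argv(servo_host, local_port=9999, remote_port=9999, user="root"):
    """Build the ssh command line for a local port forward to servod."""
    forward = "{}:localhost:{}".format(local_port, remote_port)
    return ["ssh", "-N", "-L", forward, "{}@{}".format(user, servo_host)]


def open_tunnel(servo_host, **kwargs):
    """Start the tunnel in the background; the caller must terminate() it."""
    return subprocess.Popen(tunnel_argv(servo_host, **kwargs))


# Example use (requires ssh access to the servo host):
#   tunnel = open_tunnel("chromeos1-row1-rack4-host2-servo.cros.corp.google.com")
#   ... run `dut-control pwr_button` against the local port ...
#   tunnel.terminate()
```

Keeping the command construction separate from the process launch makes it possible to inspect or log the exact ssh invocation before running it.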

Comment 12 by dchan@chromium.org, Aug 15 2017

But I am not sure how to tie that back to test_that.

Any ideas?

Comment 13 by ka...@chromium.org, Aug 21 2017

Pinging this bug.
There is still no way to run servo-based server-side tests.

Owner: ka...@chromium.org
> this might work
> ssh -N -L 9999:localhost:9999 root@chromeos1-row1-rack4-host2-servo.cros.corp.google.com

Autotest will do this automatically.  In the autotest directory
from where you run 'test_that', add this to shadow_config.ini
(or to global_config.ini):

enable_ssh_tunnel_for_servo: True

Assuming that works, we need to figure out why that setting isn't
present for you already.

<sigh> In global_config.ini, there's this:

# Flags to enable/disable SSH tunnel connection for servo host.
enable_ssh_tunnel_for_servo: False

Which would explain how it's set wrong when running test_that.
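
The effect of the override can be sketched with Python 3's standard `configparser`. This is not autotest's own config loader. It assumes that values in shadow_config.ini take precedence over global_config.ini and that the option lives in a `[CROS]` section; neither is stated in this thread, so check both against the real files.

```python
import configparser

# Assumed layout: the [CROS] section name is a guess, the option lines are
# the ones quoted in this bug.
GLOBAL_INI = """
[CROS]
# Flags to enable/disable SSH tunnel connection for servo host.
enable_ssh_tunnel_for_servo: False
"""

SHADOW_INI = """
[CROS]
enable_ssh_tunnel_for_servo: True
"""


def effective_setting(*ini_texts):
    """Read the texts in order; later ones override earlier ones."""
    parser = configparser.ConfigParser()
    for text in ini_texts:
        parser.read_string(text)
    return parser.getboolean("CROS", "enable_ssh_tunnel_for_servo")


print(effective_setting(GLOBAL_INI))              # global default only
print(effective_setting(GLOBAL_INI, SHADOW_INI))  # with the shadow override
```

With only the global file the tunnel stays disabled, which matches the behavior seen when running test_that from a fresh chroot.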

jrbarnette@ - should a change be submitted to update global_config.ini with enable_ssh_tunnel_for_servo: True? I am running into the same issue, and updating this value fixes it.

https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/global_config.ini?l=381
Components: Infra>Client>ChromeOS
Owner: ----
Status: Available (was: Assigned)
Summary: Change global_config to enable ssh tunnel for servo by default (was: Unable to run servo based server side autotests from local chroot)
OK.  I'm hearing that the workaround of changing the setting in
global_config.ini works.  As a fix, it's imperfect, but I think
that that change is the right thing to do for now.

How are folks supposed to find out about this workaround? Can we at least log something that makes this obvious?
Owner: jrbarnette@chromium.org
Status: Started (was: Available)
> How are folks supposed to find out about this workaround?

Realistically, they aren't; that's why we have to fix this bug.


> Can we at least log something that makes this obvious?

That's (substantially) more expensive than the fix:
    crosreview.com/797290

Comment 20 by bugdroid1@chromium.org, Nov 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/20e1c47af7b77f14cafe35a42d5ec57d1ec808c0

commit 20e1c47af7b77f14cafe35a42d5ec57d1ec808c0
Author: Richard Barnette <jrbarnette@chromium.org>
Date: Thu Nov 30 09:13:15 2017

[autotest] Enable servo ssh tunnel by default

This changes the config setting for `enable_ssh_tunnel_for_servo`
to default to True.  This setting is required if the target DUT is in
the test lab.  For DUTs at a user's desktop, the setting still works,
although it's unnecessary.

BUG= chromium:755267 
TEST=None

Change-Id: I009fed8d83210f198f362ca6ae1b0ea8c4a82cba
Reviewed-on: https://chromium-review.googlesource.com/797290
Commit-Ready: Richard Barnette <jrbarnette@chromium.org>
Tested-by: Richard Barnette <jrbarnette@chromium.org>
Reviewed-by: danny chan <dchan@chromium.org>

[modify] https://crrev.com/20e1c47af7b77f14cafe35a42d5ec57d1ec808c0/global_config.ini

Status: Fixed (was: Started)
Status: Archived (was: Fixed)
