guado_moblab: CQ, AFE RPC error
Issue description
https://luci-milo.appspot.com/buildbot/chromeos/guado_moblab-paladin/7587

Very weird guado_moblab failure: DatabaseError: (1146, "Table 'chromeos_autotest_db.django_session' doesn't exist")

I don't see any CLs that could have caused this

10/18 15:48:43.588 INFO |monitor_db_cleanup:0261| Deleting old sessions from django_session
10/18 15:48:43.593 ERROR| monitor_db:0183| Uncaught exception, terminating monitor_db.
Traceback (most recent call last):
  File "/usr/local/autotest/scheduler/monitor_db.py", line 172, in main_without_exception_handling
    dispatcher.tick()
  File "/usr/lib64/python2.7/site-packages/chromite/lib/metrics.py", line 483, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/autotest/scheduler/monitor_db.py", line 366, in tick
    self._run_cleanup()
  File "/usr/local/autotest/scheduler/monitor_db.py", line 269, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/autotest/scheduler/monitor_db.py", line 393, in _run_cleanup
    self._periodic_cleanup.run_cleanup_maybe()
  File "/usr/local/autotest/scheduler/monitor_db_cleanup.py", line 48, in run_cleanup_maybe
    self._cleanup()
  File "/usr/lib64/python2.7/site-packages/chromite/lib/metrics.py", line 483, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/autotest/scheduler/monitor_db_cleanup.py", line 75, in _cleanup
    self._django_session_cleanup()
  File "/usr/local/autotest/scheduler/monitor_db_cleanup.py", line 263, in _django_session_cleanup
    self._db.execute(sql)
  File "/usr/local/autotest/database/database_connection.py", line 312, in execute
    results = self._backend.execute(query, parameters)
  File "/usr/local/autotest/database/database_connection.py", line 132, in execute
    parameters=parameters)
  File "/usr/local/autotest/database/database_connection.py", line 54, in execute
    self._cursor.execute(query, parameters)
  File "/usr/lib64/python2.7/site-packages/django/db/backends/mysql/base.py", line 130, in execute
    six.reraise(utils.DatabaseError, utils.DatabaseError(*tuple(e.args)), sys.exc_info()[2])
  File "/usr/lib64/python2.7/site-packages/django/db/backends/mysql/base.py", line 120, in execute
    return self.cursor.execute(query, args)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 205, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
DatabaseError: (1146, "Table 'chromeos_autotest_db.django_session' doesn't exist")
Oct 19 2017
I have no idea where the RPC logs are, if they exist.

Oct 19 2017
dshi@ thinks it's a moblab init flake: crbug.com/776184
,
Oct 19 2017
Comment 1 by ayatane@chromium.org, Oct 19 2017

Oops, I think that's a red herring. The actual error is an AFE RPC 500.

@@@ Apache logs:
127.0.0.1 - - [18/Oct/2017:15:38:21 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:38:21 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:38:43 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:38:43 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:39:24 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:39:24 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:40:44 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:40:44 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:46:15 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:46:15 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:46:35 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:46:35 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:47:15 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:47:15 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:48:35 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424
127.0.0.1 - - [18/Oct/2017:15:48:35 -0700] "POST /afe/server/noauth/rpc/?method=get_hosts HTTP/1.1" 500 2424

@@@ DEBUG log:
10/18 15:46:05.301 DEBUG| abstract_ssh:0892| Full tunnel command: /usr/bin/ssh -a -x -n -N -q -L 46496:localhost:80 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos2-row1-rack8-host1
10/18 15:46:05.347 DEBUG| abstract_ssh:0900| Started ssh tunnel, local = 46496 remote = 80, pid = 1824
10/18 15:46:05.353 DEBUG| retry_util:0204| <class 'urllib2.URLError'>(<urlopen error [Errno 111] Connection refused>)
10/18 15:46:05.353 DEBUG| retry_util:0066| Retrying in 10.000000 (10.000000 + jitter 0.000000) seconds ...
10/18 15:46:15.493 DEBUG| retry_util:0204| <class 'urllib2.HTTPError'>(HTTP Error 500: INTERNAL SERVER ERROR)
10/18 15:46:15.494 DEBUG| retry_util:0066| Retrying in 20.000000 (20.000000 + jitter 0.000000) seconds ...
10/18 15:46:35.687 DEBUG| retry_util:0204| <class 'urllib2.HTTPError'>(HTTP Error 500: INTERNAL SERVER ERROR)
10/18 15:46:35.687 DEBUG| retry_util:0066| Retrying in 40.000000 (40.000000 + jitter 0.000000) seconds ...
10/18 15:47:15.879 DEBUG| retry_util:0204| <class 'urllib2.HTTPError'>(HTTP Error 500: INTERNAL SERVER ERROR)
10/18 15:47:15.879 DEBUG| retry_util:0066| Retrying in 80.000000 (80.000000 + jitter 0.000000) seconds ...
10/18 15:48:36.073 DEBUG| retry_util:0204| <class 'urllib2.HTTPError'>(HTTP Error 500: INTERNAL SERVER ERROR)
10/18 15:48:36.074 DEBUG| retry_util:0066| Retrying in 160.000000 (160.000000 + jitter 0.000000) seconds ...
10/18 15:51:05.351 DEBUG| moblab_host:0310| AFE is not responding
10/18 15:51:05.352 WARNI| test:0612| The test failed with the following exception
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 573, in _exec
    _cherry_pick_call(self.initialize, *args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 721, in _cherry_pick_call
    return func(*p_args, **p_dargs)
  File "/usr/local/autotest/server/cros/moblab_test.py", line 37, in initialize
    self._host.wait_afe_up()
  File "/usr/local/autotest/server/hosts/moblab_host.py", line 176, in wait_afe_up
    self._check_afe()
  File "/usr/local/autotest/server/hosts/moblab_host.py", line 308, in _check_afe
    self.afe.get_hosts()
  File "/usr/local/autotest/server/frontend.py", line 527, in get_hosts
    hosts = self.run('get_hosts', **query_args)
  File "/usr/local/autotest/server/cros/dynamic_suite/frontend_wrappers.py", line 126, in run
    self, call, **dargs)
  File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 244, in GenericRetry
    return _run()
  File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 171, in _Wrapper
    self._retry_delay.Sleep(attempt)
  File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 67, in Sleep
    time.sleep(total)
  File "/usr/local/autotest/site-packages/chromite/lib/timeout_util.py", line 88, in kill_us
    raise TimeoutError(error_message % {'time': max_run_time})
TimeoutError: Timeout occurred- waited 300.0 seconds.
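
For anyone reading the timing in the DEBUG log: the retry delays double (10, 20, 40, 80, 160 seconds) and the TimeoutError reports a 300 second limit, which is why the traceback ends inside `time.sleep` rather than in another failed request. A rough sketch of that arithmetic follows. The function names are hypothetical and are not chromite's `retry_util` API, and it assumes the requests themselves take negligible time:

```python
def backoff_delays(initial, factor, count):
    """Return `count` sleep durations, each `factor` times the previous one."""
    return [initial * factor ** i for i in range(count)]


def sleep_cut_short(delays, deadline):
    """Find the first sleep that would run past `deadline`.

    Returns (index_of_sleep, seconds_elapsed_before_it), or None if every
    sleep fits within the deadline. Request time is ignored (assumption).
    """
    elapsed = 0.0
    for index, delay in enumerate(delays):
        if elapsed + delay > deadline:
            return index, elapsed
        elapsed += delay
    return None


# Values taken from the log above: first delay 10 s, doubling, 300 s limit.
delays = backoff_delays(10.0, 2.0, 5)
print(delays)
print(sleep_cut_short(delays, 300.0))
```

Under those assumptions the first four sleeps add up to 150 seconds, so the fifth (160 second) sleep is the one interrupted. That lines up with the last "Retrying in 160.000000" entry at 15:48:36 and the "AFE is not responding" entry at 15:51:05, roughly 300 seconds after the tunnel was started at 15:46:05.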