rpc_flight_recorder is crashlooping | No module named autotest_lib.site_utils
Issue description
https://viceroy.corp.google.com/chromeos/afe_rpc?duration=1d shows no metrics from the blackbox monitor.
chromeos-test@chromeos-server156:/var/log/rpc_flight_recorder$ cat rpc_flight_recorder.log
Traceback (most recent call last):
  File "/usr/local/autotest/site_utils/rpc_flight_recorder.py", line 19, in <module>
    from autotest_lib.site_utils import server_manager_utils
  File "/usr/local/autotest/site_utils/server_manager_utils.py", line 18, in <module>
    import django.core.exceptions
ImportError: No module named django.core.exceptions
Is this related to the recent migration? Was the sentinel server migrated at this time? I see two sentinel entries in serverdb, one of which claims it is "repair required".
,
Sep 22 2017
No, don't think this is due to sentinel migration, as it's been down for a long time (judging by the graph).
,
Sep 22 2017
Looks like we don't run rpc_flight_recorder inside of virtualenv. Maybe we should?
~/chromiumos/chromeos-admin/puppet/modules/rpc_flight_recorder/templates$ cat rpc_flight_recorder.conf.erb
# Copyright 2017 The Chromium OS Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
description "RPC Flight Recorder daemon"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
respawn limit unlimited
post-stop exec sleep 5
env LOGDIR=/var/log/rpc_flight_recorder
env CMD=<%= @autotest_dir %>/site_utils/rpc_flight_recorder.py
env ARGS=
env USER=<%= @chromeos_test_user %>
script
  OUTPUT=${LOGDIR}/${UPSTART_JOB}.log
  # Save the last 5 copies of ${OUTPUT}, numbered in order
  # from most to least recent.
  PREV=.5
  for SUFFIX in .4 .3 .2 .1 ''; do
    mv -f ${OUTPUT}${SUFFIX} ${OUTPUT}${PREV} || :
    PREV=${SUFFIX}
  done
  exec sudo -u ${USER} ${CMD} ${ARGS} >>${OUTPUT} 2>&1
end script
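For readers less used to shell, the log rotation loop in the script block above can be sketched in Python. This is only an illustration of the same shifting order, not code from autotest; the function name `rotate_logs` and the `keep` parameter are made up here, and it assumes Python 3 for `os.replace`.

```python
import os
import tempfile


def rotate_logs(output, keep=5):
    """Shift output -> output.1 -> ... -> output.<keep>, dropping the oldest copy."""
    # Same order as the shell loop: .4, .3, .2, .1, '' when keep is 5.
    suffixes = [".%d" % i for i in range(keep - 1, 0, -1)] + [""]
    prev = ".%d" % keep
    for suffix in suffixes:
        src = output + suffix
        # The shell version ignores a failed mv with `|| :`; here a missing
        # file is simply skipped.
        if os.path.exists(src):
            # os.replace overwrites an existing destination, like `mv -f`.
            os.replace(src, output + prev)
        prev = suffix


if __name__ == "__main__":
    # Hypothetical demo in a scratch directory.
    log = os.path.join(tempfile.mkdtemp(), "rpc_flight_recorder.log")
    for run in ("first", "second"):
        with open(log, "w") as f:
            f.write(run)
        rotate_logs(log)
    print(sorted(os.listdir(os.path.dirname(log))))
```

Because the rotation runs before the `exec`, every respawn of the job starts a fresh log, so a crashlooping daemon can push older tracebacks out of the retained copies fairly quickly.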
,
Sep 22 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/2fa7237d117904f32a212cd89dd229e3e71eede8
commit 2fa7237d117904f32a212cd89dd229e3e71eede8
Author: Aviv Keshet <akeshet@chromium.org>
Date: Fri Sep 22 17:42:42 2017
autotest: add a venv wrapper for rpc_flight_recorder
BUG= chromium:767685
TEST=bin/rpc_flight_recorder
Change-Id: I9c132d2746cee816d6ff2928914f33abc1ad1696
Reviewed-on: https://chromium-review.googlesource.com/678074
Tested-by: Aviv Keshet <akeshet@chromium.org>
Trybot-Ready: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>
[add] https://crrev.com/2fa7237d117904f32a212cd89dd229e3e71eede8/bin/rpc_flight_recorder
,
Sep 22 2017
,
Sep 22 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/9e773ef5e841a30ea0758275887174d716dc0634 commit 9e773ef5e841a30ea0758275887174d716dc0634 Author: Aviv Keshet <akeshet@chromium.org> Date: Fri Sep 22 19:15:21 2017
,
Sep 27 2017
On sentinel server:
chromeos-test@chromeos-server156:/var/log/rpc_flight_recorder$ tail -f rpc_flight_recorder.log
/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/bin/python: No module named autotest_lib.site_utils
However, locally:
akeshet@akeshet:~/chromiumos/src/third_party/autotest/files$ ./bin/rpc_flight_recorder
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): metadata.google.internal
INFO:root:Configuration file does not exist, ignoring: /etc/chrome-infra/ts-mon.json
ERROR:root:ts_mon monitoring is disabled because the endpoint provided is invalid or not supported:
NOTICE:root:ts_mon was set up.
What gives, why the inconsistent behavior?
,
Sep 27 2017
I think it's because autotest_lib is a relative symlink:
lrwxrwxrwx 1 xixuan eng 11 Sep 6 11:28 autotest_lib -> ../../files
That works on your local workstation if you use the path **/autotest/files/bin/rpc_flight_recorder, but on the server you call '/usr/local/autotest/bin/rpc_flight_recorder'. There is no '../../files' folder there, so the symlink doesn't resolve.
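The symlink behavior described above can be reproduced in a scratch directory. The sketch below is only an illustration: the helper name `link_resolves` and the temporary directory layouts are assumptions that mimic the two paths mentioned in this thread, and it needs a POSIX system where `os.symlink` is allowed.

```python
import os
import tempfile


def link_resolves(autotest_root, target):
    """Create <autotest_root>/venv/autotest_lib -> target; return True if it resolves."""
    venv_dir = os.path.join(autotest_root, "venv")
    os.makedirs(venv_dir)
    link = os.path.join(venv_dir, "autotest_lib")
    # A relative target is resolved against the directory holding the link.
    os.symlink(target, link)
    # os.path.exists follows symlinks and returns False for a dangling one.
    return os.path.exists(link)


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    # Workstation-style checkout: the autotest root directory is named "files".
    checkout = os.path.join(base, "checkout", "autotest", "files")
    # Server-style install: the autotest root directory is named "autotest".
    server = os.path.join(base, "server", "usr", "local", "autotest")
    print("checkout, ../../files:", link_resolves(checkout, "../../files"))
    print("server,   ../../files:", link_resolves(server, "../../files"))
    # A target that does not name the root directory at all.
    server2 = os.path.join(base, "server2", "usr", "local", "autotest")
    print("server,   ..         :", link_resolves(server2, ".."))
```

The `../../files` target only resolves when the directory two levels above the link happens to contain a directory called `files`, which is true in the checkout layout and not in the server layout.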
,
Sep 27 2017
Issue 769135 has been merged into this issue.
,
Sep 27 2017
,
Sep 27 2017
Verified that just updating the symlink to be ../ works.
But then I see this trace:
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/autotest/site_utils/rpc_flight_recorder.py", line 19, in <module>
    from autotest_lib.site_utils import server_manager_utils
  File "/usr/local/autotest/venv/autotest_lib/site_utils/server_manager_utils.py", line 21, in <module>
    from autotest_lib.frontend.server import models as server_models
  File "/usr/local/autotest/venv/autotest_lib/frontend/server/models.py", line 8, in <module>
    from django.db import models as dbmodels
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/__init__.py", line 40, in <module>
    backend = load_backend(connection.settings_dict['ENGINE'])
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/__init__.py", line 34, in __getattr__
    return getattr(connections[DEFAULT_DB_ALIAS], item)
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/utils.py", line 93, in __getitem__
    backend = load_backend(db['ENGINE'])
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/utils.py", line 27, in load_backend
    return import_module('.base', backend_name)
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/usr/local/autotest/venv/autotest_lib/frontend/db/backends/afe/base.py", line 1, in <module>
    from django.db.backends.mysql.base import DatabaseCreation as MySQLCreation
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 17, in <module>
    raise ImproperlyConfigured("Error loading MySQLdb module: %s" % e)
django.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: No module named MySQLdb
This is because the new sentinel server doesn't have /usr/local/autotest/site-packages.
This means that build_externals wasn't run on it, and isn't being run via puppet for the sentinel role. Someone must have manually kicked it on the old server.
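A quick way to confirm which third-party modules an interpreter can actually see, before and after running build_externals, is to try importing them. The snippet below is a minimal sketch rather than an existing autotest tool; the helper name `missing_modules` is made up, and the module names in the demo are the ones that appear in the tracebacks in this thread. It would need to be run with the same interpreter and path setup as the daemon to be meaningful.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported by this interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers "No module named ..." on both Python 2.7 and Python 3.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names taken from the tracebacks above.
    print(missing_modules(["django", "MySQLdb"]))
```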
,
Sep 27 2017
https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/687735 is the next step, but will need to be pushed-to-prod.
,
Sep 27 2017
My reading of the sentinel puppet role says that we _should_ have run build_externals when we set up this server, because it includes autotest, which runs setup_dev_autotest. But we all know appearances can be misleading, which brings me back to the reason we're here. We're not here because setup_dev_autotest works; we're here because it does not.
,
Sep 27 2017
Nothing RVG here. https://chrome-internal-review.googlesource.com/#/c/chromeos/chromeos-admin/+/464912 to at least run build_externals when we do a push. I've run it manually on the server now to unbreak us.
,
Sep 27 2017
,
Sep 29 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/fe5b276bb77423362042610648dc6bfb6b476931 commit fe5b276bb77423362042610648dc6bfb6b476931 Author: Prathmesh Prabhu <pprabhu@chromium.org> Date: Fri Sep 29 21:15:58 2017
,
Sep 30 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/autotest/+/9a80152074eb708004ba10eafe64908b566339fb
commit 9a80152074eb708004ba10eafe64908b566339fb
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Sat Sep 30 03:16:21 2017
[autotest] Update venv symlink to autotest_lib
There is no need to assume any directory name for the root autotest directory, and this assumption doesn't work uniformly across our deployments.
BUG= chromium:767685
TEST='bin/python_venv'
Change-Id: Iba3c83225a0c766d5036e366d0d9d2b34c2a09a4
Reviewed-on: https://chromium-review.googlesource.com/687735
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>
[modify] https://crrev.com/9a80152074eb708004ba10eafe64908b566339fb/venv/autotest_lib
,
Oct 2 2017
Fix is pending push-to-prod.
,
Oct 2 2017
After trying to push to chromeos-server156.cbf:
From /var/log/rpc_flight_recorder/rpc_flight_recorder.log
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/autotest/site_utils/rpc_flight_recorder.py", line 19, in <module>
    from autotest_lib.site_utils import server_manager_utils
  File "/usr/local/autotest/venv/autotest_lib/site_utils/server_manager_utils.py", line 21, in <module>
    from autotest_lib.frontend.server import models as server_models
  File "/usr/local/autotest/venv/autotest_lib/frontend/server/models.py", line 8, in <module>
    from django.db import models as dbmodels
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/__init__.py", line 40, in <module>
    backend = load_backend(connection.settings_dict['ENGINE'])
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/__init__.py", line 34, in __getattr__
    return getattr(connections[DEFAULT_DB_ALIAS], item)
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/utils.py", line 93, in __getitem__
    backend = load_backend(db['ENGINE'])
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/utils.py", line 27, in load_backend
    return import_module('.base', backend_name)
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/usr/local/autotest/venv/autotest_lib/frontend/db/backends/afe/base.py", line 1, in <module>
    from django.db.backends.mysql.base import DatabaseCreation as MySQLCreation
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/backends/mysql/base.py", line 17, in <module>
    raise ImproperlyConfigured("Error loading MySQLdb module: %s" % e)
django.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: No module named MySQLdb
I don't think the push will fix this.
,
Oct 2 2017
,
Oct 3 2017
,
Oct 3 2017
I manually ran build_externals on that server. Maybe it wasn't run automatically?
,
Oct 3 2017
The push runs it on every server.
Comment 1 by akes...@chromium.org
, Sep 22 2017