Issue 767685

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 769419

Blocking:
issue 767657



rpc_flight_recorder is crashlooping | No module named autotest_lib.site_utils

Project Member Reported by akes...@chromium.org, Sep 22 2017

Issue description

https://viceroy.corp.google.com/chromeos/afe_rpc?duration=1d shows no metrics from blackbox monitor

chromeos-test@chromeos-server156:/var/log/rpc_flight_recorder$ cat rpc_flight_recorder.log
Traceback (most recent call last):
  File "/usr/local/autotest/site_utils/rpc_flight_recorder.py", line 19, in <module>
    from autotest_lib.site_utils import server_manager_utils
  File "/usr/local/autotest/site_utils/server_manager_utils.py", line 18, in <module>
    import django.core.exceptions
ImportError: No module named django.core.exceptions


Is this related to the recent migration? Was the sentinel server migrated around this time? I see two sentinel entries in serverdb, one of which claims to be "repair required".
 
Labels: Restrict-View-Google
Cc: xixuan@chromium.org ayatane@chromium.org
Summary: rpc_flight_recorder is crashlooping | ImportError: No module named django.core.exceptions (was: rpc_flight_recorder is crashlooping)
No, I don't think this is due to the sentinel migration, as it has been down for a long time (judging by the graph).
Looks like we don't run rpc_flight_recorder inside a virtualenv. Maybe we should?

~/chromiumos/chromeos-admin/puppet/modules/rpc_flight_recorder/templates$ cat rpc_flight_recorder.conf.erb 
# Copyright 2017 The Chromium OS Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

description "RPC Flight Recorder daemon"

start on runlevel [2345]
stop on runlevel [!2345]
respawn
respawn limit unlimited
post-stop exec sleep 5

env LOGDIR=/var/log/rpc_flight_recorder
env CMD=<%= @autotest_dir %>/site_utils/rpc_flight_recorder.py
env ARGS=
env USER=<%= @chromeos_test_user %>

script
  OUTPUT=${LOGDIR}/${UPSTART_JOB}.log

  # Save the last 5 copies of ${OUTPUT}, numbered in order
  # from most to least recent.
  PREV=.5
  for SUFFIX in .4 .3 .2 .1 ''; do
    mv -f ${OUTPUT}${SUFFIX} ${OUTPUT}${PREV} || :
    PREV=${SUFFIX}
  done

  exec sudo -u ${USER} ${CMD} ${ARGS} >>${OUTPUT} 2>&1
end script

Project Member

Comment 4 by bugdroid1@chromium.org, Sep 22 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/2fa7237d117904f32a212cd89dd229e3e71eede8

commit 2fa7237d117904f32a212cd89dd229e3e71eede8
Author: Aviv Keshet <akeshet@chromium.org>
Date: Fri Sep 22 17:42:42 2017

autotest: add a venv wrapper for rpc_flight_recorder

BUG= chromium:767685 
TEST=bin/rpc_flight_recorder

Change-Id: I9c132d2746cee816d6ff2928914f33abc1ad1696
Reviewed-on: https://chromium-review.googlesource.com/678074
Tested-by: Aviv Keshet <akeshet@chromium.org>
Trybot-Ready: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[add] https://crrev.com/2fa7237d117904f32a212cd89dd229e3e71eede8/bin/rpc_flight_recorder

Blocking: 767657
Owner: akes...@chromium.org
Status: Started (was: Untriaged)
Project Member

Comment 6 by bugdroid1@chromium.org, Sep 22 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/9e773ef5e841a30ea0758275887174d716dc0634

commit 9e773ef5e841a30ea0758275887174d716dc0634
Author: Aviv Keshet <akeshet@chromium.org>
Date: Fri Sep 22 19:15:21 2017

Summary: rpc_flight_recorder is crashlooping | No module named autotest_lib.site_utils (was: rpc_flight_recorder is crashlooping | ImportError: No module named django.core.exceptions)
On sentinel server:

chromeos-test@chromeos-server156:/var/log/rpc_flight_recorder$ tail -f rpc_flight_recorder.log
/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/bin/python: No module named autotest_lib.site_utils

However, locally:

akeshet@akeshet:~/chromiumos/src/third_party/autotest/files$ ./bin/rpc_flight_recorder 
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): metadata.google.internal
INFO:root:Configuration file does not exist, ignoring: /etc/chrome-infra/ts-mon.json
ERROR:root:ts_mon monitoring is disabled because the endpoint provided is invalid or not supported: 
NOTICE:root:ts_mon was set up.


What gives, why the inconsistent behavior?

Comment 8 by xixuan@chromium.org, Sep 27 2017

I think it's because autotest_lib is a symlink to:
lrwxrwxrwx 1 xixuan eng   11 Sep  6 11:28 autotest_lib -> ../../files

That works on your local workstation, where you invoke **/autotest/files/bin/rpc_flight_recorder.

On the server you invoke '/usr/local/autotest/bin/rpc_flight_recorder' instead. There is no '../../files' directory there, so the symlink doesn't resolve.
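
The difference can be reproduced in a scratch directory. The sketch below is only an illustration: make_link and demo are hypothetical helpers, and the temporary "checkout" and "server" trees stand in for a workstation checkout (autotest root named "files") and for /usr/local/autotest (no "files" directory). It relies on relative symlink targets being resolved from the directory that contains the link.

```python
import os
import tempfile


def make_link(autotest_root, target):
    """Create <autotest_root>/venv/autotest_lib -> target; return the link path."""
    venv_dir = os.path.join(autotest_root, "venv")
    os.makedirs(venv_dir)
    link = os.path.join(venv_dir, "autotest_lib")
    os.symlink(target, link)
    return link


def demo(target):
    """Report, per layout, whether autotest_lib/site_utils is reachable."""
    base = os.path.realpath(tempfile.mkdtemp())
    roots = {
        # Workstation-style checkout: the autotest root is named "files".
        "local": os.path.join(base, "checkout", "autotest", "files"),
        # Server-style install: the autotest root is named "autotest".
        "server": os.path.join(base, "server", "autotest"),
    }
    results = {}
    for name, root in roots.items():
        os.makedirs(os.path.join(root, "site_utils"))
        link = make_link(root, target)
        # isdir() follows the symlink; a dangling link yields False.
        results[name] = os.path.isdir(os.path.join(link, "site_utils"))
    return results
```

Under these assumptions, demo("../../files") finds site_utils only in the checkout-style layout, while a target of ".." resolves in both layouts because it does not depend on the name of the root directory.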
Issue 769135 has been merged into this issue.
Owner: ayatane@chromium.org
Status: Assigned (was: Started)
Owner: pprabhu@chromium.org
Status: Started (was: Assigned)
Verified that just updating the symlink to point to ../ works.

But then I see this trace:
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/autotest/site_utils/rpc_flight_recorder.py", line 19, in <module>
    from autotest_lib.site_utils import server_manager_utils
  File "/usr/local/autotest/venv/autotest_lib/site_utils/server_manager_utils.py", line 21, in <module>
    from autotest_lib.frontend.server import models as server_models
  File "/usr/local/autotest/venv/autotest_lib/frontend/server/models.py", line 8, in <module>
    from django.db import models as dbmodels
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/__init__.py", line 40, in <module>
    backend = load_backend(connection.settings_dict['ENGINE'])
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/__init__.py", line 34, in __getattr__
    return getattr(connections[DEFAULT_DB_ALIAS], item)
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/utils.py", line 93, in __getitem__
    backend = load_backend(db['ENGINE'])
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/utils.py", line 27, in load_backend
    return import_module('.base', backend_name)
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/usr/local/autotest/venv/autotest_lib/frontend/db/backends/afe/base.py", line 1, in <module>
    from django.db.backends.mysql.base import DatabaseCreation as MySQLCreation
  File "/usr/local/google/home/chromeos-test/.cache/cros_venv/venv-2.7.6-4e51f0509ae723d10ebb46a6f87b0bb5/local/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 17, in <module>
    raise ImproperlyConfigured("Error loading MySQLdb module: %s" % e)
django.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: No module named MySQLdb

This is because the new sentinel server doesn't have /usr/local/autotest/site-packages.
This means that build_externals wasn't run on it, and isn't being run via puppet for the sentinel role. Someone must have manually kicked it on the old server.
https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/687735 is the next step, but will need to be pushed-to-prod.
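
A missing site-packages directory like the one described above only shows up when the daemon hits the failing import. A small preflight check could report it up front. This is a sketch, not existing autotest code: missing_modules is a hypothetical helper, and it assumes Python 3.4+ (for importlib.util.find_spec), whereas the deployment in this thread runs Python 2.7. The module names in the usage note come from the traceback above.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that cannot be located for import."""
    missing = []
    for name in names:
        try:
            # For a dotted name, find_spec imports the parent package first
            # and raises ModuleNotFoundError if that parent is absent.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing
```

Running missing_modules(["django.core.exceptions", "MySQLdb"]) with the same interpreter the daemon uses would list whichever of the two is not importable. An empty list means both were found.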
My reading of the sentinel puppet role says that we _should_ have run build_externals when we set up this server, because it includes autotest, which runs setup_dev_autotest.

But we all know appearances can be misleading, which brings me back to the reason we're here.
We're not here because setup_dev_autotest works, we're here because it does not work.
Labels: -Restrict-View-Google
Nothing RVG here.

https://chrome-internal-review.googlesource.com/#/c/chromeos/chromeos-admin/+/464912
to at least run build_externals when we do a push.

I've run it manually on the server now to unbreak us.
Blockedon: 769419
I take that back, build_externals is currently broken.
Project Member

Comment 16 by bugdroid1@chromium.org, Sep 29 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/fe5b276bb77423362042610648dc6bfb6b476931

commit fe5b276bb77423362042610648dc6bfb6b476931
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Fri Sep 29 21:15:58 2017

Project Member

Comment 17 by bugdroid1@chromium.org, Sep 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/9a80152074eb708004ba10eafe64908b566339fb

commit 9a80152074eb708004ba10eafe64908b566339fb
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Sat Sep 30 03:16:21 2017

[autotest] Update venv symlink to autotest_lib

There is no need to assume any directory name for the root autotest
directory, and this assumption doesn't work uniformly across our
deployments.

BUG= chromium:767685 
TEST='bin/python_venv'

Change-Id: Iba3c83225a0c766d5036e366d0d9d2b34c2a09a4
Reviewed-on: https://chromium-review.googlesource.com/687735
Commit-Ready: Prathmesh Prabhu <pprabhu@chromium.org>
Tested-by: Prathmesh Prabhu <pprabhu@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/9a80152074eb708004ba10eafe64908b566339fb/venv/autotest_lib

Cc: dgarr...@chromium.org
Fix is pending push-to-prod.
After trying to push to chromeos-server156.cbf:

From /var/log/rpc_flight_recorder/rpc_flight_recorder.log


Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/autotest/site_utils/rpc_flight_recorder.py", line 19, in <module>
    from autotest_lib.site_utils import server_manager_utils
  File "/usr/local/autotest/venv/autotest_lib/site_utils/server_manager_utils.py", line 21, in <module>
    from autotest_lib.frontend.server import models as server_models
  File "/usr/local/autotest/venv/autotest_lib/frontend/server/models.py", line 8, in <module>
    from django.db import models as dbmodels
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/__init__.py", line 40, in <module>
    backend = load_backend(connection.settings_dict['ENGINE'])
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/__init__.py", line 34, in __getattr__
    return getattr(connections[DEFAULT_DB_ALIAS], item)
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/utils.py", line 93, in __getitem__
    backend = load_backend(db['ENGINE'])
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/utils.py", line 27, in load_backend
    return import_module('.base', backend_name)
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/usr/local/autotest/venv/autotest_lib/frontend/db/backends/afe/base.py", line 1, in <module>
    from django.db.backends.mysql.base import DatabaseCreation as MySQLCreation
  File "/usr/local/autotest/venv/autotest_lib/site-packages/django/db/backends/mysql/base.py", line 17, in <module>
    raise ImproperlyConfigured("Error loading MySQLdb module: %s" % e)
django.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: No module named MySQLdb


I don't think the push will fix this.
Cc: pho...@chromium.org chingcodes@chromium.org
Issue 770965 has been merged into this issue.
Status: Fixed (was: Started)
I manually ran build_externals on that server. Maybe it wasn't run automatically?
The push runs it on every server.
