
Issue 815299

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug


AfeAddToProdTask failing from some new servers

Reported by jrbarnette@chromium.org, Feb 23 2018

Issue description

This bug is forked from bug 810588, starting at comment #20.

I ran this command:
    $ bin/run_server_task AfeAddToProdTask --prod_master cautotest --host_server cros-full-0035.mtv.corp.google.com

The command failed when calling a routine named `_update_cloudsql_server_whitelist()`.

Looking at the general function of the server, RPC calls to
`get_jobs_summary()` via that AFE fail when trying to read
from the TKO.  The same failure is also observed from
cros-full-0037.mtv.

I suspect the "add to prod" failure is causing the "get_jobs_summary"
failures.  So, let's fix whatever is wrong with Cloud SQL.

Here's the traceback from the AfeAddToProdTask failure:
2018-02-23 06:41:39,708 ERRO| Traceback (most recent call last):
  File "/usr/local/google/home/jrbarnette/repos/cros.base/chromeos-admin/venv/server_management_lib/tasks/task.py", line 66, in run
    self._run()
  File "/usr/local/google/home/jrbarnette/repos/cros.base/chromeos-admin/venv/server_management_lib/tasks/atomic_afe.py", line 42, in _run
    self._update_cloudsql_server_whitelist()
  File "/usr/local/google/home/jrbarnette/repos/cros.base/chromeos-admin/venv/server_management_lib/tasks/atomic_common.py", line 113, in decorated_func
    return func(self)
  File "/usr/local/google/home/jrbarnette/repos/cros.base/chromeos-admin/venv/server_management_lib/tasks/atomic_common.py", line 441, in _update_cloudsql_server_whitelist
    api.local(command, capture=True)
  File "/usr/local/google/home/jrbarnette/.cache/cros_venv/venv-2.7.13-95de6b4f9b30bb6fc148ee4eccd758dc/local/lib/python2.7/site-packages/fabric/operations.py", line 1237, in local
    error(message=msg, stdout=out, stderr=err)
  File "/usr/local/google/home/jrbarnette/.cache/cros_venv/venv-2.7.13-95de6b4f9b30bb6fc148ee4eccd758dc/local/lib/python2.7/site-packages/fabric/utils.py", line 358, in error
    return func(message)
  File "/usr/local/google/home/jrbarnette/.cache/cros_venv/venv-2.7.13-95de6b4f9b30bb6fc148ee4eccd758dc/local/lib/python2.7/site-packages/fabric/utils.py", line 54, in abort
    raise env.abort_exception(msg)
FabricException: local() encountered an error (return code 1) while executing '/usr/local/autotest//site_utils/sync_cloudsql_access.py --project google.com:chromeos-lab --instance tko --afe cautotest --extra_servers 172.25.66.97'

2018-02-23 06:41:39,708 ERRO| local() encountered an error (return code 1) while executing '/usr/local/autotest//site_utils/sync_cloudsql_access.py --project google.com:chromeos-lab --instance tko --afe cautotest --extra_servers 172.25.66.97'
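
The Fabric exception above reports only the return code of the wrapped script, not what the script printed. A minimal sketch, assuming Python 3 and only the standard library, of running a command so that its stderr is kept for inspection. `run_and_report` is a hypothetical helper name, not part of chromeos-admin or autotest, and the failing command is a stand-in rather than the real sync script:

```python
import subprocess
import sys


def run_and_report(argv):
    """Run argv and return (return code, stdout, stderr) as text."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Stand-in for a wrapped script that fails: it writes a message
    # to stderr and exits with status 1.
    rc, out, err = run_and_report(
        [sys.executable, "-c",
         "import sys; sys.stderr.write('boom\\n'); sys.exit(1)"]
    )
    print("return code:", rc)
    print("stderr:", err.strip())
```

Running the failing script directly, as is done later in this thread, serves the same purpose: it surfaces the underlying error message.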

 
I've confirmed that re-running AfeAddToProdTask against cros-full-0037
produces the same failure in `_update_cloudsql_server_whitelist()`.

That step whitelists the IP address of the new server in the TKO Cloud SQL instance. Did you run 'gcloud auth login' before running this task?
It is explicitly mentioned in the instructions.

The instructions I followed are here:
    https://sites.google.com/a/google.com/chromeos/for-team-members/infrastructure/chromeos-admin/basic-servertype-management

That doesn't mention `gcloud auth login`.

Sorry, this is required for adding any server to prod. I will update the instructions you followed.

I've run `gcloud auth login`, and re-run the AfeAddToProdTask command.
It continues to fail as before.

chromeos-test@cros-full-0034:/usr/local/autotest$ /usr/local/autotest//site_utils/sync_cloudsql_access.py --project google.com:chromeos-lab --instance tko --afe cautotest --extra_servers 172.25.66.97
Adding servers ['chromeos-server151.cbf.corp.google.com', 'chromeos-server158.cbf.corp.google.com', 'chromeos-server156.cbf.corp.google.com', 'chromeos-server125.hot.corp.google.com', 'chromeos-server134.hot.corp.google.com', 'chromeos-server126.hot.corp.google.com', 'chromeos-server155.cbf.corp.google.com', 'chromeos-server129.hot.corp.google.com', 'chromeos-skunk-2.mtv.corp.google.com', 'chromeos-skunk-3.mtv.corp.google.com', 'chromeos-skunk-4.mtv.corp.google.com', 'chromeos-skunk-5.mtv.corp.google.com', 'chromeos-skunk-1.mtv.corp.google.com', 'chromeos-server-tester1.mtv.corp.google.com', 'chromeos-server-tester2.mtv.corp.google.com', 'cros-full-0001.mtv.corp.google.com', 'cros-full-0002.mtv.corp.google.com', 'cros-full-0003.mtv.corp.google.com', 'cros-full-0004.mtv.corp.google.com', 'cros-full-0005.mtv.corp.google.com', 'cros-full-0006.mtv.corp.google.com', 'cros-full-0007.mtv.corp.google.com', 'cros-full-0008.mtv.corp.google.com', 'cros-full-0009.mtv.corp.google.com', 'cros-full-0010.mtv.corp.google.com', 'cros-full-0011.mtv.corp.google.com', 'cros-full-0012.mtv.corp.google.com', 'cros-full-0013.mtv.corp.google.com', 'cros-full-0014.mtv.corp.google.com', 'cros-full-0015.mtv.corp.google.com', 'cros-full-0016.mtv.corp.google.com', 'cros-full-0017.mtv.corp.google.com', 'cros-full-0018.mtv.corp.google.com', 'cros-full-0019.mtv.corp.google.com', 'cros-full-0020.mtv.corp.google.com', 'cros-full-0021.mtv.corp.google.com', 'cros-full-0022.mtv.corp.google.com', 'cros-full-0023.mtv.corp.google.com', 'cros-full-0024.mtv.corp.google.com', 'cros-full-0025.mtv.corp.google.com', 'cros-full-0030.mtv.corp.google.com', 'cros-full-0027.mtv.corp.google.com', 'cros-full-0029.mtv.corp.google.com', 'cros-full-0028.mtv.corp.google.com', 'cros-full-0026.mtv.corp.google.com', 'cros-full-0033.mtv.corp.google.com', 'cros-full-0032.mtv.corp.google.com', 'cros-full-0031.mtv.corp.google.com', 'cros-full-0036.mtv.corp.google.com', 'cros-full-0034.mtv.corp.google.com', 
'cros-bighd-0001.mtv.corp.google.com', 'cros-full-0035.mtv.corp.google.com', 'cros-full-0037.mtv.corp.google.com', '172.25.66.97'] to access list for projects tko
Fetching their IP addresses...
...Done: ['100.108.133.186', '100.108.133.206', '100.108.133.205', '100.109.178.143', '100.109.175.140', '100.109.178.145', '100.108.133.203', '100.109.178.146', '100.116.60.160', '100.116.60.161', '100.116.60.162', '100.116.60.163', '100.116.60.159', '100.109.25.87', '100.109.25.88', '100.109.25.130', '100.109.25.142', '100.109.25.145', '100.109.25.143', '100.109.25.132', '100.109.25.147', '100.109.25.144', '100.109.25.134', '100.109.25.139', '100.109.25.133', '100.109.25.135', '100.109.25.140', '100.109.25.138', '100.109.25.131', '100.109.25.141', '100.109.25.149', '100.109.25.137', '100.109.25.146', '100.109.25.148', '100.109.25.150', '100.108.189.2', '100.108.189.4', '100.108.189.3', '100.108.189.5', '100.108.189.6', '100.108.189.49', '100.108.189.41', '100.108.189.48', '100.108.189.47', '100.108.189.40', '100.108.189.54', '100.108.189.51', '100.108.189.50', '100.108.189.42', '100.108.189.52', '100.108.189.33', '100.108.189.53', '100.108.189.43', '172.25.66.97']
DEBUG:root:Running 'gcloud config set project google.com:chromeos-lab -q'
DEBUG:root:Running 'gcloud auth login'
/bin/bash: gcloud: command not found
Traceback (most recent call last):
  File "/usr/local/autotest//site_utils/sync_cloudsql_access.py", line 135, in <module>
    main()
  File "/usr/local/autotest//site_utils/sync_cloudsql_access.py", line 131, in main
    options.extra_servers)
  File "/usr/local/autotest//site_utils/sync_cloudsql_access.py", line 110, in update_allowed_networks
    gcloud_login(project)
  File "/usr/local/autotest//site_utils/sync_cloudsql_access.py", line 57, in gcloud_login
    stderr_tee=sys.stderr, stdin=sys.stdin)
  File "/usr/local/autotest/client/common_lib/utils.py", line 748, in run
    "Command returned non-zero exit status")
autotest_lib.client.common_lib.error.CmdError: Command <gcloud auth login> failed, rc=127, Command returned non-zero exit status
* Command: 
    gcloud auth login
Exit status: 127
Duration: 0.000725984573364

stderr:
/bin/bash: gcloud: command not found


gcloud is not installed on cautotest
Cc: xixuan@chromium.org ayatane@chromium.org
> gcloud is not installed on cautotest

That might explain this problem overall.

However, I'm not fully convinced:  cautotest isn't suffering from
this symptom.  So far, I see only cros-full-0035 and cros-full-0037
being affected.

So, if the problem is "gcloud isn't installed" we still need to
explain why some servers are affected, and others aren't.

ATM, my leading theory is that I don't have some necessary permission,
since the work to add cautotest (cros-full-0034) was done by akeshet@.

> gcloud is not installed on cautotest

One other potential point of interest:  The failures I'm
observing are from running commands on my workstation, not
on cautotest.

Sorry, that line is run locally. I just ran this command without any issue:

shuqianz@charlenez ~/c/chromeos-admin> ~/chromiumos/src/third_party/autotest/files/site_utils/sync_cloudsql_access.py --project google.com:chromeos-lab --instance tko --afe cautotest --extra_servers 172.25.66.97
Adding servers ['chromeos-server151.cbf.corp.google.com', 'chromeos-server158.cbf.corp.google.com', 'chromeos-server156.cbf.corp.google.com', 'chromeos-server125.hot.corp.google.com', 'chromeos-server134.hot.corp.google.com', 'chromeos-server126.hot.corp.google.com', 'chromeos-server155.cbf.corp.google.com', 'chromeos-server129.hot.corp.google.com', 'chromeos-skunk-2.mtv.corp.google.com', 'chromeos-skunk-3.mtv.corp.google.com', 'chromeos-skunk-4.mtv.corp.google.com', 'chromeos-skunk-5.mtv.corp.google.com', 'chromeos-skunk-1.mtv.corp.google.com', 'chromeos-server-tester1.mtv.corp.google.com', 'chromeos-server-tester2.mtv.corp.google.com', 'cros-full-0001.mtv.corp.google.com', 'cros-full-0002.mtv.corp.google.com', 'cros-full-0003.mtv.corp.google.com', 'cros-full-0004.mtv.corp.google.com', 'cros-full-0005.mtv.corp.google.com', 'cros-full-0006.mtv.corp.google.com', 'cros-full-0007.mtv.corp.google.com', 'cros-full-0008.mtv.corp.google.com', 'cros-full-0009.mtv.corp.google.com', 'cros-full-0010.mtv.corp.google.com', 'cros-full-0011.mtv.corp.google.com', 'cros-full-0012.mtv.corp.google.com', 'cros-full-0013.mtv.corp.google.com', 'cros-full-0014.mtv.corp.google.com', 'cros-full-0015.mtv.corp.google.com', 'cros-full-0016.mtv.corp.google.com', 'cros-full-0017.mtv.corp.google.com', 'cros-full-0018.mtv.corp.google.com', 'cros-full-0019.mtv.corp.google.com', 'cros-full-0020.mtv.corp.google.com', 'cros-full-0021.mtv.corp.google.com', 'cros-full-0022.mtv.corp.google.com', 'cros-full-0023.mtv.corp.google.com', 'cros-full-0024.mtv.corp.google.com', 'cros-full-0025.mtv.corp.google.com', 'cros-full-0030.mtv.corp.google.com', 'cros-full-0027.mtv.corp.google.com', 'cros-full-0029.mtv.corp.google.com', 'cros-full-0028.mtv.corp.google.com', 'cros-full-0026.mtv.corp.google.com', 'cros-full-0033.mtv.corp.google.com', 'cros-full-0032.mtv.corp.google.com', 'cros-full-0031.mtv.corp.google.com', 'cros-full-0036.mtv.corp.google.com', 'cros-full-0034.mtv.corp.google.com', 
'cros-bighd-0001.mtv.corp.google.com', 'cros-full-0035.mtv.corp.google.com', 'cros-full-0037.mtv.corp.google.com', '172.25.66.97'] to access list for projects tko
Fetching their IP addresses...
...Done: ['100.108.133.186', '100.108.133.206', '100.108.133.205', '100.109.178.143', '100.109.175.140', '100.109.178.145', '100.108.133.203', '100.109.178.146', '100.116.60.160', '100.116.60.161', '100.116.60.162', '100.116.60.163', '100.116.60.159', '100.109.25.87', '100.109.25.88', '100.109.25.130', '100.109.25.142', '100.109.25.145', '100.109.25.143', '100.109.25.132', '100.109.25.147', '100.109.25.144', '100.109.25.134', '100.109.25.139', '100.109.25.133', '100.109.25.135', '100.109.25.140', '100.109.25.138', '100.109.25.131', '100.109.25.141', '100.109.25.149', '100.109.25.137', '100.109.25.146', '100.109.25.148', '100.109.25.150', '100.108.189.2', '100.108.189.4', '100.108.189.3', '100.108.189.5', '100.108.189.6', '100.108.189.49', '100.108.189.41', '100.108.189.48', '100.108.189.47', '100.108.189.40', '100.108.189.54', '100.108.189.51', '100.108.189.50', '100.108.189.42', '100.108.189.52', '100.108.189.33', '100.108.189.53', '100.108.189.43', '172.25.66.97']
DEBUG:root:Running 'gcloud config set project google.com:chromeos-lab -q'
Running command to update whitelists: "gcloud sql instances patch tko --authorized-networks 100.108.133.186/32,100.108.133.206/32,100.108.133.205/32,100.109.178.143/32,100.109.175.140/32,100.109.178.145/32,100.108.133.203/32,100.109.178.146/32,100.116.60.160/32,100.116.60.161/32,100.116.60.162/32,100.116.60.163/32,100.116.60.159/32,100.109.25.87/32,100.109.25.88/32,100.109.25.130/32,100.109.25.142/32,100.109.25.145/32,100.109.25.143/32,100.109.25.132/32,100.109.25.147/32,100.109.25.144/32,100.109.25.134/32,100.109.25.139/32,100.109.25.133/32,100.109.25.135/32,100.109.25.140/32,100.109.25.138/32,100.109.25.131/32,100.109.25.141/32,100.109.25.149/32,100.109.25.137/32,100.109.25.146/32,100.109.25.148/32,100.109.25.150/32,100.108.189.2/32,100.108.189.4/32,100.108.189.3/32,100.108.189.5/32,100.108.189.6/32,100.108.189.49/32,100.108.189.41/32,100.108.189.48/32,100.108.189.47/32,100.108.189.40/32,100.108.189.54/32,100.108.189.51/32,100.108.189.50/32,100.108.189.42/32,100.108.189.52/32,100.108.189.33/32,100.108.189.53/32,100.108.189.43/32,172.25.66.97/32 -q"
DEBUG:root:Running 'gcloud sql instances patch tko --authorized-networks 100.108.133.186/32,100.108.133.206/32,100.108.133.205/32,100.109.178.143/32,100.109.175.140/32,100.109.178.145/32,100.108.133.203/32,100.109.178.146/32,100.116.60.160/32,100.116.60.161/32,100.116.60.162/32,100.116.60.163/32,100.116.60.159/32,100.109.25.87/32,100.109.25.88/32,100.109.25.130/32,100.109.25.142/32,100.109.25.145/32,100.109.25.143/32,100.109.25.132/32,100.109.25.147/32,100.109.25.144/32,100.109.25.134/32,100.109.25.139/32,100.109.25.133/32,100.109.25.135/32,100.109.25.140/32,100.109.25.138/32,100.109.25.131/32,100.109.25.141/32,100.109.25.149/32,100.109.25.137/32,100.109.25.146/32,100.109.25.148/32,100.109.25.150/32,100.108.189.2/32,100.108.189.4/32,100.108.189.3/32,100.108.189.5/32,100.108.189.6/32,100.108.189.49/32,100.108.189.41/32,100.108.189.48/32,100.108.189.47/32,100.108.189.40/32,100.108.189.54/32,100.108.189.51/32,100.108.189.50/32,100.108.189.42/32,100.108.189.52/32,100.108.189.33/32,100.108.189.53/32,100.108.189.43/32,172.25.66.97/32 -q'
The following message will be used for the patch API method.
{"project": "google.com:chromeos-lab", "name": "tko", "settings": {"ipConfiguration": {"authorizedNetworks": [{"value": "100.108.133.186/32"}, {"value": "100.108.133.206/32"}, {"value": "100.108.133.205/32"}, {"value": "100.109.178.143/32"}, {"value": "100.109.175.140/32"}, {"value": "100.109.178.145/32"}, {"value": "100.108.133.203/32"}, {"value": "100.109.178.146/32"}, {"value": "100.116.60.160/32"}, {"value": "100.116.60.161/32"}, {"value": "100.116.60.162/32"}, {"value": "100.116.60.163/32"}, {"value": "100.116.60.159/32"}, {"value": "100.109.25.87/32"}, {"value": "100.109.25.88/32"}, {"value": "100.109.25.130/32"}, {"value": "100.109.25.142/32"}, {"value": "100.109.25.145/32"}, {"value": "100.109.25.143/32"}, {"value": "100.109.25.132/32"}, {"value": "100.109.25.147/32"}, {"value": "100.109.25.144/32"}, {"value": "100.109.25.134/32"}, {"value": "100.109.25.139/32"}, {"value": "100.109.25.133/32"}, {"value": "100.109.25.135/32"}, {"value": "100.109.25.140/32"}, {"value": "100.109.25.138/32"}, {"value": "100.109.25.131/32"}, {"value": "100.109.25.141/32"}, {"value": "100.109.25.149/32"}, {"value": "100.109.25.137/32"}, {"value": "100.109.25.146/32"}, {"value": "100.109.25.148/32"}, {"value": "100.109.25.150/32"}, {"value": "100.108.189.2/32"}, {"value": "100.108.189.4/32"}, {"value": "100.108.189.3/32"}, {"value": "100.108.189.5/32"}, {"value": "100.108.189.6/32"}, {"value": "100.108.189.49/32"}, {"value": "100.108.189.41/32"}, {"value": "100.108.189.48/32"}, {"value": "100.108.189.47/32"}, {"value": "100.108.189.40/32"}, {"value": "100.108.189.54/32"}, {"value": "100.108.189.51/32"}, {"value": "100.108.189.50/32"}, {"value": "100.108.189.42/32"}, {"value": "100.108.189.52/32"}, {"value": "100.108.189.33/32"}, {"value": "100.108.189.53/32"}, {"value": "100.108.189.43/32"}, {"value": "172.25.66.97/32"}]}, "databaseFlags": [{"name": "innodb_file_per_table", "value": "on"}, {"name": "slow_query_log", "value": "on"}, {"name": "long_query_time", "value": "90"}, 
{"name": "wait_timeout", "value": "300"}]}}
Patching Cloud SQL instance...
.done.
Updated [https://www.googleapis.com/sql/v1beta4/projects/google.com%3Achromeos-lab/instances/tko].
No backup False

So, the problem is your setup. 
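
For reference, the log above shows each resolved IP address being passed to `gcloud sql instances patch` as a /32 network. A minimal sketch of how such a command string could be assembled, assuming Python 3. `build_patch_command` is a hypothetical name and this is not the actual code in sync_cloudsql_access.py:

```python
def build_patch_command(instance, ip_addresses):
    """Build a `gcloud sql instances patch` command like the one logged above."""
    # Each single host address is expressed as a /32 CIDR network.
    networks = ",".join("%s/32" % ip for ip in ip_addresses)
    return "gcloud sql instances patch %s --authorized-networks %s -q" % (
        instance, networks)


if __name__ == "__main__":
    print(build_patch_command("tko", ["100.108.133.186", "172.25.66.97"]))
```

In the logged run the complete list of servers is passed every time, with the new address from `--extra_servers` appended at the end.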