
Issue 723757

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: May 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




autotest_error_log_metricsgs_offloader is unrecognized service on new server

Project Member Reported by nxia@chromium.org, May 17 2017

Issue description

 bin/run_server_task  ShardAddToProdTask --prod_master cautotest --host_server chromeos-skunk1.mtv.corp.google.com --board_labels "dummy_board"




Fatal error: run() received nonzero return code 1 while executing!

Requested: /usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d
Executed: /bin/bash -l -c "/usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d"

Aborting.
2017-05-17 11:11:29,519 ERRO| Traceback (most recent call last):
  File "/usr/local/google/home/nxia/chromiumos/chromeos-admin/venv/server_management_lib/tasks/atomic_common.py", line 126, in decorated_func
    func(self)
  File "/usr/local/google/home/nxia/chromiumos/chromeos-admin/venv/server_management_lib/tasks/atomic_shard.py", line 290, in run
    self._add_shard_to_autotestdb()
  File "/usr/local/google/home/nxia/chromiumos/chromeos-admin/venv/server_management_lib/tasks/atomic_common.py", line 98, in decorated_func
    return func(self)
  File "/usr/local/google/home/nxia/chromiumos/chromeos-admin/venv/server_management_lib/tasks/atomic_shard.py", line 271, in _add_shard_to_autotestdb
    api.run(command)
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/network.py", line 677, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/operations.py", line 1088, in run
    shell_escape=shell_escape, capture_buffer_size=capture_buffer_size,
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/operations.py", line 952, in _run_command
    error(message=msg, stdout=out, stderr=err)
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/utils.py", line 358, in error
    return func(message)
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/utils.py", line 54, in abort
    raise env.abort_exception(msg)
FabricException: run() received nonzero return code 1 while executing!

Requested: /usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d
Executed: /bin/bash -l -c "/usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d"

2017-05-17 11:11:29,520 ERRO| run() received nonzero return code 1 while executing!

Requested: /usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d
Executed: /bin/bash -l -c "/usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d"
2017-05-17 11:11:29,520 INFO| Printing out task report.
{
  "sub_reports": [],
  "exception": "FabricException('run() received nonzero return code 1 while executing!\\n\\nRequested: /usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d\\nExecuted: /bin/bash -l -c \"/usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d\"',)",
  "is_successful": false,
  "description": "run() received nonzero return code 1 while executing!\n\nRequested: /usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d\nExecuted: /bin/bash -l -c \"/usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l d,u,m,m,y,_,b,o,a,r,d\"",
  "arguments_used": {
    "board_labels": [
      "d",
      "u",
      "m",
      "m",
      "y",
      "_",
      "b",
      "o",
      "a",
      "r",
      "d"
    ],
    "host_server": "chromeos-skunk1.mtv.corp.google.com",
    "prod_master": "cautotest"
  },
  "task_name": "ShardAddToProdTask"
}

Comment 1

 
Status: WontFix (was: Untriaged)
./bin/run_server_task ShardAddToProdTask -h
usage: run_server_task.py ShardAddToProdTask [-h] --prod_master PROD_MASTER
                                             --host_server HOST_SERVER
                                             --board_labels BOARD_LABELS

optional arguments:
  -h, --help            show this help message and exit
  --prod_master PROD_MASTER
  --host_server HOST_SERVER
  --board_labels BOARD_LABELS


board_labels is a list
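
For reference, the -l d,u,m,m,y,_,b,o,a,r,d argument in the failing command is what you get when a plain string is joined as if it were a list. A minimal sketch of that failure mode and one possible way to avoid it, assuming an argparse-based CLI; parse_board_labels and build_parser are hypothetical names, not the actual run_server_task code or the fix that was uploaded:

```python
import argparse


def parse_board_labels(value):
    """Split a comma-separated string into a list of non-empty labels."""
    return [label.strip() for label in value.split(',') if label.strip()]


def build_parser():
    # Hypothetical parser: only the flag name comes from the help output above.
    parser = argparse.ArgumentParser(prog='run_server_task.py')
    parser.add_argument('--board_labels', required=True,
                        type=parse_board_labels)
    return parser


# Failure mode: joining a plain string iterates over its characters.
broken = ','.join('dummy_board')

# With the converter, the flag value is already a list when it is joined.
args = build_parser().parse_args(['--board_labels', 'dummy_board'])
fixed = ','.join(args.board_labels)
```

With this converter, several labels could be passed as one comma-separated value, and a single label stays intact instead of being split into characters.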

Comment 2 by nxia@chromium.org, May 17 2017

Status: Available (was: WontFix)
bin/run_server_task  ShardAddToProdTask --prod_master cautotest --host_server chromeos-skunk1.mtv.corp.google.com --board_labels ['dummy-board']  also doesn't work

Comment 3

There is a fix uploaded. After that change, try this:
bin/run_server_task  ShardAddToProdTask --prod_master cautotest --host_server chromeos-skunk1.mtv.corp.google.com --board_labels 'dummy-board'

Comment 4 by bugdroid1@chromium.org, May 17 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/bebcce22524a34a9945cd79980cfd34989e5bd4f

commit bebcce22524a34a9945cd79980cfd34989e5bd4f
Author: Shuqian Zhao <shuqianz@chromium.org>
Date: Wed May 17 19:02:49 2017

Comment 5 by nxia@chromium.org, May 17 2017

[cautotest] run: /usr/local/autotest//cli/atest label list dummy-board
2017-05-17 12:44:33,507 INFO| Connected (version 2.0, client OpenSSH_7.2)
2017-05-17 12:44:36,906 INFO| Authentication (publickey) successful!
[cautotest] out: Unknown label(s): 
[cautotest] out:         dummy-board
[cautotest] out: 

[cautotest] run: /usr/local/autotest//cli/atest label create dummy-board
[cautotest] out: Created label: 
[cautotest] out: 	'dummy-board'
[cautotest] out: 

[cautotest] run: /usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l dummy-board
[cautotest] out: Operation add_shard failed:
[cautotest] out:     RPCException: Sharding only supports `board:.*` label. (chromeos-skunk1.mtv.corp.google.com)
[cautotest] out: 


Fatal error: run() received nonzero return code 1 while executing!

Requested: /usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l dummy-board
Executed: /bin/bash -l -c "/usr/local/autotest//cli/atest shard create chromeos-skunk1.mtv.corp.google.com -l dummy-board"

Aborting.


Is the board format invalid?

Comment 6

It is a list of board labels, each in board:xxxx format.
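
A small sketch of checking labels against that format before calling atest shard create. The pattern is taken from the RPC error text above (board:.*); invalid_board_labels is a hypothetical helper, not part of atest or chromeos-admin:

```python
import re

# Pattern copied from the RPC error: "Sharding only supports `board:.*` label."
BOARD_LABEL_RE = re.compile(r'board:.*')


def invalid_board_labels(labels):
    """Return the labels that do not start with the board: prefix."""
    # re.match anchors at the start of the string, so 'dummy-board' fails.
    return [label for label in labels if not BOARD_LABEL_RE.match(label)]


rejected = invalid_board_labels(['dummy-board', 'board:xxxx'])
```

Running a check like this up front would reject dummy-board locally instead of failing partway through the task on the remote add_shard RPC.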

Comment 7 Deleted

Comment 8 by nxia@chromium.org, May 17 2017

Do you mean changing the command in #3 to

bin/run_server_task  ShardAddToProdTask --prod_master cautotest --host_server chromeos-skunk1.mtv.corp.google.com --board_labels 'board:dummy-board' ?

Comment 9

You can try it with board:nxia-dummy-board.

Comment 10 by bugdroid1@chromium.org, May 17 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/4e6f38f52692d898166be925346e3010c68dd2d3

commit 4e6f38f52692d898166be925346e3010c68dd2d3
Author: Shuqian Zhao <shuqianz@chromium.org>
Date: Wed May 17 19:59:44 2017

Comment 11 by nxia@chromium.org, May 17 2017



[chromeos-skunk1.mtv.corp.google.com] sudo: service autotest_error_log_metricsgs_offloader stop
[chromeos-skunk1.mtv.corp.google.com] out: autotest_error_log_metricsgs_offloader: unrecognized service
[chromeos-skunk1.mtv.corp.google.com] out: 


Warning: sudo() received nonzero return code 1 while executing 'service autotest_error_log_metricsgs_offloader stop'!

[chromeos-skunk1.mtv.corp.google.com] sudo: service autotest_error_log_metricsgs_offloader start
[chromeos-skunk1.mtv.corp.google.com] out: autotest_error_log_metricsgs_offloader: unrecognized service
[chromeos-skunk1.mtv.corp.google.com] out: 


Fatal error: sudo() received nonzero return code 1 while executing!

Requested: service autotest_error_log_metricsgs_offloader start
Executed: sudo -S -p 'sudo password:'  /bin/bash -l -c "service autotest_error_log_metricsgs_offloader start"

Aborting.
2017-05-17 12:53:19,306 ERRO| Traceback (most recent call last):
  File "/usr/local/google/home/nxia/chromiumos/chromeos-admin/venv/server_management_lib/tasks/atomic_common.py", line 126, in decorated_func
    func(self)
  File "/usr/local/google/home/nxia/chromiumos/chromeos-admin/venv/server_management_lib/tasks/atomic_shard.py", line 309, in run
    self.services_restart_task._restart_services_on_shard()
  File "/usr/local/google/home/nxia/chromiumos/chromeos-admin/venv/server_management_lib/tasks/atomic_shard.py", line 166, in _restart_services_on_shard
    atomic_common.safe_restart_service(self.host_server, service)
  File "/usr/local/google/home/nxia/chromiumos/chromeos-admin/venv/server_management_lib/tasks/atomic_common.py", line 155, in safe_restart_service
    api.sudo('service %s start' % service_name)
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/network.py", line 677, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/operations.py", line 1146, in sudo
    capture_buffer_size=capture_buffer_size,
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/operations.py", line 952, in _run_command
    error(message=msg, stdout=out, stderr=err)
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/utils.py", line 358, in error
    return func(message)
  File "/usr/local/google/home/nxia/.cache/cros_venv/venv-2.7.6-0592ea92663037bb624f94a82dbb4eb5/local/lib/python2.7/site-packages/fabric/utils.py", line 54, in abort
    raise env.abort_exception(msg)
FabricException: sudo() received nonzero return code 1 while executing!

Requested: service autotest_error_log_metricsgs_offloader start
Executed: sudo -S -p 'sudo password:'  /bin/bash -l -c "service autotest_error_log_metricsgs_offloader start"


Why is autotest_error_log_metricsgs_offloader an unrecognized service?

Owner: pho...@chromium.org
Summary: autotest_error_log_metricsgs_offloader is unrecognized service on new server (was: ShardAddToProdTask failed to add a dummy board )
Labels: -Pri-1 Pri-0
I think there's a missing comma in this CL: https://chrome-internal-review.googlesource.com/c/376949/2/venv/server_management_lib/tasks/atomic_shard.py

Making a CL to fix this now.
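
To illustrate how a missing comma can produce this exact service name: Python concatenates adjacent string literals, so two list entries silently fuse into one. A minimal sketch, assuming the list contained 'autotest_error_log_metrics' followed by 'gs_offloader' (inferred from the fused name in the log; the real list in atomic_shard.py is not shown here):

```python
# Hypothetical service list. Without a comma after the first literal,
# Python joins the two adjacent strings into a single element.
SERVICES_WITH_BUG = [
    'autotest_error_log_metrics'
    'gs_offloader',
]

# With the comma restored, the list has two separate service names.
SERVICES_FIXED = [
    'autotest_error_log_metrics',
    'gs_offloader',
]
```

No syntax error or warning is raised for the first form, which is why the problem only shows up at runtime as an unrecognized service.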

Comment 15 by bugdroid1@chromium.org, May 17 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/098b3f0b1e14e25cd1e51eddbf4709b971c0b020

commit 098b3f0b1e14e25cd1e51eddbf4709b971c0b020
Author: Paul Hobbs <phobbs@google.com>
Date: Wed May 17 21:25:06 2017

Comment 16 by nxia@chromium.org, May 17 2017

Cc: pho...@chromium.org
Owner: shuqianz@chromium.org
The previous ShardAddToProdTask run failed while restarting the services because of the typo bug (which Paul has already fixed). What cleanup tasks should I run before running ShardAddToProdTask again?

Now ShardAddToProdTask fails because the shard was already added, even though the previous add operation clearly failed partway through.

Comment 17 by nxia@chromium.org, May 17 2017

Labels: -Pri-0 Pri-1
Status: Fixed (was: Available)
