Issue 767674

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



run_suite erroneously thought that there were insufficient DUTs

Project Member Reported by akes...@chromium.org, Sep 22 2017

Issue description


https://luci-milo.appspot.com/buildbot/chromeos/kevin-paladin/2481

15:34:01: WARNING: Exception is not retriable return code: 3; command: /b/c/cbuild/repository/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /tmp/cbuildbot-tmpvNriU3/tmpWkNi7B/temp_summary.json --raw-cmd --task-name kevin-paladin/R63-9962.0.0-rc3-bvt-inline --dimension os Ubuntu-14.04 --dimension pool default --print-status-updates --timeout 9000 --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=priority:CQ' '--tags=suite:bvt-inline' '--tags=build:kevin-paladin/R63-9962.0.0-rc3' '--tags=task_name:kevin-paladin/R63-9962.0.0-rc3-bvt-inline' '--tags=board:kevin' -- /usr/local/autotest/site_utils/run_suite.py --build kevin-paladin/R63-9962.0.0-rc3 --board kevin --suite_name bvt-inline --pool cq --num 6 --file_bugs False --priority CQ --timeout_mins 90 --retry True --max_retries 5 --minimum_duts 4 --offload_failures_only True --job_keyvals "{'cidb_build_stage_id': 56738354L, 'cidb_build_id': 1872228, 'datastore_parent_key': ('Build', 1872228, 'BuildStage', 56738354L)}" -c
Priority was reset to 100
Triggered task: kevin-paladin/R63-9962.0.0-rc3-bvt-inline
chromeos-golo-server2-90: 38bf65062c300e10 3
  Autotest instance created: cautotest-prod
  TestLabException: Not enough DUTs for board: kevin, pool: cq; required: 4, found: 0
  Traceback (most recent call last):
    File "/usr/local/autotest/site_utils/run_suite.py", line 1986, in _run_task
      return _run_suite(options)
    File "/usr/local/autotest/site_utils/run_suite.py", line 1731, in _run_suite
      options.skip_duts_check)
    File "/usr/local/autotest/site_utils/diagnosis_utils.py", line 336, in check_dut_availability
      hosts=hosts)
  NotEnoughDutsError: Not enough DUTs for board: kevin, pool: cq; required: 4, found: 0
  Will return from run_suite with status: INFRA_FAILURE


However, when I check dut-status, I see a bunch of totally happy looking kevin DUTs.

$ dut-status -o -b kevin -p cq
hostname                       S   last checked         URL
chromeos2-row6-rack5-host2     OK  2017-09-21 13:32:11  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row6-rack5-host2/1801440-reset/
chromeos2-row6-rack5-host6     OK  2017-09-21 13:32:11  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row6-rack5-host6/1801436-reset/
chromeos2-row6-rack5-host7     OK  2017-09-21 13:32:11  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row6-rack5-host7/1801438-reset/
chromeos2-row6-rack3-host9     OK  2017-09-21 13:32:02  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row6-rack3-host9/1801434-reset/
chromeos2-row6-rack3-host8     OK  2017-09-21 13:30:59  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row6-rack3-host8/1801395-reset/
chromeos2-row8-rack9-host15    OK  2017-09-21 13:32:02  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row8-rack9-host15/1801433-reset/
chromeos2-row8-rack9-host17    OK  2017-09-21 13:07:58  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row8-rack9-host17/1801103-provision/
chromeos2-row8-rack9-host21    OK  2017-09-21 13:32:11  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row8-rack9-host21/1801437-reset/
chromeos2-row8-rack8-host2     OK  2017-09-21 13:32:02  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row8-rack8-host2/1801435-reset/
chromeos2-row6-rack5-host17    OK  2017-09-21 13:33:36  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row6-rack5-host17/1801471-provision/

 
Cc: xixuan@chromium.org

Comment 2 by xixuan@chromium.org, Sep 22 2017

Weird logging:
 Autotest instance created: cautotest-prod

What's cautotest-prod? It should be cautotest.

Comment 3 by xixuan@chromium.org, Sep 22 2017

Cc: pprabhu@chromium.org shuqianz@chromium.org

Comment 4 by xixuan@chromium.org, Sep 22 2017

Owner: pprabhu@chromium.org
Oh, it's imported here: https://chrome-internal-review.googlesource.com/c/chromeos/chromeos-admin/+/456212

Comment 5 by xixuan@chromium.org, Sep 22 2017

Owner: ----
Hmm, OK, it's not the CNAME issue. These devices were locked by @duenasa at 14:39:34.
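
For anyone reading this later, the mismatch between dut-status and run_suite can be sketched as follows. This is a minimal illustration, not the actual diagnosis_utils.py code: it assumes the availability check counts only hosts that are both healthy and unlocked, which is what the behavior in this bug suggests. `Host`, `usable_hosts`, and `check_availability_sketch` are hypothetical names; only the error text is copied from the traceback.

```python
from dataclasses import dataclass
from typing import Iterable, List


class NotEnoughDutsError(Exception):
    """Raised when too few usable DUTs are found (name mirrors the traceback)."""


@dataclass(frozen=True)
class Host:
    # Hypothetical minimal host record; the real AFE host object differs.
    hostname: str
    status: str   # last known health, e.g. "OK" as printed by dut-status
    locked: bool  # True if someone has locked the host


def usable_hosts(hosts: Iterable[Host]) -> List[Host]:
    """Return hosts that are healthy and not locked (assumed rule)."""
    return [h for h in hosts if h.status == "OK" and not h.locked]


def check_availability_sketch(board: str, pool: str,
                              hosts: Iterable[Host],
                              minimum_duts: int) -> int:
    """Raise NotEnoughDutsError if fewer than minimum_duts hosts are usable."""
    found = len(usable_hosts(hosts))
    if found < minimum_duts:
        raise NotEnoughDutsError(
            "Not enough DUTs for board: %s, pool: %s; required: %d, found: %d"
            % (board, pool, minimum_duts, found))
    return found
```

Under that assumption, ten hosts that all report OK but are all locked yield "required: 4, found: 0", matching the failure in the issue description.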

Comment 6 by xixuan@chromium.org, Sep 22 2017

Owner: duenasa@google.com

Comment 7 by xixuan@chromium.org, Sep 22 2017

OK, I have unlocked all of them. It should be fine now.

@duenasa, could you tell us why you locked all of these pool:cq kevin DUTs?

Comment 8 by duenasa@google.com, Sep 22 2017

The devices that are locked are DVT devices that will be decommissioned. There are PVT and some MP devices in chromeos6. 

Comment 9 by xixuan@chromium.org, Sep 22 2017

Cc: duenasa@google.com
Owner: xixuan@chromium.org
Sure, I will move 10 DUTs in chromeos6 from pool:suites to pool:cq. Until that is done, please don't lock the DUTs. I will assign this back to you after the move is done; then you can lock them.

Next time, before decommissioning and locking any pool:cq or pool:bvt DUTs, please first contact the ChromeOS deputy to confirm that there are enough spare DUTs for them in the same pool. Thanks!
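
The pre-lock check requested here could be sketched like this. It is only an illustration under assumptions: the pool is modeled as a plain hostname-to-locked mapping, the threshold of 4 is taken from the --minimum_duts 4 flag in the run_suite command in this issue, and `remaining_after_lock` and `safe_to_lock` are hypothetical helpers, not existing lab tools.

```python
from typing import Dict, Iterable


def remaining_after_lock(pool: Dict[str, bool], to_lock: Iterable[str]) -> int:
    """Count hosts that would still be unlocked after locking `to_lock`.

    `pool` maps hostname -> currently-locked flag (hypothetical shape).
    """
    lock_set = set(to_lock)
    return sum(1 for name, locked in pool.items()
               if not locked and name not in lock_set)


def safe_to_lock(pool: Dict[str, bool], to_lock: Iterable[str],
                 minimum_duts: int) -> bool:
    """True if at least `minimum_duts` unlocked hosts remain in the pool."""
    return remaining_after_lock(pool, to_lock) >= minimum_duts
```

With only the ten chromeos2 hosts in the pool, locking all of them leaves zero unlocked hosts, so the check fails for a minimum of 4. Once replacement hosts are added to the same pool, it passes.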


Status: Fixed (was: Untriaged)
All DUTs in pool:cq are now in chromeos6. I have already locked all of the previous chromeos2 cq DUTs. Marking this as fixed.
