Issue 814347

Issue metadata

Status: Archived
Owner:
Closed: Jul 20
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



falco_li-release: NotEnoughDutsError

Project Member Reported by norvez@chromium.org, Feb 21 2018

Issue description


Failing for ~2 weeks: https://luci-milo.appspot.com/buildbot/chromeos/falco_li-release/?limit=200

"
05:04:53: WARNING: Exception is not retriable return code: 3; command: /b/c/cbuild/repository/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /tmp/cbuildbot-tmphmSruj/tmpgOu64B/temp_summary.json --raw-cmd --task-name falco_li-release/R66-10424.0.0-bvt-inline --dimension os Ubuntu-14.04 --dimension pool default --print-status-updates --timeout 14400 --io-timeout 14400 --hard-timeout 14400 --expiration 1200 '--tags=priority:Build' '--tags=suite:bvt-inline' '--tags=build:falco_li-release/R66-10424.0.0' '--tags=task_name:falco_li-release/R66-10424.0.0-bvt-inline' '--tags=board:falco_li' -- /usr/local/autotest/site_utils/run_suite.py --build falco_li-release/R66-10424.0.0 --board falco_li --suite_name bvt-inline --pool bvt --file_bugs True --priority Build --timeout_mins 180 --retry True --max_retries 5 --minimum_duts 4 --suite_min_duts 6 --offload_failures_only False --job_keyvals "{'cidb_build_stage_id': 71307843L, 'cidb_build_id': 2317924, 'datastore_parent_key': ('Build', 2317924, 'BuildStage', 71307843L)}" -c
Priority was reset to 100
Triggered task: falco_li-release/R66-10424.0.0-bvt-inline
chromeos-golo-server1-84: 3bd148f226c28610 3
  Autotest instance created: cautotest-prod
  TestLabException: Not enough DUTs for board: falco_li, pool: bvt; required: 4, found: 3
  Traceback (most recent call last):
    File "/usr/local/autotest/site_utils/run_suite.py", line 2034, in _run_task
      return _run_suite(options)
    File "/usr/local/autotest/site_utils/run_suite.py", line 1775, in _run_suite
      options.skip_duts_check)
    File "/usr/local/autotest/site_utils/diagnosis_utils.py", line 330, in check_dut_availability
      hosts=hosts)
  NotEnoughDutsError: Not enough DUTs for board: falco_li, pool: bvt; required: 4, found: 3
  Will return from run_suite with status: INFRA_FAILURE
cmd=['/b/c/cbuild/repository/chromite/third_party/swarming.client/swarming.py', 'run', '--swarming', 'chromeos-proxy.appspot.com', '--task-summary-json', '/tmp/cbuildbot-tmphmSruj/tmpgOu64B/temp_summary.json', '--raw-cmd', '--task-name', u'falco_li-release/R66-10424.0.0-bvt-inline', '--dimension', 'os', 'Ubuntu-14.04', '--dimension', 'pool', 'default', '--print-status-updates', '--timeout', '14400', '--io-timeout', '14400', '--hard-timeout', '14400', '--expiration', '1200', u'--tags=priority:Build', u'--tags=suite:bvt-inline', u'--tags=build:falco_li-release/R66-10424.0.0', u'--tags=task_name:falco_li-release/R66-10424.0.0-bvt-inline', u'--tags=board:falco_li', '--', '/usr/local/autotest/site_utils/run_suite.py', '--build', u'falco_li-release/R66-10424.0.0', '--board', u'falco_li', '--suite_name', u'bvt-inline', '--pool', u'bvt', '--file_bugs', 'True', '--priority', 'Build', '--timeout_mins', '180', '--retry', 'True', '--max_retries', '5', '--minimum_duts', '4', '--suite_min_duts', '6', '--offload_failures_only', 'False', '--job_keyvals', "{'cidb_build_stage_id': 71307843L, 'cidb_build_id': 2317924, 'datastore_parent_key': ('Build', 2317924, 'BuildStage', 71307843L)}", '-c']
Autotest instance created: cautotest-prod
TestLabException: Not enough DUTs for board: falco_li, pool: bvt; required: 4, found: 3
Traceback (most recent call last):
  File "/usr/local/autotest/site_utils/run_suite.py", line 2034, in _run_task
    return _run_suite(options)
  File "/usr/local/autotest/site_utils/run_suite.py", line 1775, in _run_suite
    options.skip_duts_check)
  File "/usr/local/autotest/site_utils/diagnosis_utils.py", line 330, in check_dut_availability
    hosts=hosts)
NotEnoughDutsError: Not enough DUTs for board: falco_li, pool: bvt; required: 4, found: 3
Will return from run_suite with status: INFRA_FAILURE
"
 
Owner: pprabhu@chromium.org
Cc: matthewmwang@chromium.org ejcaruso@chromium.org
Owner: nxia@chromium.org

Comment 3 by nxia@chromium.org, Mar 3 2018

chromeos6-row1-rack11-host5 is in "repairing" status, but it hasn't run any job since 2018-02-22.

Comment 4 by nxia@chromium.org, Mar 3 2018

chromeos6-row1-rack11-host15 has been failing to verify and repair itself. 

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos6-row1-rack11-host15/274001-repair


Comment 5 by nxia@chromium.org, Mar 3 2018

Cc: dgarr...@chromium.org
Filed a repair ticket at b/74127716.

Comment 6 by nxia@chromium.org, Mar 5 2018

Owner: pprabhu@chromium.org
The lab lacks healthy falco_li boards; please work with the lab to bring up more of them.
The problem went deeper than "not enough working DUTs".  Just now,
there were only three DUTs (all working) in the bvt pool.  The standard
minimum is 6 DUTs.  See here:
    https://sites.google.com/a/google.com/chromeos/for-team-members/infrastructure/chromeos-admin/creating-pools

So, even after DUTs were repaired, the builder stayed red.
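
To illustrate why repairs alone could not help, here is a minimal sketch of the kind of availability check that produces this error. It is not the actual code in diagnosis_utils.py: the Dut class, the function signature, and the logic are assumptions reconstructed only from the error message in the log above.

```python
from dataclasses import dataclass


class NotEnoughDutsError(Exception):
    """Hypothetical stand-in for the autotest exception of the same name."""


@dataclass
class Dut:
    """Hypothetical record for one device under test."""
    hostname: str
    working: bool


def check_dut_availability(board, pool, duts, minimum_duts):
    """Return the working DUTs, or raise if there are fewer than minimum_duts.

    Assumed behavior: only working DUTs count toward the minimum.
    """
    available = [d for d in duts if d.working]
    if len(available) < minimum_duts:
        raise NotEnoughDutsError(
            "Not enough DUTs for board: %s, pool: %s; required: %d, found: %d"
            % (board, pool, minimum_duts, len(available)))
    return available


if __name__ == "__main__":
    # A pool of three DUTs, all healthy, still fails a minimum of four.
    small_pool = [Dut("host-%d" % i, working=True) for i in range(3)]
    try:
        check_dut_availability("falco_li", "bvt", small_pool, minimum_duts=4)
    except NotEnoughDutsError as e:
        print(e)
```

With three DUTs in the pool, the check fails for --minimum_duts 4 no matter how healthy they are, so the fix has to be a larger pool rather than more repairs.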

I resized the pool to 6 DUTs; here's the full current status:
$ for p in bvt suites; do dut-status -b falco_li -p $p; done
hostname                       S   last checked         URL
chromeos6-row1-rack11-host5    OK  2018-03-23 07:10:38  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host5/292197-reset/
chromeos6-row1-rack11-host7    OK  2018-03-23 10:34:50  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host7/292301-repair/
chromeos6-row1-rack11-host13   OK  2018-03-23 13:54:13  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host13/292376-provision/
chromeos6-row1-rack11-host15   OK  2018-03-23 06:59:04  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host15/292187-reset/
chromeos6-row1-rack11-host17   OK  2018-03-23 10:36:09  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host17/292304-reset/
chromeos6-row1-rack11-host19   OK  2018-03-23 10:35:21  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host19/292303-repair/
hostname                       S   last checked         URL
chromeos6-row1-rack11-host11   NO  2018-01-30 12:05:38  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host11/226198-repair/
chromeos6-row1-rack11-host9    NO  2018-03-23 14:46:38  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host9/292398-repair/
chromeos6-row1-rack11-host21   NO  2018-03-23 14:46:27  http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack11-host21/292397-repair/
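
For anyone who wants to tally a listing like this, the following is a rough sketch that counts DUTs per status. It assumes the column layout shown above (hostname, status, date, time, URL, with a header row starting with "hostname"); count_by_status is a hypothetical helper, not part of dut-status, and the sample hostnames and URLs are made up.

```python
def count_by_status(dut_status_output):
    """Count rows per status code (e.g. OK, NO) in dut-status style text.

    Assumes whitespace-separated columns with the status in the second
    column; header rows (first field "hostname") and short lines are skipped.
    """
    counts = {}
    for line in dut_status_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] == "hostname":
            continue
        status = fields[1]
        counts[status] = counts.get(status, 0) + 1
    return counts


if __name__ == "__main__":
    # Made-up sample in the same shape as the output above.
    sample = """\
hostname   S   last checked         URL
host-a     OK  2018-03-23 07:10:38  http://example.invalid/a
host-b     OK  2018-03-23 10:34:50  http://example.invalid/b
hostname   S   last checked         URL
host-c     NO  2018-01-30 12:05:38  http://example.invalid/c
"""
    print(count_by_status(sample))  # {'OK': 2, 'NO': 1}
```

Note that the sketch lumps both pools together; to compare against a per-pool minimum, run it on each pool's output separately.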

Components: Infra>Client>ChromeOS>CI
Components: -Infra>Client>ChromeOS
Status: Archived (was: Assigned)
