Issue 923737

Issue metadata

Status: Fixed
Owner:
Closed: Today
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 0
Type: Bug



hwtest is timing out repeatedly in the same set of paladins

Project Member Reported by semenzato@chromium.org, Jan 20 (2 days ago)

Issue description

The same set of CQ builders has timed out in the last two runs, with little additional information.

The set includes:

auron_payne
auron_yuna
bob
caroline-arcnext
cave
cyan
edgar
elm
eve
gale
guado_moblab
hana
kevin-arcnext
kevin
kip
nocturne
nyan_big
nyan_kitty
peach_pit
peppy
scarlet
tidus
veyron_mighty
veyron_minnie
veyron_speedy
winky
wizpig
wolf

Most other paladins succeeded.

The tests don't seem to even get started.

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8923814024426578080

************************************************************
** Start Stage HWTest [provision] - Sun, 20 Jan 2019 05:00:41 -0800 (PST)
** 
** Stage that runs tests in the Autotest lab.
************************************************************
05:00:41: INFO: Created cidb engine bot@cosmic-strategy-646:cidb-gen2 for pid 12655
05:00:41: INFO: Running cidb query on pid 12655, repr(query) starts with <sqlalchemy.sql.expression.Update object at 0x7f37efaa3550>
05:00:41: INFO: Waiting up to forever for payloads and test artifacts ...
Preconditions for the stage successfully met. Beginning to execute stage...
05:05:54: INFO: Running cidb query on pid 12655, repr(query) starts with <sqlalchemy.sql.expression.Update object at 0x7f37eeefba90>
05:05:54: INFO: Re-run swarming_cmd to avoid buildbot salency check.
05:05:54: INFO: RunCommand: /b/swarming/w/ir/cache/cbuild/repository/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /b/swarming/w/ir/tmp/t/cbuildbot-tmpXGmlCW/tmpHgNNIF/temp_summary.json --print-status-updates --timeout 9000 --raw-cmd --task-name kevin-paladin/R73-11617.0.0-rc1-provision --dimension os Ubuntu-14.04 --dimension pool default --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=priority:CQ' '--tags=suite:provision' '--tags=build:kevin-paladin/R73-11617.0.0-rc1' '--tags=task_name:kevin-paladin/R73-11617.0.0-rc1-provision' '--tags=board:kevin' -- /usr/local/autotest/site_utils/run_suite.py --build kevin-paladin/R73-11617.0.0-rc1 --board kevin --suite_name provision --pool cq --file_bugs False --priority CQ --timeout_mins 90 --retry True --max_retries 5 --minimum_duts 4 --suite_args "{u'num_required': 1}" --offload_failures_only False --job_keyvals "{'cidb_build_stage_id': 108032992L, 'cidb_build_id': 3380273, 'datastore_parent_key': ('Build', 3380273, 'BuildStage', 108032992L)}" --test_args "{'fast': 'True'}" -c
Autotest instance created: cautotest-prod
01-20-2019 [05:06:00] Submitted create_suite_job rpc
01-20-2019 [05:06:12] Created suite job: http://cautotest-prod/afe/#tab_id=view_job&object_id=278589926
--create_and_return was specified, terminating now.
Will return from run_suite with status: OK
05:06:15: INFO: Re-run swarming_cmd to avoid buildbot salency check.
05:06:15: INFO: RunCommand: /b/swarming/w/ir/cache/cbuild/repository/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /b/swarming/w/ir/tmp/t/cbuildbot-tmpXGmlCW/tmpibonlI/temp_summary.json --print-status-updates --timeout 9000 --raw-cmd --task-name kevin-paladin/R73-11617.0.0-rc1-provision --dimension os Ubuntu-14.04 --dimension pool default --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=priority:CQ' '--tags=suite:provision' '--tags=build:kevin-paladin/R73-11617.0.0-rc1' '--tags=task_name:kevin-paladin/R73-11617.0.0-rc1-provision' '--tags=board:kevin' -- /usr/local/autotest/site_utils/run_suite.py --build kevin-paladin/R73-11617.0.0-rc1 --board kevin --suite_name provision --pool cq --file_bugs False --priority CQ --timeout_mins 90 --retry True --max_retries 5 --minimum_duts 4 --suite_args "{u'num_required': 1}" --offload_failures_only False --job_keyvals "{'cidb_build_stage_id': 108032992L, 'cidb_build_id': 3380273, 'datastore_parent_key': ('Build', 3380273, 'BuildStage', 108032992L)}" --test_args "{'fast': 'True'}" -m 278589926
05:48:06: INFO: OAuth token TTL expired, auto-refreshing (attempt 1/2)
05:48:06: INFO: Refreshing access_token
05:52:39: INFO: OAuth token TTL expired, auto-refreshing (attempt 1/2)
05:52:39: INFO: Refreshing access_token
05:57:30: INFO: OAuth token TTL expired, auto-refreshing (attempt 1/2)
05:57:30: INFO: Refreshing access_token
06:43:16: INFO: Re-run swarming_cmd to avoid buildbot salency check.
06:43:16: INFO: RunCommand: /b/swarming/w/ir/cache/cbuild/repository/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /b/swarming/w/ir/tmp/t/cbuildbot-tmpXGmlCW/tmpibonlI/temp_summary.json --print-status-updates --timeout 9000 --raw-cmd --task-name kevin-paladin/R73-11617.0.0-rc1-provision --dimension os Ubuntu-14.04 --dimension pool default --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=priority:CQ' '--tags=suite:provision' '--tags=build:kevin-paladin/R73-11617.0.0-rc1' '--tags=task_name:kevin-paladin/R73-11617.0.0-rc1-provision' '--tags=board:kevin' -- /usr/local/autotest/site_utils/run_suite.py --build kevin-paladin/R73-11617.0.0-rc1 --board kevin --suite_name provision --pool cq --file_bugs False --priority CQ --timeout_mins 90 --retry True --max_retries 5 --minimum_duts 4 --suite_args "{u'num_required': 1}" --offload_failures_only False --job_keyvals "{'cidb_build_stage_id': 108032992L, 'cidb_build_id': 3380273, 'datastore_parent_key': ('Build', 3380273, 'BuildStage', 108032992L)}" --test_args "{'fast': 'True'}" -m 278589926
06:48:10: INFO: OAuth token TTL expired, auto-refreshing (attempt 1/2)
06:48:10: INFO: Refreshing access_token
06:52:43: INFO: OAuth token TTL expired, auto-refreshing (attempt 1/2)
06:52:43: INFO: Refreshing access_token
06:57:38: INFO: OAuth token TTL expired, auto-refreshing (attempt 1/2)
06:57:38: INFO: Refreshing access_token
07:03:32: WARNING: Exception is not retriable return code: 1; command: /b/swarming/w/ir/cache/cbuild/repository/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /b/swarming/w/ir/tmp/t/cbuildbot-tmpXGmlCW/tmpibonlI/temp_summary.json --print-status-updates --timeout 9000 --raw-cmd --task-name kevin-paladin/R73-11617.0.0-rc1-provision --dimension os Ubuntu-14.04 --dimension pool default --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=priority:CQ' '--tags=suite:provision' '--tags=build:kevin-paladin/R73-11617.0.0-rc1' '--tags=task_name:kevin-paladin/R73-11617.0.0-rc1-provision' '--tags=board:kevin' -- /usr/local/autotest/site_utils/run_suite.py --build kevin-paladin/R73-11617.0.0-rc1 --board kevin --suite_name provision --pool cq --file_bugs False --priority CQ --timeout_mins 90 --retry True --max_retries 5 --minimum_duts 4 --suite_args "{u'num_required': 1}" --offload_failures_only False --job_keyvals "{'cidb_build_stage_id': 108032992L, 'cidb_build_id': 3380273, 'datastore_parent_key': ('Build', 3380273, 'BuildStage', 108032992L)}" --test_args "{'fast': 'True'}" -m 278589926
Triggered task: kevin-paladin/R73-11617.0.0-rc1-provision
Waiting for results from the following shards: 0
N/A: 428488c2fe627210 None

cmd=['/b/swarming/w/ir/cache/cbuild/repository/chromite/third_party/swarming.client/swarming.py', 'run', '--swarming', 'chromeos-proxy.appspot.com', '--task-summary-json', '/b/swarming/w/ir/tmp/t/cbuildbot-tmpXGmlCW/tmpibonlI/temp_summary.json', '--print-status-updates', '--timeout', '9000', '--raw-cmd', '--task-name', u'kevin-paladin/R73-11617.0.0-rc1-provision', '--dimension', 'os', 'Ubuntu-14.04', '--dimension', 'pool', 'default', '--io-timeout', '9000', '--hard-timeout', '9000', '--expiration', '1200', u'--tags=priority:CQ', u'--tags=suite:provision', u'--tags=build:kevin-paladin/R73-11617.0.0-rc1', u'--tags=task_name:kevin-paladin/R73-11617.0.0-rc1-provision', u'--tags=board:kevin', '--', '/usr/local/autotest/site_utils/run_suite.py', '--build', u'kevin-paladin/R73-11617.0.0-rc1', '--board', u'kevin', '--suite_name', u'provision', '--pool', u'cq', '--file_bugs', 'False', '--priority', 'CQ', '--timeout_mins', '90', '--retry', 'True', '--max_retries', '5', '--minimum_duts', '4', '--suite_args', "{u'num_required': 1}", '--offload_failures_only', 'False', '--job_keyvals', "{'cidb_build_stage_id': 108032992L, 'cidb_build_id': 3380273, 'datastore_parent_key': ('Build', 3380273, 'BuildStage', 108032992L)}", '--test_args', "{'fast': 'True'}", '-m', '278589926']
07:03:32: INFO: No json dump found, no HWTest results to report
07:03:32: INFO: Running cidb query on pid 12655, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7f37eef046d0>

07:03:32: ERROR: ** HWTest failed (code 1) **

 

Comment 1 by semenzato@chromium.org, Jan 20 (2 days ago)

Labels: -Pri-1 Pri-0
Making this Pri-0 because it needs attention first.

Comment 2 by jclinton@chromium.org, Jan 20 (2 days ago)

Components: -Infra>ChromeOS>CI
Labels: OS-Chrome
Owner: akes...@chromium.org
Status: Assigned (was: Untriaged)

Comment 3 by akes...@chromium.org, Today (11 hours ago)

Owner: zamorzaev@chromium.org
Possibly due to Issue 924181, which is being addressed.

Handing off to incoming deputy.

Comment 4 by zamorzaev@chromium.org, Today (11 hours ago)

Looks like issue 924181 started around 8pm on Fri, so it's likely that tests stopped being scheduled at that point.

Comment 5 by zamorzaev@chromium.org, Today (11 hours ago)

The tests should be scheduling now.

Comment 6 by zamorzaev@chromium.org, Today (10 hours ago)

Labels: Hotlist-Deputy
Status: Fixed (was: Assigned)
The lab servers are recovering. Builder failures are still possible until the lab servers work through the backlog of tasks that accumulated since the start of the outage.
