Unexplained string of timeouts in CtsAccelerationTestCases
Issue description
Eg:
https://luci-milo.appspot.com/buildbot/chromeos/auron_paine-release/2031
https://luci-milo.appspot.com/buildbot/chromeos/coral-release/848
https://luci-milo.appspot.com/buildbot/chromeos/fizz-release/1093
https://luci-milo.appspot.com/buildbot/chromeos/nautilus-release/470
I'm not sure if this info is meaningful:
Suite job [ FAILED ]
Suite job ABORT: ...
01:18:34: ERROR: wait_cmd has lab failures: cmd=['/b/c/cbuild/repository/chromite/third_party/swarming.client/swarming.py', 'run', '--swarming', 'chromeos-proxy.appspot.com', '--task-summary-json', '/tmp/cbuildbot-tmpphVNDr/tmpH7Tsml/temp_summary.json', '--raw-cmd', '--task-name', u'nautilus-release/R67-10503.0.0-bvt-arc', '--dimension', 'os', 'Ubuntu-14.04', '--dimension', 'pool', 'default', '--print-status-updates', '--timeout', '14400', '--io-timeout', '14400', '--hard-timeout', '14400', '--expiration', '1200', u'--tags=priority:Build', u'--tags=suite:bvt-arc', u'--tags=build:nautilus-release/R67-10503.0.0', u'--tags=task_name:nautilus-release/R67-10503.0.0-bvt-arc', u'--tags=board:nautilus', '--', '/usr/local/autotest/site_utils/run_suite.py', '--build', u'nautilus-release/R67-10503.0.0', '--board', u'nautilus', '--suite_name', u'bvt-arc', '--pool', u'bvt', '--file_bugs', 'True', '--priority', 'Build', '--timeout_mins', '180', '--retry', 'True', '--max_retries', '5', '--minimum_duts', '4', '--suite_min_duts', '6', '--offload_failures_only', 'False', '--job_keyvals', "{'cidb_build_stage_id': 73905518L, 'cidb_build_id': 2396732, 'datastore_parent_key': ('Build', 2396732, 'BuildStage', 73905518L)}", '-m', '184953758'].
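For readers trying to pick the time limits out of the logged command, here is a minimal sketch that reads them from an argument list. It assumes (this is not stated in the log) that the swarming timeouts are given in seconds; `flag_value` is a hypothetical helper, and `argv` is an abbreviated copy of the logged command.

```python
def flag_value(argv, flag):
    """Return the argument that follows `flag`, or None if the flag is absent."""
    try:
        return argv[argv.index(flag) + 1]
    except (ValueError, IndexError):
        return None


# Abbreviated copy of the arguments from the logged command above.
argv = [
    '--timeout', '14400', '--io-timeout', '14400',
    '--hard-timeout', '14400', '--expiration', '1200',
    '--', '/usr/local/autotest/site_utils/run_suite.py',
    '--timeout_mins', '180', '--max_retries', '5',
]

# Assumption: swarming timeouts are in seconds, so convert to minutes.
swarming_hard_timeout_min = int(flag_value(argv, '--hard-timeout')) // 60
suite_timeout_min = int(flag_value(argv, '--timeout_mins'))

print(swarming_hard_timeout_min, suite_timeout_min)
```

Under that assumption, the run_suite limit (`--timeout_mins`) is shorter than the swarming hard timeout, so the suite-level limit would be expected to fire first.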
,
Mar 20 2018
> Suite job [ FAILED ]
> Suite job ABORT:

This is the proximate cause of the failure. It means that the suite job timed out and aborted. Normally, the logs that follow should show which individual tests were affected.

The output from run_suite shows only PASSED test results, but that's a lie; if you look at the actual suite jobs (not the output from run_suite) you find that in every case, cheets_CTS_N.7.1_r15.x86.CtsAccelerationTestCases aborted. That's likely the cause of the overall suite timeout.

So, the root cause of the problem is a failure in the CTS acceleration test. Secondarily, there's a bug that our logs didn't actually tell us so.

Given that the problem went away quickly, the first question to ask would be "did anyone find and revert a change causing that sort of failure?"
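The check described above (looking at the per-test jobs rather than the run_suite summary) can be sketched roughly as follows. This is an illustration only: the `(name, status)` record shape, the status strings, and the second test name are hypothetical and do not reflect the actual AFE data model.

```python
def non_passing(results):
    """Return the (name, status) pairs whose status is anything but PASSED."""
    return [(name, status) for name, status in results if status != "PASSED"]


# Hypothetical per-test job records for one suite run.
results = [
    ("cheets_CTS_N.7.1_r15.x86.CtsAccelerationTestCases", "ABORT"),
    ("hypothetical_OtherTest", "PASSED"),
]

for name, status in non_passing(results):
    print(f"{name}: {status}")
```

A summary that lists only the PASSED entries would hide exactly the record this filter surfaces.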
,
Mar 20 2018
For reference, the auron_paine suite job that aborted:
http://cautotest-prod/afe/#tab_id=view_job&object_id=184954122
And the job for the failed acceleration test:
http://cautotest-prod/afe/#tab_id=view_job&object_id=184954238
,
Mar 22 2018
,
Mar 22 2018
Updating the summary to reflect the symptom better.
,
Mar 26 2018
,
Mar 27 2018
Since the test has been passing for a while, I assume it has fully recovered.
Comment 1 by sha...@chromium.org
, Mar 20 2018