The Chrome OS commit queue has been failing almost every run since build #17269: https://luci-milo.appspot.com/buildbot/chromeos/master-paladin/17269
There were a few failures just before this, but I think they were a different problem. Since then, nearly all of the child builders die with "aborted by self-destruction".
I don't see any meaningful errors on the child builders. For example, at https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/4833, the "HWTest [provision]" stage passed, but "HWTest [bvt-arc]" failed after 30 minutes. The job's logs are extremely short and just end with this:
12-22-2017 [08:41:23] Created suite job: http://cautotest-prod.corp.google.com/afe/#tab_id=view_job&object_id=164561876
--create_and_return was specified, terminating now.
Will return from run_suite with status: OK
08:41:23: INFO: Re-run swarming_cmd to avoid buildbot salency check.
08:41:23: INFO: RunCommand: /b/c/cbuild/repository/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /tmp/cbuildbot-tmpZMqkiO/tmp2c_S5h/temp_summary.json --raw-cmd --task-name cyan-paladin/R65-10239.0.0-rc2-bvt-arc --dimension os Ubuntu-14.04 --dimension pool default --print-status-updates --timeout 9000 --io-timeout 9000 --hard-timeout 9000 --expiration 1200 '--tags=priority:CQ' '--tags=suite:bvt-arc' '--tags=build:cyan-paladin/R65-10239.0.0-rc2' '--tags=task_name:cyan-paladin/R65-10239.0.0-rc2-bvt-arc' '--tags=board:cyan' -- /usr/local/autotest/site_utils/run_suite.py --build cyan-paladin/R65-10239.0.0-rc2 --board cyan --suite_name suite_attr_wrapper --pool cq --file_bugs False --priority CQ --timeout_mins 90 --retry True --max_retries 5 --minimum_duts 4 --offload_failures_only False --suite_args "{'attr_filter': u'(suite:bvt-arc) and (subsystem:default)'}" --job_keyvals "{'cidb_build_stage_id': 65928655L, 'cidb_build_id': 2153542, 'datastore_parent_key': ('Build', 2153542, 'BuildStage', 65928655L)}" --test_args "{'fast': 'True'}" -m 164561876
The job is at http://cautotest-prod.corp.google.com/afe/#tab_id=view_job&object_id=164561876, but it shows a status of "1 Completed", and I don't see any obvious problems there either.
The CQ has been passing occasionally since these failures started, e.g. https://luci-milo.appspot.com/buildbot/chromeos/master-paladin/17301, but when I look at those runs, there are usually still a bunch of "aborted by self-destruction" messages from the paladin builders. It's not clear to me why these runs have different outcomes from the others.
Comment 1 by jcliang@chromium.org, Dec 25 2017