At least some boards, some of the time, are regularly running 'cleanup' jobs in addition to 'reset' jobs. Here's a short history from an eve BVT pool DUT:

chromeos6-row3-rack11-host3
    2018-02-20 08:26:34 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row3-rack11-host3/151945-reset/
    2018-02-20 08:25:38 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row3-rack11-host3/151939-cleanup/
    2018-02-20 08:24:46 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178028843-chromeos-test/
    2018-02-20 08:24:19 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row3-rack11-host3/151934-reset/
    2018-02-20 08:23:23 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row3-rack11-host3/151926-cleanup/
    2018-02-20 08:22:13 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178028821-chromeos-test/

Same story, but for bob:

chromeos2-row8-rack11-host14
    2018-02-20 08:17:11 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row8-rack11-host14/216248-reset/
    2018-02-20 08:16:31 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row8-rack11-host14/216240-cleanup/
    2018-02-20 08:07:31 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178022107-chromeos-test/
    2018-02-20 08:07:03 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row8-rack11-host14/216170-reset/
    2018-02-20 08:06:26 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row8-rack11-host14/216161-cleanup/
    2018-02-20 07:57:43 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178022065-chromeos-test/

I'm not sure how widespread this symptom is, but I've seen it anecdotally in the past.
The fact that I was able to see it for two randomly selected DUTs makes me think that it's _very_ widespread. The cleanup jobs take extra time, and are likely seriously straining our DUT capacity, which will lead to tests being dropped or aborted.
Passing to this week's primary deputy.
The decision about whether to run a cleanup job falls to the scheduler, and is based on the return status from the test job. These days, I think lucifer is involved in that process.
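To make that decision concrete, here is a minimal sketch of a status-based cleanup policy. It is purely illustrative: the function name `needs_cleanup` and the rule "any nonzero return status triggers cleanup" are assumptions made for this example, not the actual autotest scheduler or lucifer code.

```python
def needs_cleanup(exit_status: int) -> bool:
    """Hypothetical policy: schedule a cleanup job only after an unclean exit.

    ASSUMPTION: a nonzero return status from the test job is what triggers
    cleanup. The real scheduler/lucifer rule may differ.
    """
    if not isinstance(exit_status, int):
        # Refuse to guess: a status of the wrong type (for example a string)
        # would otherwise compare unequal to 0 and always request cleanup.
        raise TypeError("exit_status must be an int, got %r" % (exit_status,))
    return exit_status != 0


print(needs_cleanup(0))  # clean exit: no cleanup job
print(needs_cleanup(1))  # failed exit: cleanup job
```

Under a policy like this, a DUT that runs cleanup after every test job would mean either that every test job is exiting with a failing status or that the status is being interpreted incorrectly.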
I did a spot check on hana and bob DUTs in the CQ pool. The story there is slightly different. Here's a sample for bob:

chromeos6-row4-rack13-host11
    2018-02-20 08:42:24 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/216453-cleanup/
    2018-02-20 08:41:24 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178098433-chromeos-test/
    2018-02-20 08:35:24 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/216383-provision/
    2018-02-20 05:25:37 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/214879-reset/
    2018-02-20 05:24:09 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178065861-chromeos-test/
    2018-02-20 05:23:38 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/214861-reset/
    2018-02-20 05:21:38 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178065845-chromeos-test/
    2018-02-20 05:21:04 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/214837-reset/
    2018-02-20 05:19:05 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178065827-chromeos-test/
    2018-02-20 05:18:32 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/214808-reset/
    2018-02-20 05:16:33 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178065819-chromeos-test/
    2018-02-20 05:16:02 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/214788-reset/
    2018-02-20 05:13:10 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178065811-chromeos-test/
    2018-02-20 05:05:49 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/214692-cleanup/
    2018-02-20 05:04:52 -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/178059999-chromeos-test/
    2018-02-20 04:58:42 OK http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row4-rack13-host11/214633-provision/

The hana DUTs I checked were similar. The key feature is the sequence of "provision", "cleanup", and then only "reset" afterward (not "cleanup and reset").
Passing to Allen for investigation as the new scheduler expert. ;>
Stupid type error. Fix: https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/927686
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/84fc40f03c54b064cab25a8b059ed9d997b6bac1

commit 84fc40f03c54b064cab25a8b059ed9d997b6bac1
Author: Allen Li <ayatane@chromium.org>
Date: Thu Feb 22 22:28:14 2018

    [autotest] Fix type error

    BUG= chromium:813811
    TEST=None

    Change-Id: I0cbd760c9a598be54b28ffb9b7e3abedbc4b961a
    Reviewed-on: https://chromium-review.googlesource.com/927686
    Commit-Ready: Allen Li <ayatane@chromium.org>
    Tested-by: Allen Li <ayatane@chromium.org>
    Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/84fc40f03c54b064cab25a8b059ed9d997b6bac1/venv/lucifer/handlers.py
I think this is fixed, but it still needs to be verified.
Comment 1 by jrbarnette@chromium.org, Feb 20 2018