I removed board:falco from cros-full-0004.mtv for issue 880913:
pprabhu@pprabhu:chromiumos$ atest shard remove_board -l board:falco cros-full-0004.mtv.corp.google.com
None
The DUTs were taking a long time to move to the master. It turns out the shard is aborting old HQEs for each host: HQEs from months ago that belong to jobs that have already completed.
e.g.:
chromeos-test@cros-full-0004:~$ tail /usr/local/autotest/logs/shard_client.latest
09/06 12:35:33.557 INFO | models:0759| Deleting and aborting hqe chromeos6-row1-rack9-host5/183451434 (183888007)...
09/06 12:35:33.559 INFO | models:0762| ... done with hqe chromeos6-row1-rack9-host5/183451434 (183888007).
09/06 12:35:33.559 INFO | models:0759| Deleting and aborting hqe chromeos6-row1-rack9-host5/183451439 (183888012)...
09/06 12:35:33.560 INFO | models:0762| ... done with hqe chromeos6-row1-rack9-host5/183451439 (183888012).
09/06 12:35:33.560 INFO | models:0759| Deleting and aborting hqe chromeos6-row1-rack9-host5/183451459 (183888032)...
09/06 12:35:33.561 INFO | models:0762| ... done with hqe chromeos6-row1-rack9-host5/183451459 (183888032).
09/06 12:35:33.562 INFO | models:0759| Deleting and aborting hqe chromeos6-row1-rack9-host5/183451463 (183888036)...
09/06 12:35:33.563 INFO | models:0762| ... done with hqe chromeos6-row1-rack9-host5/183451463 (183888036).
09/06 12:35:33.563 INFO | models:0759| Deleting and aborting hqe chromeos6-row1-rack9-host5/183451483 (183888056)...
09/06 12:35:33.564 INFO | models:0762| ... done with hqe chromeos6-row1-rack9-host5/183451483 (183888056).
Looking at one of the jobs: http://cros-full-0004.mtv.corp.google.com/afe/#tab_id=view_job&object_id=183451434
This is a completed job from March.
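To get a feel for how many old HQEs get swept up per host, the log lines above could be tallied with something like the following. This is only a rough sketch and not part of autotest. The function name `aborted_jobs_by_host` is made up for illustration. I'm assuming the log format is `hostname/job_id (hqe_id)`: the first number matches the job ID in the AFE URL above, and reading the parenthesized number as the HQE ID is a guess.

```python
import re
from collections import defaultdict

# Matches shard_client log lines such as:
#   ... Deleting and aborting hqe chromeos6-row1-rack9-host5/183451434 (183888007)...
# Assumed layout: <hostname>/<job id> (<hqe id>).
ABORT_RE = re.compile(r"Deleting and aborting hqe ([^\s/]+)/(\d+) \((\d+)\)")


def aborted_jobs_by_host(lines):
    """Return {hostname: [job_id, ...]} for every 'Deleting and aborting' line.

    Lines that do not match (for example the '... done with hqe' lines)
    are ignored.
    """
    jobs = defaultdict(list)
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            host, job_id, _hqe_id = match.groups()
            jobs[host].append(int(job_id))
    return dict(jobs)
```

Passing it the lines of shard_client.latest (for example from an open file object) gives a per-host list of job IDs, which can then be spot-checked in the AFE to see whether they are all long-completed jobs like the one above.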
I don't know the implications of incorrectly aborting these HQEs. Likely none, since we haven't seen the world fall apart every time we moved a board like this.
Comment 1 by pprabhu@chromium.org, Sep 6