ShardRemoveFromProductionMasterTask did not mark servers as repair_required in server_db
Issue description

A bunch of shards were retired as part of issue 753890. But these servers were left behind in server_db as primary shards, causing push-to-prod to fail. (This would also cause our metrics dashboards to show incorrect graphs, since they will be considered to be in prod. I'd even expect shard down alerts to start firing.)

Details from https://bugs.chromium.org/p/chromium/issues/detail?id=753890#c20

pprabhu@pprabhu:files$ atest server list chromeos-server42.cbf.corp.google.com
Hostname     : chromeos-server42.cbf.corp.google.com
Status       : primary
Roles        : shard
Attributes   : {}
Date Created : 2016-04-18 11:04:57
Date Modified: 2016-04-18 11:04:57
Note         : None

pprabhu@pprabhu:files$ atest server list chromeos-server43.cbf.corp.google.com
Hostname     : chromeos-server43.cbf.corp.google.com
Status       : primary
Roles        : shard
Attributes   : {}
Date Created : 2016-04-18 13:13:20
Date Modified: 2016-04-18 13:13:20
Note         : None

pprabhu@pprabhu:files$ atest server list chromeos-server44.cbf.corp.google.com
Hostname     : chromeos-server44.cbf.corp.google.com
Status       : primary
Roles        : shard
Attributes   : {}
Date Created : 2016-04-20 15:30:31
Date Modified: 2016-04-20 15:30:31
Note         : None

pprabhu@pprabhu:files$ atest server list chromeos-server45.cbf.corp.google.com
Hostname     : chromeos-server45.cbf.corp.google.com
Status       : primary
Roles        : shard
Attributes   : {}
Date Created : 2016-04-26 11:16:28
Date Modified: 2016-04-26 11:16:28
Note         : None

These should not have been in server_db.
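For anyone repeating this check on other retired shards, here is a rough sketch of how the `atest server list` output pasted above could be scanned for servers still listed as primary shards. It only parses text in the "Key : value" layout shown in this report. The helper names (`parse_server_record`, `still_listed_as_primary_shard`) are made up for illustration and are not part of autotest, and treating Roles as a comma-separated list is an assumption.

```python
# Sketch only: parses text shaped like the `atest server list` output quoted
# in this report. Nothing here calls autotest or server_db directly.

SAMPLE_OUTPUT = """\
Hostname     : chromeos-server42.cbf.corp.google.com
Status       : primary
Roles        : shard
Attributes   : {}
Date Created : 2016-04-18 11:04:57
Date Modified: 2016-04-18 11:04:57
Note         : None
"""


def parse_server_record(text):
    """Turn one block of 'Key : value' lines into a dict (hypothetical helper)."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps keep their own colons.
        key, sep, value = line.partition(':')
        if sep:
            record[key.strip()] = value.strip()
    return record


def still_listed_as_primary_shard(record):
    """True if the record still looks like an in-production shard.

    Assumption: multiple roles, if present, are separated by commas.
    """
    roles = [r.strip() for r in record.get('Roles', '').split(',')]
    return record.get('Status') == 'primary' and 'shard' in roles


if __name__ == '__main__':
    record = parse_server_record(SAMPLE_OUTPUT)
    if still_listed_as_primary_shard(record):
        print('%s is still a primary shard in server_db' % record['Hostname'])
```

A retired shard that the task handled correctly would show Status repair_required instead, so it would not be flagged by this check.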
Aug 21 2017
The ShardRemove... task will mark the server as repair_required. These servers must have been removed manually, so server_db was not updated.
Aug 21 2017
In this case, they were still marked primary. I had to manually remove them because they broke push-to-prod.
Aug 21 2017
Aug 21 2017
These servers were not removed by the task. By contrast, chromeos-server72.cbf was removed by the task, so it is marked as 'repair_required'. There is no action needed on the task itself here.
Comment 1 by pprabhu@chromium.org, Aug 21 2017
Status: Assigned (was: Untriaged)