(I thought there was an existing bug about this but couldn't find it, so I'm filing a new one.)
When a server is removed from server_db, sysmon on the master notices and stops actively setting the prod role metric for that server.
However, ts_mon continues to latch the previous gauge value until sysmon is next restarted, leaving a potentially very long window in which we report incorrect prod role information to Monarch. As a result, prod-role-based alerts are sent about servers that are no longer in production.
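To illustrate the latching behavior described above, here is a minimal sketch. `LatchingGauge` and `report_cycle` are hypothetical stand-ins written for this report; they are not the real ts_mon or sysmon code. The `clear_stale` option shows one possible mitigation and is an assumption, not necessarily the fix being worked on.

```python
class LatchingGauge:
    """Hypothetical stand-in for a gauge metric; NOT the real ts_mon API."""

    def __init__(self):
        # hostname -> last value that was set for that host
        self._values = {}

    def set(self, hostname, value):
        self._values[hostname] = value

    def remove(self, hostname):
        self._values.pop(hostname, None)

    def flush(self):
        """Return a copy of everything that would be reported this cycle."""
        return dict(self._values)


def report_cycle(gauge, server_db, clear_stale=False):
    """One reporting pass: set roles for current servers, then flush.

    server_db is a plain dict of hostname -> role (an assumption made
    for this sketch). Hosts missing from server_db are simply not set
    again, so their old values stay latched unless clear_stale is True.
    """
    for hostname, role in server_db.items():
        gauge.set(hostname, role)
    if clear_stale:
        # Assumed mitigation: drop values for hosts no longer in server_db.
        for hostname in set(gauge.flush()) - set(server_db):
            gauge.remove(hostname)
    return gauge.flush()


gauge = LatchingGauge()
server_db = {"host1": "drone", "host2": "shard"}
first = report_cycle(gauge, server_db)

# host2 is removed from server_db, but its value is still reported.
del server_db["host2"]
stale = report_cycle(gauge, server_db)

# With explicit cleanup, the removed host stops being reported.
cleaned = report_cycle(gauge, server_db, clear_stale=True)
```

In this sketch, `stale` still contains an entry for `host2` even though it is no longer in `server_db`, which mirrors the incorrect prod role data described in this report.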
-> phobbs, who discovered this behavior and is working on a fix
Comment 1 by pho...@chromium.org, Sep 26 2017
Status: Duplicate (was: Untriaged)