puppet dashboard not showing failing resources
Issue description
https://viceroy.corp.google.com/chromeos/puppet
See issue 817645 for context. I hit this problem a few times when trying to deploy the new SSH keys. I expected to see failing puppet runs on this dashboard. In this case, it was critical that all devservers receive the new public keys before rotating keys, but the dashboard didn't help me determine that. What gives?
Mar 8 2018
Yep, the ask here is for a dashboard that would let us detect when we start failing more resources. An easier partial solution would be a dashboard that shows a sudden spike in the number of failed resources. This happens when a very basic step fails (say, in my case, something in profiles/base), because it causes a lot of dependent packages to be skipped. So, if we tracked the total number of packages failed + skipped, we'd see a sudden spike. But there are also failure modes where an important but terminal package fails. Detecting this would require us to clearly flag _any_ failed packages if they start failing consistently. So:
(1) A dashboard for # skipped + failed packages changing suddenly.
(2) A dashboard for # packages failing consistently (i.e., no pass in ~4 hours?)
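For illustration only, the two checks proposed in this comment could be sketched roughly as follows. This is a minimal sketch, not the dashboard's actual implementation: the `ResourceEvent` schema, the function names, and the thresholds (`factor`, `min_jump`, the 4-hour `window`) are all assumptions made up for the example, and it does not use any real Puppet or Viceroy API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass(frozen=True)
class ResourceEvent:
    """One resource outcome from one puppet run (hypothetical schema)."""
    host: str
    resource: str
    when: datetime  # start time of the run that produced this outcome
    status: str     # "ok", "failed", or "skipped"


def bad_counts_per_run(events):
    """Map (host, run time) -> number of failed + skipped resources."""
    counts = {}
    for e in events:
        key = (e.host, e.when)
        counts.setdefault(key, 0)
        if e.status in ("failed", "skipped"):
            counts[key] += 1
    return counts


def sudden_spikes(events, factor=3.0, min_jump=5):
    """Check (1): runs whose failed + skipped count jumps well above
    the median of the same host's earlier runs. Thresholds are guesses."""
    by_host = {}
    for (host, when), n in sorted(bad_counts_per_run(events).items()):
        by_host.setdefault(host, []).append((when, n))
    spikes = []
    for host, runs in by_host.items():
        for i in range(1, len(runs)):
            baseline = median(n for _, n in runs[:i])
            when, n = runs[i]
            if n - baseline >= min_jump and n > factor * baseline:
                spikes.append((host, when, n))
    return spikes


def consistently_failing(events, now, window=timedelta(hours=4)):
    """Check (2): (host, resource) pairs that failed at least once in
    the window and did not pass at all in it."""
    recent = [e for e in events if now - window <= e.when <= now]
    failed = {(e.host, e.resource) for e in recent if e.status == "failed"}
    passed = {(e.host, e.resource) for e in recent if e.status == "ok"}
    return sorted(failed - passed)
```

Check (1) catches the profiles/base style of failure, where one early failure causes many skips. Check (2) catches a single terminal resource that keeps failing without changing the totals much.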
Mar 9 2018
Mar 9 2018
Jan 10
Downgrading P2s that haven't been modified in more than 6 months, which have no component or owner.
Comment 1 by ayatane@chromium.org, Mar 8 2018