Add metric tracking the number of group alerts in SOM
Issue description
The perf benchmarking team has spent a lot of effort stabilizing our waterfall and making our sheriffs more effective. It would be nice to see how this is going by tracking the number of group alerts in SOM. There are two group alert metrics we care about:
1) Number of group alerts for "consistent failures"
2) Number of group alerts for "new failures"
Metric (1) is the most important to us.
Feb 5 2018
The following revision refers to this bug:
https://chromium.googlesource.com/infra/infra/+/71ed7ee94abe422575253d7462e60fdce21a45a9

commit 71ed7ee94abe422575253d7462e60fdce21a45a9
Author: Sean McCullough <seanmccullough@chromium.org>
Date: Mon Feb 05 23:59:13 2018

[som] Add monitoring metrics for alert *groups*

Also split alerts by category (new vs. consistent failures)

Bug: 807344
Change-Id: Ic3e19c6fc470274c42bb3dab274259544805d1ea
Reviewed-on: https://chromium-review.googlesource.com/896166
Commit-Queue: Sean McCullough <seanmccullough@chromium.org>
Reviewed-by: Tiffany Zhang <zhangtiff@chromium.org>

[modify] https://crrev.com/71ed7ee94abe422575253d7462e60fdce21a45a9/go/src/infra/appengine/sheriff-o-matic/som/handler/analyze.go
[modify] https://crrev.com/71ed7ee94abe422575253d7462e60fdce21a45a9/go/src/infra/appengine/sheriff-o-matic/som/handler/analyze_test.go
Feb 22 2018
Hi Sean, since the change has landed, can we view these metrics in a graph now?
Feb 22 2018
We don't have a viceroy graph for it yet but you can see it in pcon: http://shortn/_X4CzHqBJaZ
Feb 22 2018
Awesome, thanks Sean!
Feb 23 2018
Hey Sean, thanks for working on this! Does this graph measure the sum of new and consistent failures?
Feb 26 2018
re: #6 It does if you further break the metric down by "category": http://shortn/_YBwV1Ln4Gv
Feb 27 2018
Great! Sounds like http://shortn/_2kzn6u2l88 is probably the graph we want, then. Thanks so much for this, Sean!
Feb 27 2018
It seems like this data is deleted after ~5d. Is there any way to get it retained for much longer (6 months to a year)? We were hoping to use this to verify improvement over long time horizons for our team.
Comment 1 by seanmccullough@chromium.org, Jan 31 2018
Status: Available (was: Untriaged)