Monarch for builder-level performance
Issue description
We need to capture a basic level of visibility into overall builder performance and failure rates, to inform both the CI oncall rotation and SRE support when needed. This item covers refreshing our ChromeOS Monarch presence, using the right data sources (LUCI instead of CBuildBot), and ensuring that our basic Viceroy dashboard is adjusted accordingly. We should also audit which alerts are in place and refresh them to align with oncall needs. We will not, at this time, migrate to a new Monarch data silo. Instead, we should expect to get most of the information we need from LUCI once we adopt its domain model. This item is blocked on exporting our real builder names to LUCI.
Dec 6
The estimate assumes that the information is available via LUCI metric collections. Once it is available, we'll be able to provide quick views of overall pool health as well as proactive monitoring for individual bots. This bug will track the validation, dashboard, and monitoring efforts.
Dec 6
Comment 1 by jclinton@chromium.org, Nov 13