See go/system-unclean for a metric assessing what fraction of unclean browser shutdowns are due to Windows system crashes. It looks like the percentage is on the order of 55-60%.
As-is, we only do the assessment for users who have opted into Crumbs collection, because we use Crumbs to get a timestamp from the browser session lifetime. As this is a high-value metric, it makes sense to decouple it from Crumbs collection.
To that end, it's likely better to write the current time to Local State on every flush, for example (and perhaps make sure Local State is flushed every 30-60 minutes on quiescent browsers, if that's not already the case). During stability reporting, this time can then be used as the timestamp for the assessment.
This assessment is potentially expensive, as it needs to query Windows system logs.
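To make the proposal concrete, here is a rough sketch of the two pieces involved: deciding when the persisted timestamp is stale enough to force a flush, and matching that timestamp against system crash times. Everything named here (NeedsTimestampRefresh, WasLikelySystemCrash, the tolerance parameter) is hypothetical and not existing Chrome code. It also assumes the system crash times have already been extracted from the Windows system logs by some other step, and that a system crash shortly after the last recorded timestamp is what explains the unclean shutdown.

```cpp
#include <chrono>
#include <vector>

using Clock = std::chrono::system_clock;
using TimePoint = Clock::time_point;

// Hypothetical helper: returns true when the timestamp persisted in Local
// State is at least |max_staleness| old, i.e. a quiescent browser should
// force a flush (the 30-60 minute bound discussed above).
bool NeedsTimestampRefresh(TimePoint last_written,
                           TimePoint now,
                           std::chrono::minutes max_staleness) {
  return now - last_written >= max_staleness;
}

// Hypothetical helper: |system_crashes| is assumed to hold the times of
// unexpected system shutdowns, already read from the system logs.
// Returns true if any of them falls at or after |last_alive| and no more
// than |tolerance| later.
bool WasLikelySystemCrash(TimePoint last_alive,
                          const std::vector<TimePoint>& system_crashes,
                          std::chrono::minutes tolerance) {
  for (const TimePoint& crash : system_crashes) {
    if (crash >= last_alive && crash - last_alive <= tolerance)
      return true;
  }
  return false;
}
```

Under these assumptions the tolerance would naturally be tied to the maximum flush interval: if the timestamp is refreshed at least every 60 minutes, a system crash up to 60 minutes after the last recorded time is still consistent with the browser having been alive when the system went down.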
I'd like advice from uma-team on how best to report the metric. Options include:
- Compute this in the background and report in a histogram.
  This will only allow correlating and comparing aggregates, and will be
  subject to the usual sort of trouble with outlier reports, reporting lag,
  etc.
- Compute this as part of stability reporting and report as a histogram.
  I'm not sure whether this has any advantage over the above. I'd love to
  see an overview design doc for how UMA reporting works, specifically how
  stability reports differ from regular reports.
- Compute and report in an existing or new stability proto field.
  This will allow per-count comparison to UCS and filtering outliers.
Comment 1 by siggi@chromium.org, Jan 5 2018