Measuring error rate of memory-infra.
Issue description

I recall that hjd added the following metrics:

Memory.Experimental.Debug.FailedProcessDumpsPerGlobalDump
Memory.Experimental.Debug.GlobalDumpDuration
Memory.Experimental.Debug.GlobalDumpQueueLength

Since I ran into a memory-infra error, I figured I'd take a look at these metrics to see whether they caught the problem:
https://bugs.chromium.org/p/chromium/issues/detail?id=771805

It turns out that only Memory.Experimental.Debug.GlobalDumpQueueLength is emitted. The other metrics are not emitted if the memory-infra process is hung.

Looking at this metric on beta, ~1% of the time the queue length is >0:
https://uma.googleplex.com/p/chrome/histograms/?endDate=20171003&dayCount=1&histograms=Memory.Experimental.Debug.GlobalDumpQueueLength&fixupData=true&showMax=true&filters=platform%2Ceq%2CW%2Cchannel%2Ceq%2C3%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial

Given that there are currently only 2 users of queue length [tracing, memory-uma], it seems likely that every instance of queue length > 0 indicates that something has gone wrong. Since this stat is emitted every time the queue length goes from 0 to 1, the number of 1-length queues indicates the prevalence of the problem: ~0.1%.
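
For anyone who wants to reproduce the percentages from exported histogram data, here is a minimal sketch of the arithmetic. It assumes the histogram is available as a mapping from queue length to sample count; the function name `queue_length_rates` and the counts below are hypothetical and are not real UMA data.

```python
from typing import Mapping, Tuple


def queue_length_rates(bucket_counts: Mapping[int, int]) -> Tuple[float, float]:
    """Return (fraction of samples with length > 0, fraction with length == 1).

    bucket_counts maps a queue length to the number of samples recorded
    with that length (an assumed export format, not a real UMA API).
    """
    total = sum(bucket_counts.values())
    if total == 0:
        return 0.0, 0.0
    nonzero = sum(count for length, count in bucket_counts.items() if length > 0)
    exactly_one = bucket_counts.get(1, 0)
    return nonzero / total, exactly_one / total


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    counts = {0: 9900, 1: 80, 2: 20}
    nonzero_rate, one_rate = queue_length_rates(counts)
    print(f"length > 0: {nonzero_rate:.2%}, length == 1: {one_rate:.2%}")
```

The first value corresponds to "queue length > 0" and the second to the share of 1-length queues discussed above.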