WorkerThread.Task.Time is Overflowing |
||||
Issue descriptionThe WorkerThread.Task.Time sparse histogram is being incremented so frequently that it sometimes overflows the 31-bit sample count and rolls negative. Desktop does uploads about every 30 minutes (and resets the counters) so whatever events these are, it's happening on the order of 1M times per second on some occasions. https://uma.googleplex.com/p/chrome/histograms/?endDate=20180205&dayCount=1&histograms=WorkerThread.Task.Time&fixupData=true&showMax=true&filters=channel%2Ceq%2C3%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial
,
Feb 6 2018
One approach might be to std::max(time_in_microseconds, std::numeric_limits<base::HistogramBase::Sample>::max()); another might be to replace the histogram with one whose units are less likely to overflow (but this would lose precision). Assigning to nhiroki@, the owner of this histogram.
,
Feb 7 2018
From comment #0 it sounds like the bug isn't that the recorded value is overflowing, it's that the histogram is being logged more times than we can handle? bcwhite: I wonder if this is really P3, is overflowing benign? Does it have side-effects, indicate perf issues, or mean we can't trust the data? The metric was added in 2016 in https://bugs.chromium.org/p/chromium/issues/detail?id=616721#c6, it seems maybe we can just delete the metric if we're not using it anymore.
,
Feb 7 2018
Limiting the add amount won't really help because it's the total adds between reporting intervals that causes the overflow. In other instances where Bytes were being counted, we changed them to KiB with probabilistic rounding so that small increments still got counted. I'm in the process of making that reusable; I could add an "AddKilo" that does the same thing. You still lose precision but accuracy shouldn't change significantly.
,
Feb 7 2018
The oveflow is benign except that it gets reported (work for me) and causes 4B differences in counts on your own histograms. So really, it mostly impacts you and those reading your histograms.
,
Feb 7 2018
I've just landed https://chromium-review.googlesource.com/c/chromium/src/+/905148 Which adds AddScaled(), AddKilo(), and AddKiB() methods in the base histogram class to increment the count by a scaled amount. Precision is lost but accuracy is maintained. You can't just change the Add* method used, though; the name of the histogram needs to be changed (add a "K" suffix?) to avoid confusion.
,
Feb 9 2018
We will probably delete this as it's not currently being used and seems like it can give misleading data due to the overflow.
,
Feb 9 2018
I'll go ahead and remove it.
,
Feb 13 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0335b2141d9938dfb8417f600165fcc9b2addf17 commit 0335b2141d9938dfb8417f600165fcc9b2addf17 Author: Matt Falkenhagen <falken@chromium.org> Date: Tue Feb 13 06:09:00 2018 Remove WorkerThread.Task.Time histogram. This is overflowing causing confusion. Since we don't use it we can delete it. Also delete the related histogram WorkerThread.DebuggerTask.Time. Bug: 616721 , 809672 Change-Id: I33895fdf033ab2c07ccecfb05c75085a45569011 Reviewed-on: https://chromium-review.googlesource.com/911229 Reviewed-by: Kinuko Yasuda <kinuko@chromium.org> Reviewed-by: Ilya Sherman <isherman@chromium.org> Reviewed-by: Hiroki Nakagawa <nhiroki@chromium.org> Commit-Queue: Matt Falkenhagen <falken@chromium.org> Cr-Commit-Position: refs/heads/master@{#536264} [modify] https://crrev.com/0335b2141d9938dfb8417f600165fcc9b2addf17/third_party/WebKit/Source/core/workers/WorkerThread.cpp [modify] https://crrev.com/0335b2141d9938dfb8417f600165fcc9b2addf17/third_party/WebKit/Source/platform/scheduler/child/worker_scheduler_impl.cc [modify] https://crrev.com/0335b2141d9938dfb8417f600165fcc9b2addf17/tools/metrics/histograms/histograms.xml
,
Feb 14 2018
bcwhite: thanks for reporting and adding the new APIs. |
||||
►
Sign in to add a comment |
||||
Comment 1 by bcwh...@chromium.org
, Feb 6 2018