TimeTicks.FromQPCValue from base_unittests failed on Win10
Issue description

This is a try job for https://chromium-review.googlesource.com/c/chromium/src/+/762853, which doesn't touch base/.

https://build.chromium.org/p/tryserver.chromium.win/builders/win10_chromium_x64_rel_ng/builds/28367

[ RUN      ] TimeTicks.FromQPCValue
../../base/time/time_win_unittest.cc(292): error: The difference between expected_microseconds_since_origin and converted_microseconds_since_origin is 512, which exceeds 1.0, where
expected_microseconds_since_origin evaluates to 4.2934311416234112e+18,
converted_microseconds_since_origin evaluates to 4.2934311416234107e+18, and
1.0 evaluates to 1.
ticks=9223372036854775807, to be converted via logic path: SAFE
[  FAILED  ] TimeTicks.FromQPCValue (1 ms)
Nov 16 2017
Also, the failures generally are on the same input value. However, there was one flake where it was on a smaller value (but still using the SAFE path):

[ RUN      ] TimeTicks.FromQPCValue
../../base/time/time_win_unittest.cc(292): error: The difference between expected_microseconds_since_origin and converted_microseconds_since_origin is 1.00048828125, which exceeds 1.0, where
expected_microseconds_since_origin evaluates to 4293921788464.0005,
converted_microseconds_since_origin evaluates to 4293921788463, and
1.0 evaluates to 1.
ticks=9225306323878, to be converted via logic path: SAFE
Nov 21 2017
This is flaking on the tryservers as well: https://ci.chromium.org/buildbot/tryserver.chromium.win/win10_chromium_x64_rel_ng/31552 However, it only seems to have hit my CL (https://chromium-review.googlesource.com/781206/) and hasn't shown up in the last 200 builds. Downgrading to P2 from P1, per policy for flaky tests.
Dec 7 2017
This hit me as well. Logs look like those in the original report. Upgrading to P1 again, since it's been flaking quite a lot recently: https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyIQsSBUZsYWtlIhZUaW1lVGlja3MuRnJvbVFQQ1ZhbHVlDA
Dec 7 2017
I am wondering if there was a hardware or an OS change; this code has not been changed in over two years. +brucedawson for thoughts.
Dec 8 2017
This is a bug in EXPECT_NEAR. The main clue is that the difference is 512 - a suspiciously perfect power of two. At the range of those numbers it turns out that 512 is the smallest possible difference between two doubles. The two numbers are:

4293431141623411200.0
4293431141623410688.0

and there are no doubles between them. Their integer representations are:

4885784004140730392
4885784004140730393

At that range, if you do two different calculations that should give a similar result, then the results will either be identical or will differ by some multiple of 512.

So, the initial problem is a bug in EXPECT_NEAR. It should treat *adjacent* doubles as near, even if the absolute value of their difference is huge. I've blogged about this extensively: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

To reiterate, it makes no sense for EXPECT_NEAR to fail on two doubles that are literally as close to identical as possible without actually being identical. If QueryPerformanceFrequency is 4 GHz then it would take 34 years to get a time value that large. I'm tempted to retitle this bug to "TimeTicks.FromQPCValue fails if system up-time exceeds several decades".

It looks like the only external dependency in this code is the result of QueryPerformanceFrequency. This varies between machines, and there must be some values that cause the two results to fall on either side of a rounding cusp (halfway between the two answers) when handling std::numeric_limits<int64_t>::max().

I'll take this bug. I'll modify the EXPECT_NEAR statement to print the frequency so that I can reproduce the issue, then I'll fix EXPECT_NEAR.
Dec 8 2017
crrev.com/c/816053 should fix it - I'll tidy it up later.
Dec 8 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/6c2427457b0c5ebaefa5c1a6003117ca8126e7bc

commit 6c2427457b0c5ebaefa5c1a6003117ca8126e7bc
Author: Bruce Dawson <brucedawson@chromium.org>
Date: Fri Dec 08 21:58:50 2017

Fix epsilon calculation for large-double comparisons

My whole life has been leading up to this bug fix. TimeTicks.FromQPCValue was failing on some machines, and this turned out to be caused by a bad epsilon value when comparing two floating-point numbers. I have written some of the reference pages about the perils of floating-point comparisons so this bug really spoke to me:
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

The problem is that when testing with a QPC value of int64_t max and a QPC frequency of 2.149252 MHz the result is around 4.29e18, and at that magnitude the precision of a double is 512.0. The test was using an epsilon of 1.0 for its EXPECT_NEAR comparison, and this epsilon is meaningless for numbers of that magnitude - either the doubles will be identical or they will differ by a multiple of 512.0.

The real-life implications of this bug are that if you run Chrome on a machine with an uptime of 136 millennia, and you store TimeTicks::FromQPCValue in a double, then you should expect less than half a millisecond of precision. I guess I should update this article to warn about the risks of using double:
https://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/

The fix is to calculate the actual minimum epsilon at the magnitude of the numbers and use the maximum of that and 1.0 as the epsilon parameter to EXPECT_NEAR.

I have a tentative fix to DoubleNearPredFormat so that EXPECT_NEAR will fail if passed an epsilon value that is meaninglessly small - this change would have detected the bad test: https://github.com/google/googletest/issues/1350

Bug: 786046
Change-Id: I92ee56309a0cab754dee97e11651ae12547a348e
Reviewed-on: https://chromium-review.googlesource.com/816053
Commit-Queue: Bruce Dawson <brucedawson@chromium.org>
Reviewed-by: Yuri Wiitala <miu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#522894}
[modify] https://crrev.com/6c2427457b0c5ebaefa5c1a6003117ca8126e7bc/base/time/time_win_unittest.cc
Dec 8 2017
Fixed. And I'm refining a fix for https://github.com/google/googletest/issues/1350. The fix turns the failure we were seeing into a 100% reliable failure that says:

../../base/time/time_win_unittest.cc(292): error: The epsilon used to compare expected_microseconds_since_origin and converted_microseconds_since_origin is less than the minimum epsilon in this range, which is 512, where
expected_microseconds_since_origin evaluates to 3.640989196237484e+18,
converted_microseconds_since_origin evaluates to 3.640989196237484e+18, and
1.0 evaluates to 1.
ticks=9223372036854775807, to be converted via logic path: SAFE
May 29 2018
This (or a related issue) seems to be still failing. brucedawson: Can you take a look?

https://logs.chromium.org/v/?s=chromium%2Fbuildbucket%2Fcr-buildbucket.appspot.com%2F8945160291103429904%2F%2B%2Fsteps%2Fbase_unittests_on_Windows-10-15063%2F0%2Flogs%2FTimeTicks.FromQPCValue%2F0

[ RUN      ] TimeTicks.FromQPCValue
../../base/time/time_win_unittest.cc(301): error: The difference between expected_microseconds_since_origin and converted_microseconds_since_origin is 1.00048828125, which exceeds epsilon, where
expected_microseconds_since_origin evaluates to 4293358720222.0005,
converted_microseconds_since_origin evaluates to 4293358720221, and
epsilon evaluates to 1.
ticks=9223375311707, to be converted via logic path: SAFE
Stack trace:
Backtrace:
StackTraceGetter::CurrentStackTrace [0x00007FF6FDB84F20+80]
testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop [0x00007FF6FDB9DC0A+90]
testing::internal::AssertHelper::operator= [0x00007FF6FDB9D70A+90]
base::TimeTicks_FromQPCValue_Test::TestBody [0x00007FF6FD8F29FC+1980]
[  FAILED  ] TimeTicks.FromQPCValue (23 ms)
Jun 4 2018
Before I made my epsilon fix (crrev.com/c/816053) we were always passing an epsilon value of 1.0. Now we pass an epsilon value of 1.0 or greater. This failure is happening when the epsilon value is 1.0, so I see no reason why this test should be failing now if it was never failing before. I've confirmed that my change did nothing but edit comments and allow for a larger value of epsilon, so the test was always failing before - or at least always vulnerable to failing - with this particular set of input parameters, in particular with this particular QPC frequency (2148289).

And that makes sense. It is comparing the results of two calculations, one of which is a perfectly calculated integral value, and the other of which is a double-precision number computed with two operations, so it may not be perfectly rounded. The correct answer, given that ticks is 9223375311707 and ticks_per_second is 2148289, is 4293358720221.9999264530982563333 - according to the Windows calculator and Wolfram Alpha. But with intermediate rounding to double precision the answer is 4293358720222.00048828125 (verified with a tiny program). So, with the inputs that we are getting on this particular machine, this failure is expected given the current epsilon values, despite there being no bug in the conversion code being tested.

So... the integral result is correct (assuming it is supposed to truncate). The floating-point result is a tiny bit high and just happens to push over the cusp. It's not a bug, it's just a slightly too tight epsilon.

I don't have any intuition for what the correct and safe epsilon would be to avoid this, and I'm not sure it deserves a detailed analysis. A minimum epsilon of 1.001 would avoid this particular glitch. I'm going to try 1.002 to add an extra margin for error.

Test program source is here - compile without optimizations to ensure you get the correct results.
I also reproduced the bug by forcing the QPC frequency in the test and in base to 2148289.

#include "stdio.h"

double x = 9223375311707;
double y = 2148289;

int main() {
  double r = (x * 1000000) / y;
  printf("%.17f\n", r);
  return 0;
}
Jun 4 2018
Lowering priority now that I've confirmed that this is just a bug in the test, not a bug in the code being tested.
Jun 5 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/9aee77322d2c6087293c7791405153de19a64acd

commit 9aee77322d2c6087293c7791405153de19a64acd
Author: Bruce Dawson <brucedawson@chromium.org>
Date: Tue Jun 05 00:16:25 2018

Increase epsilon to avoid spurious test failures

TL;DR - these failures are from a bug/limitation in the test, not a bug in the code being tested. Also, I like it when my floating-point math expertise is actually useful.

TimeTicks.FromQPCValue uses double math to verify some integer calculations. Unfortunately, the double calculation involves a multiplication and a division - two roundings - which means that the result can end up being slightly more than 1.0 away from the correct result. I used a combination of Wolfram Alpha and a tiny C++ program to figure out the correct result and the double-rounded result, and to confirm that an error of 1.00048828125 is actually expected given the QPC frequency on this particular machine.

Increasing epsilon a tiny bit seems like the safest and simplest fix. The alternative would be to set epsilon to 1.0 plus the machine epsilon at the range of expected_microseconds_since_origin, but at some point that solution becomes too complex and error-prone to justify its existence. If failures happen with the slightly increased epsilon then we can consider something more sophisticated.

Bug: 786046
Change-Id: I8045128747d72fe5df434905afcf043139f3123c
Reviewed-on: https://chromium-review.googlesource.com/1085461
Commit-Queue: Yuri Wiitala <miu@chromium.org>
Reviewed-by: Yuri Wiitala <miu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#564309}
[modify] https://crrev.com/9aee77322d2c6087293c7791405153de19a64acd/base/time/time_win_unittest.cc
Jun 5 2018
Comment 1 by m...@chromium.org, Nov 16 2017