
Issue 786046

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug




TimeTicks.FromQPCValue from base_unittests failed on Win10

Project Member Reported by thestig@chromium.org, Nov 16 2017

Issue description

This is a try job for https://chromium-review.googlesource.com/c/chromium/src/+/762853, which doesn't touch base/

https://build.chromium.org/p/tryserver.chromium.win/builders/win10_chromium_x64_rel_ng/builds/28367

[ RUN      ] TimeTicks.FromQPCValue
../../base/time/time_win_unittest.cc(292): error: The difference between expected_microseconds_since_origin and converted_microseconds_since_origin is 512, which exceeds 1.0, where
expected_microseconds_since_origin evaluates to 4.2934311416234112e+18,
converted_microseconds_since_origin evaluates to 4.2934311416234107e+18, and
1.0 evaluates to 1.
ticks=9223372036854775807, to be converted via logic path: SAFE
[  FAILED  ] TimeTicks.FromQPCValue (1 ms)

 

Comment 1 by m...@chromium.org, Nov 16 2017

Wow, that's very weird. I took a look at the test, and it seems to fail only about once a week (i.e., it succeeds over hundreds of runs). I also double-checked the code to make sure this wasn't due to uninitialized state in the library.

The only way it could be flaky is if Windows is returning different values for QueryPerformanceFrequency() in the same process. That would break the time library's assumption that it always returns the same value. Will have to investigate further...

Comment 2 by m...@chromium.org, Nov 16 2017

Also, the failures generally are on the same input value. However, there was one flake where it was on a smaller value (but still using the SAFE path):

[ RUN      ] TimeTicks.FromQPCValue
../../base/time/time_win_unittest.cc(292): error: The difference between expected_microseconds_since_origin and converted_microseconds_since_origin is 1.00048828125, which exceeds 1.0, where
expected_microseconds_since_origin evaluates to 4293921788464.0005,
converted_microseconds_since_origin evaluates to 4293921788463, and
1.0 evaluates to 1.
ticks=9225306323878, to be converted via logic path: SAFE


Comment 3 by kbr@chromium.org, Nov 21 2017

Components: Tests>Flaky
Labels: -Pri-3 Pri-2
This is flaking on the tryservers as well:
https://ci.chromium.org/buildbot/tryserver.chromium.win/win10_chromium_x64_rel_ng/31552

However it only seems to have hit my CL ( https://chromium-review.googlesource.com/781206/ ) and hasn't shown up in the last 200 builds. Upgrading to P2 rather than P1 for flaky tests.

Comment 4 by treib@chromium.org, Dec 7 2017

Cc: fmea...@chromium.org
Issue 789939 has been merged into this issue.

Comment 5 by treib@chromium.org, Dec 7 2017

Labels: -Pri-2 Pri-1
This hit me as well. Logs look like in the original report.

Upgrading to P1 again, since it's been flaking quite a lot recently: https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyIQsSBUZsYWtlIhZUaW1lVGlja3MuRnJvbVFQQ1ZhbHVlDA
Cc: brucedaw...@chromium.org
I am wondering if there was a hardware or an OS change; this code has not been changed in over 2 years.

+brucedawson for thoughts
Owner: brucedaw...@chromium.org
Status: Assigned (was: Untriaged)
This is a bug in EXPECT_NEAR.

The main clue is that the difference is 512 - a suspiciously perfect power of two. At the magnitude of those numbers, 512 turns out to be the smallest possible difference between two doubles. The two numbers are:
  4293431141623411200.0
  4293431141623410688.0
and there are no doubles between them. Their integer (bit) representations are:
  4885784004140730392
  4885784004140730393
At that range, if you do two different calculations that should give a similar result, then the results will either be identical or will differ by some multiple of 512.

So, the initial problem is a bug in EXPECT_NEAR. It should treat *adjacent* doubles as near, even if the absolute value of their difference is huge. I've blogged about this extensively:
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
To reiterate, it makes no sense for EXPECT_NEAR to fail on two doubles that are literally as close to identical as possible without actually being identical.

If QueryPerformanceFrequency is 4 GHz then it would take 34 years to get a time value that large. I'm tempted to retitle this bug to "TimeTicks.FromQPCValue fails if system up-time exceeds several decades".

It looks like the only external dependency in this code is the result of QueryPerformanceFrequency. This varies between machines, and there must be some values that cause the two results to fall on either side of a rounding cusp (halfway between the two answers) when handling std::numeric_limits<int64_t>::max().

I'll take this bug. I'll modify the EXPECT_NEAR statement to print frequency so that I can reproduce the issue, then I'll fix EXPECT_NEAR.

crrev.com/c/816053 should fix it - I'll tidy it up later.

Comment 9 by bugdroid1@chromium.org (Project Member), Dec 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6c2427457b0c5ebaefa5c1a6003117ca8126e7bc

commit 6c2427457b0c5ebaefa5c1a6003117ca8126e7bc
Author: Bruce Dawson <brucedawson@chromium.org>
Date: Fri Dec 08 21:58:50 2017

Fix epsilon calculation for large-double comparisons

My whole life has been leading up to this bug fix.

TimeTicks.FromQPCValue was failing on some machines and this turned out
to be caused by a bad epsilon value when comparing two floating-point
numbers. I have written some of the reference pages about the perils of
floating-point comparisons so this bug really spoke to me:
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

The problem is that when testing with a QPC value of int64_t max and
a QPC frequency of 2.149252 MHz the result is around 4.29e18, and at
that magnitude the precision of a double is 512.0. The test was using an
epsilon of 1.0 for its EXPECT_NEAR comparison and this epsilon is
meaningless for numbers of that magnitude - either the doubles will be
identical or they will differ by a multiple of 512.0.

The real-life implications of this bug are that if you run Chrome on a
machine with an uptime of 136 millennia then if you store
TimeTicks::FromQPCValue in a double you should expect less than half a
millisecond of precision. I guess I should update this article to warn
about the risks of using double:
https://randomascii.wordpress.com/2012/02/13/dont-store-that-in-a-float/

The fix is to calculate the actual minimum epsilon at the magnitude of
the numbers and use the maximum of that and 1.0 as the epsilon parameter
to EXPECT_NEAR.

I have a tentative fix to DoubleNearPredFormat so that EXPECT_NEAR will
fail if passed an epsilon value that is meaninglessly small - this change
would have detected the bad test:
https://github.com/google/googletest/issues/1350

Bug:  786046 
Change-Id: I92ee56309a0cab754dee97e11651ae12547a348e
Reviewed-on: https://chromium-review.googlesource.com/816053
Commit-Queue: Bruce Dawson <brucedawson@chromium.org>
Reviewed-by: Yuri Wiitala <miu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#522894}
[modify] https://crrev.com/6c2427457b0c5ebaefa5c1a6003117ca8126e7bc/base/time/time_win_unittest.cc

Status: Fixed (was: Assigned)
Fixed.

And I'm refining a fix for https://github.com/google/googletest/issues/1350. The fix turns the failure we were seeing into a 100% reliable failure that says:

../../base/time/time_win_unittest.cc(292): error: The epsilon used to compare expected_microseconds_since_origin and converted_microseconds_since_origin is less than the minimum epsilon in this range, which is 512, where
expected_microseconds_since_origin evaluates to 3.640989196237484e+18,
converted_microseconds_since_origin evaluates to 3.640989196237484e+18, and
1.0 evaluates to 1.
ticks=9223372036854775807, to be converted via logic path: SAFE

Status: Available (was: Fixed)
This (or a related issue) seems to be still failing.

brucedawson: Can you take a look?

https://logs.chromium.org/v/?s=chromium%2Fbuildbucket%2Fcr-buildbucket.appspot.com%2F8945160291103429904%2F%2B%2Fsteps%2Fbase_unittests_on_Windows-10-15063%2F0%2Flogs%2FTimeTicks.FromQPCValue%2F0

[ RUN      ] TimeTicks.FromQPCValue
../../base/time/time_win_unittest.cc(301): error: The difference between expected_microseconds_since_origin and converted_microseconds_since_origin is 1.00048828125, which exceeds epsilon, where
expected_microseconds_since_origin evaluates to 4293358720222.0005,
converted_microseconds_since_origin evaluates to 4293358720221, and
epsilon evaluates to 1.
ticks=9223375311707, to be converted via logic path: SAFE
Stack trace:
Backtrace:
	StackTraceGetter::CurrentStackTrace [0x00007FF6FDB84F20+80]
	testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop [0x00007FF6FDB9DC0A+90]
	testing::internal::AssertHelper::operator= [0x00007FF6FDB9D70A+90]
	base::TimeTicks_FromQPCValue_Test::TestBody [0x00007FF6FD8F29FC+1980]
[  FAILED  ] TimeTicks.FromQPCValue (23 ms)
Before I made my epsilon fix (crrev.com/c/816053) we were always passing an epsilon value of 1.0. Now we pass an epsilon value of 1.0 or greater. This failure is happening when the epsilon value is 1.0, so I see no reason why this test should be failing now if it was never failing before. I've confirmed that my change did nothing but edit comments and allow for a larger value of epsilon; therefore the test was always failing before - with this particular set of input parameters, in particular with this particular QPC frequency (2148289).

I will assume that it was failing before, or at least that it was always vulnerable to failing. And that makes sense: it is comparing the results of two calculations, one of which is a perfectly calculated integral value, and the other a double-precision calculation with two operations, so it may not be perfectly rounded.

The correct answer, given that ticks is 9223375311707 and ticks_per_second is 2148289, is 4293358720221.9999264530982563333 - according to the Windows calculator and Wolfram Alpha. But with intermediate rounding to double precision the answer is 4293358720222.00048828125 (verified with a tiny program). So, with the inputs that we are getting on this particular machine, this failure is expected given the current epsilon values, despite there being no bug in the conversion code being tested.

So...

The integral result is correct (assuming it is supposed to truncate). The floating-point result is a tiny bit high and just happens to push over the cusp. So it's not a bug, it's just a slightly too tight epsilon. I don't have any intuition for what the correct and safe epsilon would be to avoid this, and I'm not sure it deserves a detailed analysis. A minimum epsilon of 1.001 would avoid this particular glitch. I'm going to try 1.002 to add an extra margin for error.

Test program source is here - compile without optimizations to ensure you get the correct results. I also reproduced the bug by forcing the QPC frequency in the test and in base to 2148289.

#include <stdio.h>

double x = 9223375311707;
double y = 2148289;

int main()
{
  double r = (x * 1000000) / y;
  printf("%.17f\n", r);
  return 0;
}

Labels: -Pri-1 Pri-3
Status: Started (was: Available)
Lowering priority now that I've confirmed that this is just a bug in the test, not a bug in the code being tested.

Comment 14 by bugdroid1@chromium.org (Project Member), Jun 5 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9aee77322d2c6087293c7791405153de19a64acd

commit 9aee77322d2c6087293c7791405153de19a64acd
Author: Bruce Dawson <brucedawson@chromium.org>
Date: Tue Jun 05 00:16:25 2018

Increase epsilon to avoid spurious test failures

TL;DR - these failures are from a bug/limitation in the test, not a bug
in the code being tested.

Also, I like it when my floating-point math expertise is actually
useful.

TimeTicks.FromQPCValue uses double math to verify some integer
calculations. Unfortunately, the double calculation involves a
multiplication and a division - two roundings - which means that the
result can end up being slightly more than 1.0 away from the correct
results. I used a combination of Wolfram Alpha and a tiny C++ program
to figure out the correct result and the double-rounded result and
confirm that an error of 1.00048828125 is actually expected given the
QPC frequency on this particular machine.

Increasing epsilon a tiny bit seems like the safest and simplest fix.
The alternative would be to set epsilon to 1.0 plus the machine epsilon
at the range of expected_microseconds_since_origin but at some point
that solution becomes too complex and error prone to justify its
existence. If failures happen with the slightly increased epsilon then
we can consider something more sophisticated.

Bug:  786046 
Change-Id: I8045128747d72fe5df434905afcf043139f3123c
Reviewed-on: https://chromium-review.googlesource.com/1085461
Commit-Queue: Yuri Wiitala <miu@chromium.org>
Reviewed-by: Yuri Wiitala <miu@chromium.org>
Cr-Commit-Position: refs/heads/master@{#564309}
[modify] https://crrev.com/9aee77322d2c6087293c7791405153de19a64acd/base/time/time_win_unittest.cc

Status: Fixed (was: Started)
