Issue 692219 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux, Android
Pri: 1
Type: Bug


Hotlists containing this issue:
Gamepad


"GamepadProviderTest.ConnectDisconnectMultiple" is flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, Feb 14 2017

Issue description

"GamepadProviderTest.ConnectDisconnectMultiple" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 4 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOAsSBUZsYWtlIi1HYW1lcGFkUHJvdmlkZXJUZXN0LkNvbm5lY3REaXNjb25uZWN0TXVsdGlwbGUM.

Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
 

Comment 1 by xlai@chromium.org, Feb 14 2017

Components: -Tests>Flaky IO>Gamepad
Labels: -Sheriff-Chromium OS-Android
Owner: aelias@chromium.org
Status: Assigned (was: Untriaged)
aelias@: Your recent changes to GamepadProviderTest.ConnectDisconnectMultiple coincide with the time this test became flaky. Could you please investigate?

Comment 2 by aelias@chromium.org, Feb 14 2017

Flake log:

I   21.889s run_tests_on_device(0cbc7389032fddae)  [ RUN      ] GamepadProviderTest.ConnectDisconnectMultiple
I   21.889s run_tests_on_device(0cbc7389032fddae)  ../../device/gamepad/gamepad_provider_unittest.cc:165: Failure
I   21.889s run_tests_on_device(0cbc7389032fddae)  Value of: output.items[0].axesLength
I   21.890s run_tests_on_device(0cbc7389032fddae)    Actual: 2
I   22.642s run_tests_on_device(0cbc67090c371bb4)  >>ScopedMainEntryLogger
I   22.642s run_tests_on_device(0cbc7389032fddae)  Expected: 0u

Here's another example of a failure on Android Tests:
https://uberchromegw.corp.google.com/i/chromium.linux/builders/Android%20Tests%20%28dbg%29/builds/40441

It looks like this test has been really flaky on other platforms as well, but it's only failed all three tries on Android Tests and Android Tests (dbg):
https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=device_unittests&tests=GamepadProviderTest.ConnectDisconnectMultiple

Comment 4 by bugdroid1@chromium.org, Mar 2 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d9e73bdaff5e85473029bd1dc24a03926ba8138c

commit d9e73bdaff5e85473029bd1dc24a03926ba8138c
Author: aelias <aelias@chromium.org>
Date: Thu Mar 02 20:06:11 2017

Disable GamepadProviderTest.ConnectDisconnectMultiple.

This test is flaky on all platforms.  Disable pending investigation.

TBR=bajones
NOTRY=true
BUG= 692219 

Review-Url: https://codereview.chromium.org/2726843005
Cr-Commit-Position: refs/heads/master@{#454357}

[modify] https://crrev.com/d9e73bdaff5e85473029bd1dc24a03926ba8138c/device/gamepad/gamepad_provider_unittest.cc

Comment 5 by chromium...@appspot.gserviceaccount.com, Mar 3 2017

Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "GamepadProviderTest.ConnectDisconnectMultiple". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyOAsSBUZsYWtlIi1HYW1lcGFkUHJvdmlkZXJUZXN0LkNvbm5lY3REaXNjb25uZWN0TXVsdGlwbGUM. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Labels: -Sheriff-Chromium

Comment 7 by aelias@chromium.org, Mar 24 2017

Cc: bajones@chromium.org
No luck reproducing this locally on a recent build with either Nexus 6P/ARM/Release or Ubuntu/Release. I suspect there's a longstanding flake cause dancing around all the gamepad tests. The latest variant seems to be http://crbug.com/702712; I'll see if I can repro that one.
Cc: aelias@chromium.org
Owner: mattreynolds@chromium.org
Status: Started (was: Assigned)
Labels: OS-Linux
I saw one failure in 1000 runs on Linux (debug build, tip of tree). I built using the gn args aelias@ listed in https://bugs.chromium.org/p/chromium/issues/detail?id=702712#c4
Another 1/1000 failure; this time I captured the log:

[ RUN      ] GamepadProviderTest.ConnectDisconnectMultiple
../../device/gamepad/gamepad_provider_unittest.cc:169: Failure
Value of: output.items[0].axes_length
  Actual: 2
Expected: 0u
Which is: 0
[  FAILED  ] GamepadProviderTest.ConnectDisconnectMultiple (53 ms)
This is easier to repro if you add a log message in GamepadProvider::DoPoll after calling GetGamepadData on the fetchers but before calling gamepad_shared_buffer_->WriteBegin().

It appears that this is a testing issue. In the test, MockGamepadDataFetcher::WaitForDataReadAndCallbacksIssued is used to detect when the provider has finished polling the GamepadDataFetchers for new data. However, there is a small window between the fetcher's GetGamepadData being called (which is what WaitForDataRead waits for) and the data actually being written to the shared buffer. It's possible for the test case to read the buffer before it has been written.

It's also possible for this to happen when the provider is first created -- sometimes the first call to ReadGamepadHardwareBuffer will complete before the provider finishes its initial poll. I think this is responsible for our other GamepadProviderTest flakes.
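
For illustration, here is a minimal single-threaded sketch of the window described above. It is not the Chromium code: FakeSharedBuffer, PollOnce, ReadOnFetcherSignal and ReadAfterBufferWrite are hypothetical names, and the real provider polls on a background thread. The sketch only models the ordering "fetcher queried, then buffer written" and shows what a reader sees if it runs in between.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the shared gamepad buffer (not Chromium's type).
struct FakeSharedBuffer {
  unsigned axes_length = 2;  // Stale state from the previous poll.
};

// Hypothetical model of one poll: the fetcher is queried first and the
// buffer is written afterwards. `between_steps` runs in the gap between
// the two, which is where the race window sits in the real provider.
void PollOnce(bool& fetcher_queried, FakeSharedBuffer& buffer,
              unsigned new_axes_length,
              const std::function<void()>& between_steps) {
  fetcher_queried = true;                // What the mock fetcher signals.
  between_steps();                       // Signal fired, buffer not yet written.
  buffer.axes_length = new_axes_length;  // The write the test actually needs.
}

// A reader that wakes up as soon as the fetcher has been queried.
unsigned ReadOnFetcherSignal() {
  bool fetcher_queried = false;
  FakeSharedBuffer buffer;
  unsigned observed = 0;
  PollOnce(fetcher_queried, buffer, /*new_axes_length=*/0u, [&] {
    assert(fetcher_queried);         // The signal has already fired...
    observed = buffer.axes_length;   // ...but the buffer is still stale.
  });
  return observed;
}

// A reader that waits until the poll, including the buffer write, is done.
unsigned ReadAfterBufferWrite() {
  bool fetcher_queried = false;
  FakeSharedBuffer buffer;
  PollOnce(fetcher_queried, buffer, /*new_axes_length=*/0u, [] {});
  return buffer.axes_length;
}
```

In this model ReadOnFetcherSignal() returns the stale 2 while ReadAfterBufferWrite() returns 0, which lines up with the "Actual: 2" / "Expected: 0u" failure in the log. A log statement placed in the gap just makes the window wider, which would explain why it makes the flake easier to reproduce.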

Comment 12 by bugdroid1@chromium.org, Apr 14 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d5427aa537bb0f921a2dda161bb3b555a0e6da6c

commit d5427aa537bb0f921a2dda161bb3b555a0e6da6c
Author: mattreynolds <mattreynolds@chromium.org>
Date: Fri Apr 14 18:30:41 2017

Fix race condition in flaky GamepadProvider tests

GamepadProvider polls GamepadDataFetchers on a background thread, then
copies the gamepad state into a shared memory buffer. Unit tests need to
know when this buffer has been updated to verify that the new state is
correct and check for callbacks. MockGamepadDataFetcher signals when its
state is queried, which is used as an indicator that the buffer may have
new data.

The buffer is not written until after gamepad state is read from the
fetchers, creating a small window between the MockGamepadDataFetcher signal
and the actual buffer write. Rarely, the main thread would read from the
buffer before the provider could write to it, causing test failures.

Instead of relying on MockGamepadDataFetcher, the tests will now wait for
the version number of the shared memory buffer's seqlock to advance before
continuing. This ensures that the buffer has been written to at least once.

BUG= 692219 

Review-Url: https://codereview.chromium.org/2820563003
Cr-Commit-Position: refs/heads/master@{#464760}

[modify] https://crrev.com/d5427aa537bb0f921a2dda161bb3b555a0e6da6c/device/gamepad/gamepad_provider_unittest.cc
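
A rough sketch of the approach the commit message describes, waiting for a seqlock version to advance before reading. This is an assumption-laden illustration, not the code from the CL: VersionedBuffer, WaitForWriteAfter and ReadAfterProviderWrite are hypothetical names, the buffer holds a single field, and all atomics use the default sequentially consistent ordering for simplicity.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical minimal seqlock-protected buffer (not Chromium's class).
class VersionedBuffer {
 public:
  // Writer: the version is odd while a write is in progress and even
  // again once the write is complete.
  void Write(unsigned axes_length) {
    version_.fetch_add(1);             // Begin write (version becomes odd).
    axes_length_.store(axes_length);
    version_.fetch_add(1);             // End write (version becomes even).
  }

  std::uint32_t Version() const { return version_.load(); }

  // Reader: retry until the same even version brackets the read.
  unsigned Read() const {
    for (;;) {
      const std::uint32_t before = version_.load();
      if (before & 1u) {               // A write is in progress.
        std::this_thread::yield();
        continue;
      }
      const unsigned value = axes_length_.load();
      if (version_.load() == before) return value;
    }
  }

 private:
  std::atomic<std::uint32_t> version_{0};
  std::atomic<unsigned> axes_length_{2};  // Stale initial state.
};

// Blocks until at least one complete write newer than `seen_version`
// is visible. `seen_version` must have been captured while no write was
// in progress, so it is expected to be even.
void WaitForWriteAfter(const VersionedBuffer& buffer,
                       std::uint32_t seen_version) {
  assert((seen_version & 1u) == 0);
  for (;;) {
    const std::uint32_t v = buffer.Version();
    if (v != seen_version && (v & 1u) == 0) return;
    std::this_thread::yield();
  }
}

// Test-side usage: record the version, let the "provider" thread write,
// and only read once the version has advanced.
unsigned ReadAfterProviderWrite() {
  VersionedBuffer buffer;
  const std::uint32_t seen = buffer.Version();
  std::thread provider([&buffer] { buffer.Write(0u); });
  WaitForWriteAfter(buffer, seen);
  const unsigned observed = buffer.Read();
  provider.join();
  return observed;
}
```

Because the reader keys off the buffer's own version rather than a signal raised earlier in the poll, it cannot observe the pre-write state: in this sketch ReadAfterProviderWrite() yields 0 regardless of how the two threads are scheduled.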

Status: Fixed (was: Started)
I think this is fixed, and I have re-enabled the test on all platforms. I ran all GamepadProviderTest unit tests 1000x on Linux, OSX, and Android and saw no failures. Please reopen if the test is still flaky!
Cool, thanks for tracking this down!
Components: -IO>Gamepad Blink>GamepadAPI