
Issue 664337


Issue metadata

Status: Fixed
Owner:
Closed: Nov 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 1
Type: Bug




Android test runner crashes looking up failed test

Project Member Reported by katthomas@chromium.org, Nov 10 2016

Issue description

https://luci-milo.appspot.com/buildbot/tryserver.chromium.android/linux_android_rel_ng/177086
https://luci-milo.appspot.com/buildbot/tryserver.chromium.android/linux_android_rel_ng/176943

Both builds have failing (crashed/timed-out) tests. In the rerun-without-patch step, during tear down, we get a KeyError while looking up one of the failed tests, e.g.:

Traceback (most recent call last):
  File "/b/swarm_slave/w/irYXwBOy/build/android/test_runner.py", line 869, in main
    return RunTestsCommand(args)
  File "/b/swarm_slave/w/irYXwBOy/build/android/test_runner.py", line 693, in RunTestsCommand
    return RunTestsInPlatformMode(args)
  File "/b/swarm_slave/w/irYXwBOy/build/android/test_runner.py", line 760, in RunTestsInPlatformMode
    raw_results = test_run.RunTests()
  File "/b/swarm_slave/w/irYXwBOy/build/android/pylib/local/device/local_device_test_run.py", line 55, in RunTests
    tests = self._GetTests()
  File "/b/swarm_slave/w/irYXwBOy/build/android/pylib/local/device/local_device_gtest_run.py", line 345, in _GetTests
    test_lists = self._env.parallel_devices.pMap(list_tests).pGet(None)
  File "/b/swarm_slave/w/irYXwBOy/third_party/catapult/devil/devil/utils/parallelizer.py", line 236, in pMap
    r.pFinish(None)
  File "/b/swarm_slave/w/irYXwBOy/third_party/catapult/devil/devil/utils/parallelizer.py", line 135, in pFinish
    self._objs.JoinAll()
  File "/b/swarm_slave/w/irYXwBOy/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 186, in JoinAll
    self._JoinAll(watcher, timeout)
  File "/b/swarm_slave/w/irYXwBOy/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 158, in _JoinAll
    thread.ReraiseIfException()
  File "/b/swarm_slave/w/irYXwBOy/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 81, in run
    self._ret = self._func(*self._args, **self._kwargs)
  File "/b/swarm_slave/w/irYXwBOy/build/android/pylib/local/device/local_device_environment.py", line 55, in wrapper
    return f(dev, *args, **kwargs)
  File "/b/swarm_slave/w/irYXwBOy/build/android/pylib/local/device/local_device_gtest_run.py", line 341, in list_tests
    tests = self._test_instance.FilterTests(tests)
  File "/b/swarm_slave/w/irYXwBOy/build/android/pylib/gtest/gtest_test_instance.py", line 429, in FilterTests
    filtered_test_list, gtest_filter_string)
  File "/b/swarm_slave/w/irYXwBOy/build/util/lib/common/unittest_util.py", line 144, in FilterTestNames
    if (fnmatch.fnmatch(test, pattern)
  File "/usr/lib/python2.7/fnmatch.py", line 43, in fnmatch
    return fnmatchcase(name, pat)
  File "/usr/lib/python2.7/fnmatch.py", line 79, in fnmatchcase
    return _cache[pat].match(name) is not None
KeyError: 'SitePerProcessBrowserTest.NavigationHandleSiteInstance'

This crashes the test runner, resulting in an infra failure.

 
Owner: mikec...@chromium.org
Status: Assigned (was: Available)
Case, can you look into this?
Ping
Status: Started (was: Assigned)
Hmm, the bot logs no longer exist and I don't see this issue currently showing up. Does anyone know how to repro this? (I'm guessing not :/ )
So the root cause seems to be that Python's fnmatch module is not thread-safe.

We create a thread for each device. Each device then returns a list of gtests, and we filter that test list on each thread (using fnmatch).

Should be a simple fix. Looking into coding it up.
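
For illustration only, here is a minimal sketch of one way to make the per-device filtering safe: serialize every fnmatch call behind a single lock, so no other filtering thread can touch fnmatch's module-level pattern cache (the `_cache` seen in the traceback) while a match is in progress. The names `_FNMATCH_LOCK` and `filter_test_names` are hypothetical, the filter handling is simplified to colon-separated positive patterns, and this is not necessarily what the actual change to gtest_test_instance.py does.

```python
import fnmatch
import threading

# Hypothetical module-level lock shared by all device threads.
# Assumption for this sketch: every fnmatch call in the process goes
# through filter_test_names, so holding the lock is enough to keep
# threads from racing on fnmatch's internal pattern cache.
_FNMATCH_LOCK = threading.Lock()


def filter_test_names(all_tests, gtest_filter):
    """Return the tests matching any colon-separated positive pattern.

    Simplified relative to real gtest filters: negative patterns
    (the part after '-') are not handled here.
    """
    patterns = gtest_filter.split(':')
    matched = []
    with _FNMATCH_LOCK:
        for test in all_tests:
            if any(fnmatch.fnmatchcase(test, p) for p in patterns):
                matched.append(test)
    return matched
```

The lock costs some parallelism, but filtering a test list is cheap compared with talking to a device, so the trade-off seems acceptable for a sketch like this.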
Seems like someone made a change to make fnmatch thread-safe:
https://hg.python.org/cpython/rev/fe12c34c39eb

I don't really know how to determine which version of Python this change made it into, though. It's probably safe to assume fnmatch isn't thread-safe on the bots if they have older versions of Python.
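
If the Python version on the bots can't be relied on, another option (again just a sketch, not the landed fix) is to avoid fnmatch's shared cache entirely: translate each pattern once with `fnmatch.translate`, compile it with `re.compile`, and match test names against the compiled regexes. The helper names `compile_patterns` and `matches_any` are made up for this example.

```python
import fnmatch
import re


def compile_patterns(patterns):
    """Translate shell-style patterns to compiled regexes, once each.

    fnmatch.translate only builds a regex string; it does not read or
    write the module-level cache that fnmatch.fnmatchcase uses.
    """
    return [re.compile(fnmatch.translate(p)) for p in patterns]


def matches_any(name, compiled_patterns):
    """Return True if name matches at least one compiled pattern."""
    return any(r.match(name) is not None for r in compiled_patterns)
```

Each thread can build its own compiled list, or one list can be built up front and shared read-only across the device threads.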

Project Member

Comment 8 by bugdroid1@chromium.org, Nov 18 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c37f55cdd68e4715e9db17ede0a5ba7f4464ab8c

commit c37f55cdd68e4715e9db17ede0a5ba7f4464ab8c
Author: mikecase <mikecase@chromium.org>
Date: Fri Nov 18 20:59:22 2016

Fix KeyError caused when filtering gtests on multiple threads.

BUG= 664337 

Review-Url: https://codereview.chromium.org/2509623003
Cr-Commit-Position: refs/heads/master@{#433295}

[modify] https://crrev.com/c37f55cdd68e4715e9db17ede0a5ba7f4464ab8c/build/android/pylib/gtest/gtest_test_instance.py

Status: Fixed (was: Started)
Fairly certain that this is fixed now. 
Labels: Infra-Failures
Labels: Hotlist-Infra-Failures
