Issue 631018

Issue metadata

Status: Fixed
Owner:
Closed: Jul 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug



test server port reservations not working?

Project Member Reported by pauljensen@chromium.org, Jul 25 2016

Issue description

Version: ToT
OS: Android

I see bots failing with error messages that look like the test server port reservations are not working properly.

In the last week two Cronet bots failed net_unittests with this error:

Traceback (most recent call last):
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/test_runner.py", line 995, in main
    return RunTestsCommand(args)
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/test_runner.py", line 818, in RunTestsCommand
    return RunTestsInPlatformMode(args)
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/test_runner.py", line 863, in RunTestsInPlatformMode
    args, env, test, infra_error) as test_run:
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/pylib/base/test_run.py", line 34, in __enter__
    self.SetUp()
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/pylib/local/device/local_device_gtest_run.py", line 276, in SetUp
    self._env.parallel_devices.pMap(individual_device_set_up)
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/third_party/catapult/devil/devil/utils/parallelizer.py", line 236, in pMap
    r.pFinish(None)
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/third_party/catapult/devil/devil/utils/parallelizer.py", line 135, in pFinish
    self._objs.JoinAll()
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 186, in JoinAll
    self._JoinAll(watcher, timeout)
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 158, in _JoinAll
    thread.ReraiseIfException()
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/third_party/catapult/devil/devil/utils/reraiser_thread.py", line 81, in run
    self._ret = self._func(*self._args, **self._kwargs)
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/pylib/local/device/local_device_test_run.py", line 68, in wrapper
    return f(dev, *args, **kwargs)
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/pylib/local/device/local_device_gtest_run.py", line 274, in individual_device_set_up
    step()
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/pylib/local/device/local_device_gtest_run.py", line 264, in init_tool_and_start_servers
    ports.AllocateTestServerPort(), dev, tool))
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/pylib/local/local_test_server_spawner.py", line 16, in __init__
    port, device, tool)
  File "/b/build/slave/Android_Cronet_ARMv6_Builder/build/src/build/android/pylib/chrome_test_server_spawner.py", line 398, in __init__
    SpawningServerRequestHandler)
  File "/usr/lib/python2.7/SocketServer.py", line 420, in __init__
    self.server_activate()
  File "/usr/lib/python2.7/SocketServer.py", line 439, in server_activate
    self.socket.listen(self.request_queue_size)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 98] Address already in use

https://build.chromium.org/p/chromium.android/builders/Android%20Cronet%20ARMv6%20Builder/builds/2669
https://build.chromium.org/p/chromium.android/builders/Android%20Cronet%20Builder%20%28dbg%29/builds/2636

Assigning to jbudorick@ simply because he made a couple of commits to src/third_party/catapult/devil/devil/android/ports.py and might know who the real owner of this code is :)
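
For reference, the `[Errno 98] Address already in use` at the bottom of the traceback is the error a second listener gets when it tries to claim a port that another socket already holds. A minimal Python 3 sketch of that condition follows; `second_listener_errno` is a hypothetical helper written for illustration and is not part of the test runner:

```python
import errno
import socket


def second_listener_errno():
    """Try to listen on one port from two sockets; return the second's errno.

    Hypothetical helper for illustration only. Returns None if the second
    socket unexpectedly succeeds.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same port as the first listener, as when two device threads
            # are handed the same test server port.
            second.bind(("127.0.0.1", port))
            second.listen(1)
        except OSError as e:
            return e.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(second_listener_errno() == errno.EADDRINUSE)
```

The numeric value of `errno.EADDRINUSE` is platform dependent (98 on Linux, as in the traceback), so the sketch compares against the symbolic constant.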
 
Status: Started (was: Untriaged)
I'm an owner here and am likely the one most familiar with this logic, so I'll take this.

Components: Test>Android

It certainly seems to be racily allocating ports. For example, from the first link in #0, two different devices are handed port 10201 within a millisecond of each other:

I   14.161s individual_device_set_up(06948af3003b97d1)  Allocate port 10201 for test server.
I   14.162s individual_device_set_up(0accc2af43e4affc)  Allocate port 10201 for test server.
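
The log lines above show two device threads receiving the same port. As a rough sketch of the invariant the allocator is meant to uphold, the read-and-increment step has to be atomic across threads. This is not the actual `ports.py` code: `PortAllocator` and `allocate_from_threads` are hypothetical names, and an in-process `threading.Lock` stands in for the lock file mentioned later in this thread.

```python
import threading


class PortAllocator:
    """Hands out consecutive ports; the lock makes read-and-increment atomic.

    Hypothetical class for illustration, not the devil/pylib implementation.
    """

    def __init__(self, first_port):
        self._next_port = first_port
        self._lock = threading.Lock()

    def allocate(self):
        # Without the lock, two threads could both read the same value of
        # _next_port before either increments it, and return the same port.
        with self._lock:
            port = self._next_port
            self._next_port += 1
            return port


def allocate_from_threads(allocator, num_threads, per_thread):
    """Allocate ports from several threads at once and return all of them."""
    results = []
    results_lock = threading.Lock()

    def worker():
        mine = [allocator.allocate() for _ in range(per_thread)]
        with results_lock:
            results.extend(mine)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    ports = allocate_from_threads(PortAllocator(10201), 8, 50)
    print(len(ports), len(set(ports)))
```

With the lock held across the whole read-and-increment, every returned port is distinct no matter how the threads interleave.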
Project Member

Comment 4 by bugdroid1@chromium.org, Jul 26 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fdec47a6f1b4e7db662796344a115a72b6c0805f

commit fdec47a6f1b4e7db662796344a115a72b6c0805f
Author: jbudorick <jbudorick@chromium.org>
Date: Tue Jul 26 14:36:00 2016

[Android] Nuke the test server port lockfile explicitly.

ports.ResetTestServerPortAllocation is called in two locations, once in a
multithreaded context in which it's holding the lock file and once in a
singlethreaded context in which it isn't. It currently nukes the lock file,
which is fine in a singlethreaded context but can cause duplicate port
allocation in a multithreaded context. This CL will allow that function
to stop nuking the lock file in a subsequent CL to fix that issue.

BUG= 631018 

Review-Url: https://codereview.chromium.org/2176183004
Cr-Commit-Position: refs/heads/master@{#407791}

[modify] https://crrev.com/fdec47a6f1b4e7db662796344a115a72b6c0805f/build/android/test_runner.py
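
One plausible way that deleting a held lock file leads to duplicate allocation can be sketched as follows. This assumes an `flock`-style advisory lock on a file path, which the commit message does not spell out; the file name `ports.lock` and the function `second_locker_excluded` are hypothetical. Once the path is unlinked, the next opener creates a fresh file, and a lock on that new file does not conflict with the lock still held on the old one. Unix only (uses `fcntl`), Python 3.

```python
import fcntl
import os
import tempfile


def second_locker_excluded(delete_while_held):
    """Return True if a second locker is blocked while the first holds the lock.

    Hypothetical helper for illustration; assumes flock-style file locking.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ports.lock")  # hypothetical lock file name
        holder = open(path, "w")
        contender = None
        try:
            fcntl.flock(holder, fcntl.LOCK_EX)  # first caller takes the lock
            if delete_while_held:
                os.unlink(path)  # remove the lock file while it is still held
            # If the file was deleted, this open() creates a brand-new file
            # at the same path.
            contender = open(path, "w")
            try:
                fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                return True  # lock is busy: mutual exclusion still works
            return False  # second lock acquired: two holders at once
        finally:
            if contender is not None:
                contender.close()
            holder.close()


if __name__ == "__main__":
    print(second_locker_excluded(delete_while_held=False))
    print(second_locker_excluded(delete_while_held=True))
```

Under that assumption, the second locker is kept out only while the lock file stays in place, which is consistent with the commit's plan to stop deleting the lock file from the function that runs in the multithreaded context.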

Project Member

Comment 6 by bugdroid1@chromium.org, Jul 26 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c1f80b7b9e2d44356f9809894ac401e74e6f8612

commit c1f80b7b9e2d44356f9809894ac401e74e6f8612
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Tue Jul 26 17:33:32 2016

Roll src/third_party/catapult/ d24a96773..5eeaae06d (1 commit).

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/d24a96773d20..5eeaae06dc16

$ git log d24a96773..5eeaae06d --date=short --no-merges --format='%ad %ae %s'

BUG= 631018 

TBR=catapult-sheriff@chromium.org

Review-Url: https://codereview.chromium.org/2178043003
Cr-Commit-Position: refs/heads/master@{#407843}

[modify] https://crrev.com/c1f80b7b9e2d44356f9809894ac401e74e6f8612/DEPS

Status: Fixed (was: Started)
Should be fixed now. Please reopen if you see this again.
