
Issue 862602

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Dec 20
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: iOS , Chrome , Mac , Fuchsia
Pri: 1
Type: Bug



UDPSocketTest.* tests sometimes flake out because port 10000 is in-use

Project Member Reported by w...@chromium.org, Jul 11

Issue description

On the Fuchsia x64 FYI bot in https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/fuchsia-fyi-x64-rel/266 we got:

[ RUN      ] UDPSocketTest.VerifyConnectBindsAddr
../../net/socket/udp_socket_unittest.cc:438: Failure
Value of: rv
Expected: net::OK
  Actual: -147, net::ERR_ADDRESS_IN_USE
Stack trace:
#00: testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop(int) at gtest.cc:?
#01: testing::internal::AssertHelper::operator=(testing::Message const&) const at gtest.cc:?
#02: net::(anonymous namespace)::UDPSocketTest_VerifyConnectBindsAddr_Test::TestBody() at udp_socket_unittest.cc:?

[  FAILED  ] UDPSocketTest.VerifyConnectBindsAddr (15 ms)
[5073/27679] UDPSocketTest.VerifyConnectBindsAddr (15 ms)
...
[5083/27679] UDPSocketTest.TestBindToNetwork (9 ms)
[ RUN      ] UDPSocketTest.ReadWithSocketOptimization
../../net/socket/udp_socket_unittest.cc:887: Failure
Value of: rv
Expected: net::OK
  Actual: -147, net::ERR_ADDRESS_IN_USE
Stack trace:
#00: testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop(int) at gtest.cc:?
#01: testing::internal::AssertHelper::operator=(testing::Message const&) const at gtest.cc:?
#02: net::UDPSocketTest_ReadWithSocketOptimization_Test::TestBody() at udp_socket_unittest.cc:?

[  FAILED  ] UDPSocketTest.ReadWithSocketOptimization (10 ms)
[5084/27679] UDPSocketTest.ReadWithSocketOptimization (10 ms)
[ RUN      ] UDPSocketTest.ReadWithSocketOptimizationTruncation
../../net/socket/udp_socket_unittest.cc:929: Failure
Value of: rv
Expected: net::OK
  Actual: -147, net::ERR_ADDRESS_IN_USE
Stack trace:
#00: testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop(int) at gtest.cc:?
#01: testing::internal::AssertHelper::operator=(testing::Message const&) const at gtest.cc:?
#02: net::UDPSocketTest_ReadWithSocketOptimizationTruncation_Test::TestBody() at udp_socket_unittest.cc:?

[00139.083] pkgsvr: 2018/07/11 14:34:27 pkgfs:unsupported(/packages/net_unittests/0): dir unlink "app.log"
[  FAILED  ] UDPSocketTest.ReadWithSocketOptimizationTruncation (12 ms)

These are all port-in-use errors for port 10000. The failures immediately follow the ConnectRandomBind test, which suggests that it is managing to leak that port number; that ordering is consistent across the failure logs.
 
Cc: thakis@chromium.org
thakis FYI :)
Cc: jam...@chromium.org
Something about the ConnectRandomBind test, or more likely about the RANDOM_BIND implementation, seems to cause Netstack to run out of resources. If we run the test with --gtest_repeat=1000, it eventually crashes out with an ENOMEM from the network stack.
This may be the same issue as NET-1077.
Owner: w...@chromium.org
Status: Started (was: Untriaged)
This appears to be specific to:

1. bind()ing to INADDR_ANY on a specific port.
2. connect()ing to a target address.
3. close()ing the socket.
4. Attempting to bind() to localhost on the same port.

After this sequence, step #4 fails. Skipping step #1 or step #2, or binding to localhost rather than INADDR_ANY in step #1, works fine.

Filed NET-1138 for that.
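
For anyone wanting to try the four steps outside of net_unittests, here is a minimal sketch using plain POSIX sockets rather than the //net UDPSocket classes. The function name, the target port, and the use of getsockname() to support port 0 are assumptions made for illustration; none of them come from the test itself. On a stack without the bug the final bind() is expected to succeed.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstdint>

// Builds an IPv4 socket address from a host-byte-order address and port.
static sockaddr_in MakeAddr(uint32_t host_order_addr, uint16_t port) {
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(host_order_addr);
  addr.sin_port = htons(port);
  return addr;
}

// Hypothetical helper: runs steps 1-4. Returns 0 if the final bind()
// succeeds, otherwise the errno of the first call that failed. Passing
// port 0 lets the kernel pick the port in step 1; step 4 then reuses
// whatever port was actually assigned.
int RunBindConnectCloseRebind(uint16_t port, uint16_t target_port) {
  assert(target_port != 0);  // connect() needs a concrete destination port.
  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) return errno;
  // Saves errno, closes the first socket, and reports the saved error.
  auto fail = [fd] { int err = errno; close(fd); return err; };

  // Step 1: bind() to INADDR_ANY on a specific port.
  sockaddr_in any = MakeAddr(INADDR_ANY, port);
  if (bind(fd, reinterpret_cast<sockaddr*>(&any), sizeof(any)) != 0)
    return fail();
  sockaddr_in bound{};
  socklen_t len = sizeof(bound);
  if (getsockname(fd, reinterpret_cast<sockaddr*>(&bound), &len) != 0)
    return fail();

  // Step 2: connect() to a target address (no packet is sent for UDP).
  sockaddr_in target = MakeAddr(INADDR_LOOPBACK, target_port);
  if (connect(fd, reinterpret_cast<sockaddr*>(&target), sizeof(target)) != 0)
    return fail();

  // Step 3: close() the socket.
  close(fd);

  // Step 4: bind() a fresh socket to localhost on the same port.
  int fd2 = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd2 < 0) return errno;
  sockaddr_in local = MakeAddr(INADDR_LOOPBACK, ntohs(bound.sin_port));
  int rv = 0;
  if (bind(fd2, reinterpret_cast<sockaddr*>(&local), sizeof(local)) != 0)
    rv = errno;
  close(fd2);
  return rv;
}
```

Calling RunBindConnectCloseRebind(0, 9) avoids depending on any particular port being free; a non-zero result of EADDRINUSE would correspond to the failure described above.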
Labels: -Pri-2 Pri-1
Labels: OS-Chrome
These tests also (much more rarely) flake on ChromeOS bots (see https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux-chromeos-rel/45507 for a recent example), with address-in-use errors, presumably due to multiple test batch sub-processes attempting to bind the same hard-wired ports on INADDR_ANY.

Under Fuchsia it is not possible to bind twice to the same port number even from the same process (without closing the original socket), though that does appear to work under Linux, via SO_REUSEADDR.

According to the Linux socket(7) man page (and a comment in the //net SetReuseAddr() impl), there is a separate SO_REUSEPORT option which can be used to bind to exactly the same address+port combination more than once.
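
A rough way to compare the two options on a given platform is to bind two UDP sockets to the same address and port with the option set on both. The sketch below uses plain POSIX sockets, and the helper names are made up for illustration. With no option set, the second bind() should fail with EADDRINUSE; with SO_REUSEPORT it is expected to succeed on Linux. The SO_REUSEADDR result varies by platform, which is the point of the comparison.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstdint>

// Opens a UDP socket, sets `option` to 1 unless it is 0, and binds to
// 127.0.0.1:port. Returns the fd, or -1 with errno describing the failure.
static int OpenAndBind(int option, uint16_t port) {
  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) return -1;
  int on = 1;
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(port);
  if ((option != 0 &&
       setsockopt(fd, SOL_SOCKET, option, &on, sizeof(on)) != 0) ||
      bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
    int err = errno;  // close() may overwrite errno, so preserve it.
    close(fd);
    errno = err;
    return -1;
  }
  return fd;
}

// Hypothetical helper: binds two UDP sockets to the same loopback port with
// `option` set on both. Returns 0 if the second bind() succeeds, otherwise
// the errno of the first call that failed.
int BindSamePortTwice(int option) {
  assert(option == 0 || option == SO_REUSEADDR || option == SO_REUSEPORT);
  int first = OpenAndBind(option, 0);  // Port 0: the kernel picks a free one.
  if (first < 0) return errno;

  sockaddr_in bound{};
  socklen_t len = sizeof(bound);
  if (getsockname(first, reinterpret_cast<sockaddr*>(&bound), &len) != 0) {
    int err = errno;
    close(first);
    return err;
  }

  int second = OpenAndBind(option, ntohs(bound.sin_port));
  int rv = second < 0 ? errno : 0;
  if (second >= 0) close(second);
  close(first);
  return rv;
}
```

Comparing BindSamePortTwice(0), BindSamePortTwice(SO_REUSEADDR), and BindSamePortTwice(SO_REUSEPORT) on each OS should show which platforms allow a double bind and under which option.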
Labels: OS-iOS OS-Mac
Having added a test to bind to the same port twice (see https://chromium-review.googlesource.com/c/chromium/src/+/1140360), Mac and iOS failed it, suggesting that these tests will be flaky on those platforms as well, if two test batches happen to run hard-wired-port bind tests at the same time.
Labels: -M-69 M-70
Status: Assigned (was: Started)
Labels: -Pri-1 -M-70 M-72 Pri-2
UDPSocketTest.MulticastOptions just flaked, in https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/fuchsia-fyi-x64-rel/6390 with ERR_ADDRESS_IN_USE - it, too, expects to be able to bind to a hard-wired port # (9999), but it doesn't request that port re-use be allowed. Several other tests use the same hard-wired port number.
Labels: -Pri-2 -M-72 M-73 Pri-1
Owner: jkarlin@chromium.org
jkarlin: Picking on you because //net/OWNERS - please redirect if there is a more suitable owner.

We have a number of net_unittests that expect to be able to bind() to a hard-wired port, even though in general that (a) may fail because of other tests running concurrently or (b) may succeed even though another test is also bound to that port, but then lead to erroneous results.

I wonder if we could add a command-line option set by TestLauncher to provide a "shard Id" to each test sub-process, which these tests could use to ensure isolation of the port namespace, for example?
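
To make the shard-Id idea concrete, here is a purely hypothetical sketch of how a test might map a shard Id to a private port range. No such TestLauncher option exists as far as this thread shows; the constants and the function name are invented for illustration. It would only isolate test sub-processes from each other, not from unrelated processes on the machine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical values: first port of the reserved range, and how many
// ports each shard may use.
constexpr uint32_t kBasePort = 20000;
constexpr uint32_t kPortsPerShard = 16;

// Returns the `index`-th port reserved for `shard_id`, or -1 if the index is
// outside the shard's range or the port would not fit in 16 bits.
int PortForShard(uint32_t shard_id, uint32_t index) {
  if (index >= kPortsPerShard) return -1;
  const uint64_t port =
      kBasePort + uint64_t{shard_id} * kPortsPerShard + index;
  if (port > 65535) return -1;
  assert(port >= kBasePort);  // All terms are unsigned, so no underflow.
  return static_cast<int>(port);
}
```

With these constants, shard 2 would own ports 20032 through 20047, so a test in that shard asking for index 3 would get 20035.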
Owner: mmenke@chromium.org
Over to mmenke to reassign...
Cc: rch@chromium.org zhongyi@chromium.org
[+rch, +zhongyi]: Could one of you take this? I assume you two are a bit more familiar with the UDP API than the average team member, and it looks like at least two of these tests were added with QUIC in mind.
Status: Started (was: Assigned)
Think I'll just do this myself.
Project Member

Comment 16 by bugdroid1@chromium.org, Dec 20

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/bdcddfdb217382e0743f06f70ecc966b373cdf58

commit bdcddfdb217382e0743f06f70ecc966b373cdf58
Author: Matt Menke <mmenke@chromium.org>
Date: Thu Dec 20 18:51:29 2018

Remove use of hard-coded UDP ports in UDPSocketTests.

Using hard coded ports in tests in general isn't great, as there's no
way to be sure something else isn't already using them. It's even worse
when all tests are using the same two ports, and are run in parallel.

Bug: 862602
Change-Id: If1643aad0207ff492a7874e408973600d4be19e9
Reviewed-on: https://chromium-review.googlesource.com/c/1385017
Reviewed-by: Eric Roman <eroman@chromium.org>
Commit-Queue: Matt Menke <mmenke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#618280}
[modify] https://crrev.com/bdcddfdb217382e0743f06f70ecc966b373cdf58/net/socket/udp_socket_unittest.cc
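
The commit message does not say what replaced the hard-coded ports. One common way to avoid them, shown here as a sketch with plain POSIX sockets and a made-up helper name rather than the actual //net test code, is to bind to port 0 so the kernel picks a free port, then read the chosen port back with getsockname().

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>

// Hypothetical helper: binds a UDP socket to 127.0.0.1 with port 0 so the
// kernel picks a free port. Returns the socket fd (the caller closes it) and
// stores the chosen port in *port, or returns -1 on failure.
int BindUdpToFreePort(uint16_t* port) {
  assert(port != nullptr);
  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) return -1;

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(0);  // 0 asks the kernel for any free port.
  socklen_t len = sizeof(addr);
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
      getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0) {
    close(fd);
    return -1;
  }
  *port = ntohs(addr.sin_port);  // The port the kernel actually assigned.
  return fd;
}
```

A test would then hand the returned port to its peer socket instead of a constant such as 9999 or 10000, so two tests running in parallel each get their own port.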

Status: Fixed (was: Started)
