multicast messages not always shared
Issue description

Reported by qingsi@chromium.org while working on MDNS responder support:

"mDNS packets are not received by all sockets joined the group. I found the packets are always received by the mDNS sockets of the service discovery client, not the ones of the mDNS client created by the mDNS task. I had to comment out the service discovery client to allow the mDNS task to receive a packet. It looks like we have some issue in the existing mDNS socket infra to configure reusable address, or there may be a platform problem."

He later confirmed that his test environment was Linux.

From the description, I determined that the MDNS packets were not being delivered to all the sockets attempting to listen for them. This is the typical behavior for ordinary (unicast) sockets, where a single handler processes a particular endpoint and anybody else listening in would be a security issue. But in multicast, where packets are sent to any machine that wants them, it naturally makes more sense for multiple sockets and processes on the machine to listen for and receive the same messages.

Summary of conclusion after research: We should set SO_REUSEPORT for multicast sockets whenever it is available, and non-Windows multicast sockets should bind to the multicast group address.

Research notes:

Windows documentation on the topic is pretty clear (https://docs.microsoft.com/en-us/windows/desktop/winsock/using-so-reuseaddr-and-so-exclusiveaddruse): sockets can bind to the same endpoint if they set SO_REUSEADDR, and while that doesn't normally let all of them receive every message, they do if they have joined the same multicast group. This matched the current implementation.

Non-Windows platforms had no good single, citable documentation on the topic. Other platforms sometimes (but not always) have an SO_REUSEPORT option that generally allows messages to be delivered to all the sockets (not a multicast-specific feature).
Some documentation, tutorials, and Stack Overflow answers (nothing really definitive that I could find) say this option should be set for multicast sockets; others conflictingly say SO_REUSEADDR implies SO_REUSEPORT for multicast. The definition of what makes a socket "for multicast" and thus eligible for this behavior also varies by reference: sometimes it is that the message arrived on the machine via multicast, sometimes that the socket joined a multicast group (similar to the Windows behavior), and sometimes that the socket is bound to a multicast address. There is a lot of confusion and variation between platforms.

Since SO_REUSEPORT will either do what we want, be unnecessary but harmless, or not exist at all, setting it for multicast sockets whenever it exists seems strictly better than our current implementation and gives us the highest chance on any platform of multicast messages being shared.

A related topic I researched is what address should be used for the socket bind: the wildcard (0.0.0.0), as in the current implementation, or the multicast address (e.g. 224.0.0.251 for MDNS). Lots of documentation, tutorials, and Stack Overflow answers (again, nothing really definitive or singularly citable), especially those referring to non-Windows platforms, indicate that joining a multicast group is more of a system-wide thing: any socket could then receive multicast messages for any group joined by any socket on the system, rather than just messages for the group that specific socket joined. The way to filter for just the specific group is to bind the socket to that group address. So, combined with the possibility that binding to a multicast address makes the platform more likely to share multicast messages (see the SO_REUSEPORT discussion above), we should bind to the specific multicast group address rather than the wildcard.
But Windows does not allow binding a socket to a multicast address (I confirmed experimentally that it always returns WSAEADDRNOTAVAIL), and no references that are clearly in the context of Windows mention this system-wide behavior of multicast group joining, least of all Microsoft's own documentation. Everything I see implies that Windows works as one would expect: multicast messages are only delivered to sockets that have joined the relevant group. So, for Windows, we should keep binding to the 0.0.0.0 wildcard address.
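The conclusions above can be sketched in a few lines of Python. This is only an illustration of the socket options and bind addresses discussed, not the Chromium implementation (which is C++). The function names are hypothetical, and port 5353 is the standard mDNS port rather than something stated in this report.

```python
import socket
import sys

MDNS_GROUP_V4 = "224.0.0.251"  # mDNS IPv4 group cited in the report
MDNS_PORT = 5353               # assumption: standard mDNS port


def multicast_bind_address(group, platform=sys.platform):
    """Wildcard on Windows (binding to a group fails there), group elsewhere."""
    return "0.0.0.0" if platform.startswith("win") else group


def set_sharing_options(sock):
    """Enable address/port reuse; return the names of the options applied."""
    applied = []
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    applied.append("SO_REUSEADDR")
    # SO_REUSEPORT does not exist on every platform, so set it only if present.
    reuse_port = getattr(socket, "SO_REUSEPORT", None)
    if reuse_port is not None:
        try:
            sock.setsockopt(socket.SOL_SOCKET, reuse_port, 1)
            applied.append("SO_REUSEPORT")
        except OSError:
            pass  # constant is defined but the running kernel rejects it
    return applied


def open_multicast_listener(group=MDNS_GROUP_V4, port=MDNS_PORT):
    """Open a UDP socket that shares the port and joins the given group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    set_sharing_options(sock)
    sock.bind((multicast_bind_address(group), port))
    # struct ip_mreq: group address followed by the local interface (any).
    mreq = socket.inet_aton(group) + socket.inet_aton("0.0.0.0")
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock
```

Opening two listeners with `open_multicast_listener()` in separate processes is one way to check whether a given platform delivers each group message to both. Whether it does is exactly the platform-dependent behavior described above, so the sketch makes no promise about the result.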
Nov 2
Reopening. qingsi@ reported some issues on OSX with *sending* multicast messages through a socket bound to the multicast address (whereas things work fine if the socket is bound to the wildcard address). The current theories are that the OS is being too clever and filtering out messages sent to an address the socket is also listening on, or that something is wrong with the multicast loop functionality (which should be enabled by default). We're experimenting with it.

Worst case, I think our MDNS code could use a separate socket for sending, which could be a simple unbound UDPClientSocket. That is quite reasonable in MDNS, where queries and responses are deliberately not tightly coupled. I would prefer not to simply change the bound address back to improve sending, since we changed it to improve receiving.
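The worst-case fallback described here, a separate unbound socket used only for sending, might look roughly like the Python sketch below. It illustrates the idea and is not Chromium's UDPClientSocket. The function names are hypothetical, port 5353 is the standard mDNS port rather than something stated in this report, and setting IP_MULTICAST_LOOP explicitly is an assumption made to rule out the loopback theory rather than a confirmed fix.

```python
import socket

MDNS_GROUP_V4 = "224.0.0.251"  # mDNS IPv4 group cited in the report
MDNS_PORT = 5353               # assumption: standard mDNS port


def open_multicast_sender(loopback=True):
    """Create an unbound UDP socket intended only for sending to a group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Set loopback explicitly instead of relying on the platform default,
    # so local listeners on the same machine can also see what we send.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP,
                    1 if loopback else 0)
    return sock


def send_to_group(sock, payload, group=MDNS_GROUP_V4, port=MDNS_PORT):
    """Send one datagram to the group; the OS picks the local address/port."""
    return sock.sendto(payload, (group, port))
```

Because the sending socket is never bound to the group address, the receive-side choice of bind address stays untouched, which matches the preference stated above.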
Jan 11
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/eb994032b05b9c3093c681e0d13b74739b552001

commit eb994032b05b9c3093c681e0d13b74739b552001
Author: Qingsi Wang <qingsi@google.com>
Date: Fri Jan 11 02:36:14 2019

Add SetMulticastInterface to DatagramClientSocket.

Bug: 899310
Change-Id: Ia3290e2d2506907704ea0d8cac3cd2674eccd812
Reviewed-on: https://chromium-review.googlesource.com/c/1392601
Reviewed-by: Eric Orth <ericorth@chromium.org>
Commit-Queue: Qingsi Wang <qingsi@google.com>
Cr-Commit-Position: refs/heads/master@{#621876}

[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/dns/address_sorter_posix_unittest.cc
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/proxy_resolution/pac_library_unittest.cc
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/datagram_client_socket.h
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/fuzzed_datagram_client_socket.cc
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/fuzzed_datagram_client_socket.h
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/socket_test_util.cc
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/socket_test_util.h
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/udp_client_socket.cc
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/udp_client_socket.h
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/udp_socket_posix.cc
[modify] https://crrev.com/eb994032b05b9c3093c681e0d13b74739b552001/net/socket/udp_socket_win.cc
Comment 1 by bugdroid1@chromium.org
, Oct 26