New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 849872 link

Starred by 13 users

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocking:
issue 821607



Sign in to add a comment

Kernel 3.18: hangs in netlink recv()

Project Member Reported by briannorris@chromium.org, Jun 5 2018

Issue description

Forked from the centi-bug:

https://bugs.chromium.org/p/chromium/issues/detail?id=821607

It's probably easier to have real discussion away from that bug.

The current best guess at a smoking gun is that recv() is hanging:

https://bugs.chromium.org/p/chromium/issues/detail?id=821607&cnum=100&cstart=46#c23

which should not happen.

This suggests we might have backported some patches that introduced broken netlink behaviors:

https://bugs.chromium.org/p/chromium/issues/detail?id=821607#c124
https://groups.google.com/forum/#!topic/fa.linux.kernel/zoWnuxWdJFk
https://bugs.chromium.org/p/chromium/issues/detail?id=821607#c125

We're not likely to back out those changes entirely, since they're useful for some other upgrades (rhashtable?). So instead, we should see what else is needed in order to avoid these recv() hangs.

I'll take a look at the suggested backports. Any bonus tips on reproducing this would be helpful.

Side ramblings: if we really think this was problematic:

commit 223b5942f70a8dea2149e1d77f21922cda676249
Author: Ying Xue <ying.xue@windriver.com>
Date:   Mon Jan 12 14:52:23 2015 +0800

    UPSTREAM: netlink: eliminate nl_sk_hash_lock
    
    As rhashtable_lookup_compare_insert() can guarantee the process
    of search and insertion is atomic, it's safe to eliminate the
    nl_sk_hash_lock. After this, object insertion or removal will
    be protected with per bucket lock on write side while object
    lookup is guarded with rcu read lock on read side.
    
    Signed-off-by: Ying Xue <ying.xue@windriver.com>
    Cc: Thomas Graf <tgraf@suug.ch>
    Acked-by: Thomas Graf <tgraf@suug.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    (cherry picked from commit c5adde9468b0714a051eac7f9666f23eb10b61f7)
    
    BUG= chromium:556861 
    TEST=builds all four kernel flavors based on chromeos-3.18 branch
    
    Change-Id: I76e16b49ae02b68217edf3d543d1a593122f946d
    Signed-off-by: Grant Grundler <grundler@chromium.org>
    Reviewed-on: https://chromium-review.googlesource.com/314154

then shouldn't we have seen problems *long* ago? That landed in our tree in December 2015.
 
Brian, correct: rhashtable was updated to support wireless-4.2 stack for . Backing that out wireless-4.2 isn't likely.

I also agree we should have seen this long ago. What should I be looking for on the "kernel side" stack trace that corresponds to __libc_recv?

Regarding "eliminate nl_sk_hash_lock" patch: I expect this would have been reported and fixed upstream by now...and indeed there is at least one patch related to this:
commit c0bb07df7d981e4091432754e30c9c720e2c0c78
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Sat May 16 21:50:28 2015 +0800

    netlink: Reset portid after netlink_insert failure
    
    The commit c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink:
    eliminate nl_sk_hash_lock") breaks the autobind retry mechanism
    because it doesn't reset portid after a failed netlink_insert.
    
    This means that should autobind fail the first time around, then
    the socket will be stuck in limbo as it can never be bound again
    since it already has a non-zero portid.
    
    Fixes: c5adde9468b0 ("netlink: eliminate nl_sk_hash_lock")
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Though I don't know offhand if that was backported already or we should include it now.
I don't know why this starts to happen recently. But might be related to chrome starting to using more threads last year (switching to task scheduler threads instead browser threads), as the problem is easier to happen with multi-threaded app.

The kernel discussion has code to repro the problem.
https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Similar to Kevin's test code in https://bugs.chromium.org/p/chromium/issues/detail?id=821607#c85, but using lots of threads.

> What should I be looking for on the "kernel side" stack trace that corresponds to __libc_recv?

I'm definitely not an expert on this yet, but my understanding so far comes from comments like this one:

https://bugs.chromium.org/p/chromium/issues/detail?id=821607#c76

<quote>
Essentially, the ReadMessages() function will explicitly block on the first iteration of the recv() loop. So if there is nothing in the socket, we will wait forever. In particular, netlink packets can be dropped by the kernel under memory pressure conditions, so this code definitely has the ability to block.
</quote>

So the kernel won't actually think anything is abnormal; the caller asked for a blocking recv(), and if there's nothing to deliver (because we already dropped it)...it just patiently waits forever.

I guess that means the kernel would be looping in:

netlink_recvmsg() -> skb_recv_datagram() -> __skb_recv_datagram()

with a timeout of MAX_SCHEDULE_TIMEOUT (unless someone set a smaller SO_RCVTIMEO?).

BTW, I wonder if it would make sense to set a smaller SO_RCVTIMEO, if we're going to do blocking recv()? That might cause us to bail with EAGAIN more quickly, and give a chance for things to recover, even if the kernel is dropping stuff.

@xiyuan: Thanks for the pointers! I'll take that for a spin.
> https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Awesome! This dies rather quickly [1] on chell (kernel 3.18), and it succeeds (although I have to add some error handling, as we fail on creating so many pthreads? I typically only manage to create ~500) on scarlet (kernel 4.4). So this does seem like a good litmus test.

[1] With the expected timeout; e.g.:

[3543] glibc: check_pf: netlink socket read timeout
Initial testing shows that simply backporting this is enough to fix $subject:

c0bb07df7d98 netlink: Reset portid after netlink_insert failure

I've ported the follow-up fixes (which replace that solution with an alternative) too, since the above is claimed to introduce additional regressions. This also tests out OK using the above litmus test.

It's not yet clear to me whether I *really* need to port this from 4.0-stable, since it sounds like the racy rhashtable behavior was later fixed upstream in rhashtable (hence the stable-only patch), and we've already backported quite a lot of rhashtable work:

cf8befcc1a55 netlink: Disable insertions/removals during rehash

Anyway, I'll upload it all soon, in case anyone wants to review or give it a spin. I've only done minimal verification so far.
https://chromium-review.googlesource.com/#/c/chromiumos/third_party/kernel/+/1091452 UPSTREAM: netlink: Reset portid after netlink_insert failure
https://chromium-review.googlesource.com/#/c/chromiumos/third_party/kernel/+/1091453 UPSTREAM: netlink: Use default rhashtable hashfn
https://chromium-review.googlesource.com/#/c/chromiumos/third_party/kernel/+/1091454 UPSTREAM: netlink: Fix autobind race condition that leads to zero port ID
https://chromium-review.googlesource.com/#/c/chromiumos/third_party/kernel/+/1091455 BACKPORT: FROMGIT: netlink: Disable insertions/removals during rehash
Status: Started (was: Assigned)
And one more...

https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1091506 UPSTREAM: netlink: Replace rhash_portid with bound
Project Member

Comment 9 by bugdroid1@chromium.org, Jun 13 2018

Labels: merge-merged-chromeos-3.18
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/0cdaad0cc65125c1a0726ed347b627ff662ee77f

commit 0cdaad0cc65125c1a0726ed347b627ff662ee77f
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed Jun 13 04:50:32 2018

UPSTREAM: netlink: Reset portid after netlink_insert failure

The commit c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink:
eliminate nl_sk_hash_lock") breaks the autobind retry mechanism
because it doesn't reset portid after a failed netlink_insert.

This means that should autobind fail the first time around, then
the socket will be stuck in limbo as it can never be bound again
since it already has a non-zero portid.

Fixes: c5adde9468b0 ("netlink: eliminate nl_sk_hash_lock")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c0bb07df7d981e4091432754e30c9c720e2c0c78)

BUG= chromium:821607 ,  chromium:849872 
TEST=netlink send/recv repeatedly, on many threads - watch for timeouts;
     similar to this test code:
     https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Change-Id: I5cfee0c833c70f1ad3b82e3f6d4cf5bee189256e
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091452
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/0cdaad0cc65125c1a0726ed347b627ff662ee77f/net/netlink/af_netlink.c

Project Member

Comment 10 by bugdroid1@chromium.org, Jun 13 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/7c67f79118352aefa4f470136e799c3945ea944e

commit 7c67f79118352aefa4f470136e799c3945ea944e
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed Jun 13 04:50:33 2018

UPSTREAM: netlink: Use default rhashtable hashfn

This patch removes the explicit jhash value for the hashfn parameter
of rhashtable.  As the key length is a multiple of 4, this means that
we will actually end up using jhash2.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 11b58ba146ccd7b105c4962c75f2e744053c85bc)

BUG= chromium:821607 ,  chromium:849872 
TEST=build and boot

Change-Id: Ifd74f8ccc3be372ede6105fee47d832e95bf73b7
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091453
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/7c67f79118352aefa4f470136e799c3945ea944e/net/netlink/af_netlink.c

Project Member

Comment 11 by bugdroid1@chromium.org, Jun 13 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/1f735d252607b26f34fd88aae42e2cd6471b7861

commit 1f735d252607b26f34fd88aae42e2cd6471b7861
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed Jun 13 04:50:35 2018

UPSTREAM: netlink: Fix autobind race condition that leads to zero port ID

The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink:
Reset portid after netlink_insert failure") introduced a race
condition where if two threads try to autobind the same socket
one of them may end up with a zero port ID.  This led to kernel
deadlocks that were observed by multiple people.

This patch reverts that commit and instead fixes it by introducing
a separte rhash_portid variable so that the real portid is only set
after the socket has been successfully hashed.

Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure")
Reported-by: Tejun Heo <tj@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1f770c0a09da855a2b51af6d19de97fb955eca85)

BUG= chromium:821607 ,  chromium:849872 
TEST=netlink send/recv repeatedly, on many threads - watch for timeouts;
     similar to this test code:
     https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Change-Id: I065a53d0d8a897ce648e4a6e99b6fc28e3f46625
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091454
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/1f735d252607b26f34fd88aae42e2cd6471b7861/net/netlink/af_netlink.c
[modify] https://crrev.com/1f735d252607b26f34fd88aae42e2cd6471b7861/net/netlink/af_netlink.h

Project Member

Comment 12 by bugdroid1@chromium.org, Jun 13 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/663fa53dcb6f4844a8c5ba29dc6bb65ebebfadf9

commit 663fa53dcb6f4844a8c5ba29dc6bb65ebebfadf9
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed Jun 13 04:50:36 2018

BACKPORT: FROMGIT: netlink: Disable insertions/removals during rehash

[ Upstream commit: Not applicable ]

The current rhashtable rehash code is buggy and can't deal with
parallel insertions/removals without corrupting the hash table.

This patch disables it by partially reverting
c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink: eliminate
nl_sk_hash_lock").

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit cf8befcc1a5538b035d478424efcc2d50e66928e
 git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.0.y)

Conflicts:
   net/netlink/af_netlink.c
[rhashtable_remove_fast vs. rhashtable_remove]

BUG= chromium:821607 ,  chromium:849872 
TEST=netlink send/recv repeatedly, on many threads - watch for timeouts;
     similar to this test code:
     https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Change-Id: I6063076587c0a9ede57e319989a426ee6f6ebe61
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091455
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/663fa53dcb6f4844a8c5ba29dc6bb65ebebfadf9/net/netlink/af_netlink.c

Project Member

Comment 13 by bugdroid1@chromium.org, Jun 13 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/32a47cf484dc30229a88cb0746076be99799bc3e

commit 32a47cf484dc30229a88cb0746076be99799bc3e
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed Jun 13 04:50:38 2018

UPSTREAM: netlink: Replace rhash_portid with bound

On Mon, Sep 21, 2015 at 02:20:22PM -0400, Tejun Heo wrote:
>
> store_release and load_acquire are different from the usual memory
> barriers and can't be paired this way.  You have to pair store_release
> and load_acquire.  Besides, it isn't a particularly good idea to

OK I've decided to drop the acquire/release helpers as they don't
help us at all and simply pessimises the code by using full memory
barriers (on some architectures) where only a write or read barrier
is needed.

> depend on memory barriers embedded in other data structures like the
> above.  Here, especially, rhashtable_insert() would have write barrier
> *before* the entry is hashed not necessarily *after*, which means that
> in the above case, a socket which appears to have set bound to a
> reader might not visible when the reader tries to look up the socket
> on the hashtable.

But you are right we do need an explicit write barrier here to
ensure that the hashing is visible.

> There's no reason to be overly smart here.  This isn't a crazy hot
> path, write barriers tend to be very cheap, store_release more so.
> Please just do smp_store_release() and note what it's paired with.

It's not about being overly smart.  It's about actually understanding
what's going on with the code.  I've seen too many instances of
people simply sprinkling synchronisation primitives around without
any knowledge of what is happening underneath, which is just a recipe
for creating hard-to-debug races.

> > @@ -1539,7 +1546,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr,
> >  		}
> >  	}
> >
> > -	if (!nlk->portid) {
> > +	if (!nlk->bound) {
>
> I don't think you can skip load_acquire here just because this is the
> second deref of the variable.  That doesn't change anything.  Race
> condition could still happen between the first and second tests and
> skipping the second would lead to the same kind of bug.

The reason this one is OK is because we do not use nlk->portid or
try to get nlk from the hash table before we return to user-space.

However, there is a real bug here that none of these acquire/release
helpers discovered.  The two bound tests here used to be a single
one.  Now that they are separate it is entirely possible for another
thread to come in the middle and bind the socket.  So we need to
repeat the portid check in order to maintain consistency.

> > @@ -1587,7 +1594,7 @@ static int netlink_connect(struct socket *sock, struct sockaddr *addr,
> >  	    !netlink_allowed(sock, NL_CFG_F_NONROOT_SEND))
> >  		return -EPERM;
> >
> > -	if (!nlk->portid)
> > +	if (!nlk->bound)
>
> Don't we need load_acquire here too?  Is this path holding a lock
> which makes that unnecessary?

Ditto.

---8<---
The commit 1f770c0a09da855a2b51af6d19de97fb955eca85 ("netlink:
Fix autobind race condition that leads to zero port ID") created
some new races that can occur due to inconcsistencies between the
two port IDs.

Tejun is right that a barrier is unavoidable.  Therefore I am
reverting to the original patch that used a boolean to indicate
that a user netlink socket has been bound.

Barriers have been added where necessary to ensure that a valid
portid and the hashed socket is visible.

I have also changed netlink_insert to only return EBUSY if the
socket is bound to a portid different to the requested one.  This
combined with only reading nlk->bound once in netlink_bind fixes
a race where two threads that bind the socket at the same time
with different port IDs may both succeed.

Fixes: 1f770c0a09da ("netlink: Fix autobind race condition that leads to zero port ID")
Reported-by: Tejun Heo <tj@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Nacked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit da314c9923fed553a007785a901fd395b7eb6c19)

BUG= chromium:821607 ,  chromium:849872 
TEST=netlink send/recv repeatedly, on many threads - watch for timeouts;
     similar to this test code:
     https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Change-Id: I4baab91ca840fcb07a0844ac9f48dcc71fddd509
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091506
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/32a47cf484dc30229a88cb0746076be99799bc3e/net/netlink/af_netlink.c
[modify] https://crrev.com/32a47cf484dc30229a88cb0746076be99799bc3e/net/netlink/af_netlink.h

Status: Fixed (was: Started)
I believe this is fixed. We'll watch the original problem (and potential merges) on  bug 821607 .
Issue 806125 has been merged into this issue.

Comment 16 by jayhlee@google.com, Jun 18 2018

Status: Assigned (was: Fixed)
we have customers in the field dealing with these issues. Any reason this can't be merged to 68 and even 67 to get the issue resolved for them sooner?
Cc: bhthompson@chromium.org
Labels: Merge-Request-68
I seriously doubt we have enough field testing on M-69 (canary, at the moment) to have confidence in merging this to a near-stable (M-67). I suppose we can try to merge to M-68.

Bernie, WDYT? And do we have a good way of watching for new reports on devices with kernel 3.18?
Project Member

Comment 18 by sheriffbot@chromium.org, Jun 18 2018

Labels: -Merge-Request-68 Hotlist-Merge-Review Merge-Review-68
This bug requires manual review: M68 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: cmasso@(Android), kariahda@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Hotlist-Merge-Review -Merge-Review-68 Merge-Approved-68
Merging to 68 seems reasonable, we just need to watch crash reports. 
Project Member

Comment 21 by bugdroid1@chromium.org, Jun 18 2018

Labels: merge-merged-release-R68-10718.B-chromeos-3.18
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/5079028f7918373835f76e13e2825e35230524fc

commit 5079028f7918373835f76e13e2825e35230524fc
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon Jun 18 18:43:26 2018

UPSTREAM: netlink: Reset portid after netlink_insert failure

The commit c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink:
eliminate nl_sk_hash_lock") breaks the autobind retry mechanism
because it doesn't reset portid after a failed netlink_insert.

This means that should autobind fail the first time around, then
the socket will be stuck in limbo as it can never be bound again
since it already has a non-zero portid.

Fixes: c5adde9468b0 ("netlink: eliminate nl_sk_hash_lock")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c0bb07df7d981e4091432754e30c9c720e2c0c78)

BUG= chromium:821607 ,  chromium:849872 
TEST=netlink send/recv repeatedly, on many threads - watch for timeouts;
     similar to this test code:
     https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Change-Id: I5cfee0c833c70f1ad3b82e3f6d4cf5bee189256e
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091452
Reviewed-by: Grant Grundler <grundler@chromium.org>
(cherry picked from commit 0cdaad0cc65125c1a0726ed347b627ff662ee77f)
Reviewed-on: https://chromium-review.googlesource.com/1104942

[modify] https://crrev.com/5079028f7918373835f76e13e2825e35230524fc/net/netlink/af_netlink.c

Project Member

Comment 22 by bugdroid1@chromium.org, Jun 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e00564d51de68cca9f008bfc6963d79e9bb4e852

commit e00564d51de68cca9f008bfc6963d79e9bb4e852
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon Jun 18 18:43:33 2018

UPSTREAM: netlink: Use default rhashtable hashfn

This patch removes the explicit jhash value for the hashfn parameter
of rhashtable.  As the key length is a multiple of 4, this means that
we will actually end up using jhash2.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 11b58ba146ccd7b105c4962c75f2e744053c85bc)

BUG= chromium:821607 ,  chromium:849872 
TEST=build and boot

Change-Id: Ifd74f8ccc3be372ede6105fee47d832e95bf73b7
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091453
Reviewed-by: Grant Grundler <grundler@chromium.org>
(cherry picked from commit 7c67f79118352aefa4f470136e799c3945ea944e)
Reviewed-on: https://chromium-review.googlesource.com/1104943

[modify] https://crrev.com/e00564d51de68cca9f008bfc6963d79e9bb4e852/net/netlink/af_netlink.c

Project Member

Comment 23 by bugdroid1@chromium.org, Jun 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/3520cd6a3ad97bb052dfa4a0928baab974117a9d

commit 3520cd6a3ad97bb052dfa4a0928baab974117a9d
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon Jun 18 18:43:51 2018

UPSTREAM: netlink: Fix autobind race condition that leads to zero port ID

The commit c0bb07df7d981e4091432754e30c9c720e2c0c78 ("netlink:
Reset portid after netlink_insert failure") introduced a race
condition where if two threads try to autobind the same socket
one of them may end up with a zero port ID.  This led to kernel
deadlocks that were observed by multiple people.

This patch reverts that commit and instead fixes it by introducing
a separte rhash_portid variable so that the real portid is only set
after the socket has been successfully hashed.

Fixes: c0bb07df7d98 ("netlink: Reset portid after netlink_insert failure")
Reported-by: Tejun Heo <tj@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1f770c0a09da855a2b51af6d19de97fb955eca85)

BUG= chromium:821607 ,  chromium:849872 
TEST=netlink send/recv repeatedly, on many threads - watch for timeouts;
     similar to this test code:
     https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Change-Id: I065a53d0d8a897ce648e4a6e99b6fc28e3f46625
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091454
Reviewed-by: Grant Grundler <grundler@chromium.org>
(cherry picked from commit 1f735d252607b26f34fd88aae42e2cd6471b7861)
Reviewed-on: https://chromium-review.googlesource.com/1104944

[modify] https://crrev.com/3520cd6a3ad97bb052dfa4a0928baab974117a9d/net/netlink/af_netlink.c
[modify] https://crrev.com/3520cd6a3ad97bb052dfa4a0928baab974117a9d/net/netlink/af_netlink.h

Project Member

Comment 24 by bugdroid1@chromium.org, Jun 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e7fd03b8b5253057ce756b9e854d46e2ae9e771e

commit e7fd03b8b5253057ce756b9e854d46e2ae9e771e
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon Jun 18 18:43:58 2018

BACKPORT: FROMGIT: netlink: Disable insertions/removals during rehash

[ Upstream commit: Not applicable ]

The current rhashtable rehash code is buggy and can't deal with
parallel insertions/removals without corrupting the hash table.

This patch disables it by partially reverting
c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink: eliminate
nl_sk_hash_lock").

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit cf8befcc1a5538b035d478424efcc2d50e66928e
 git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.0.y)

Conflicts:
   net/netlink/af_netlink.c
[rhashtable_remove_fast vs. rhashtable_remove]

BUG= chromium:821607 ,  chromium:849872 
TEST=netlink send/recv repeatedly, on many threads - watch for timeouts;
     similar to this test code:
     https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Change-Id: I6063076587c0a9ede57e319989a426ee6f6ebe61
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091455
Reviewed-by: Grant Grundler <grundler@chromium.org>
(cherry picked from commit 663fa53dcb6f4844a8c5ba29dc6bb65ebebfadf9)
Reviewed-on: https://chromium-review.googlesource.com/1104945

[modify] https://crrev.com/e7fd03b8b5253057ce756b9e854d46e2ae9e771e/net/netlink/af_netlink.c

Project Member

Comment 25 by bugdroid1@chromium.org, Jun 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/76eb986d89481d1fe9c9319eeeb77bab7d4afccb

commit 76eb986d89481d1fe9c9319eeeb77bab7d4afccb
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon Jun 18 18:44:02 2018

UPSTREAM: netlink: Replace rhash_portid with bound

On Mon, Sep 21, 2015 at 02:20:22PM -0400, Tejun Heo wrote:
>
> store_release and load_acquire are different from the usual memory
> barriers and can't be paired this way.  You have to pair store_release
> and load_acquire.  Besides, it isn't a particularly good idea to

OK I've decided to drop the acquire/release helpers as they don't
help us at all and simply pessimises the code by using full memory
barriers (on some architectures) where only a write or read barrier
is needed.

> depend on memory barriers embedded in other data structures like the
> above.  Here, especially, rhashtable_insert() would have write barrier
> *before* the entry is hashed not necessarily *after*, which means that
> in the above case, a socket which appears to have set bound to a
> reader might not visible when the reader tries to look up the socket
> on the hashtable.

But you are right we do need an explicit write barrier here to
ensure that the hashing is visible.

> There's no reason to be overly smart here.  This isn't a crazy hot
> path, write barriers tend to be very cheap, store_release more so.
> Please just do smp_store_release() and note what it's paired with.

It's not about being overly smart.  It's about actually understanding
what's going on with the code.  I've seen too many instances of
people simply sprinkling synchronisation primitives around without
any knowledge of what is happening underneath, which is just a recipe
for creating hard-to-debug races.

> > @@ -1539,7 +1546,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr,
> >  		}
> >  	}
> >
> > -	if (!nlk->portid) {
> > +	if (!nlk->bound) {
>
> I don't think you can skip load_acquire here just because this is the
> second deref of the variable.  That doesn't change anything.  Race
> condition could still happen between the first and second tests and
> skipping the second would lead to the same kind of bug.

The reason this one is OK is because we do not use nlk->portid or
try to get nlk from the hash table before we return to user-space.

However, there is a real bug here that none of these acquire/release
helpers discovered.  The two bound tests here used to be a single
one.  Now that they are separate it is entirely possible for another
thread to come in the middle and bind the socket.  So we need to
repeat the portid check in order to maintain consistency.

> > @@ -1587,7 +1594,7 @@ static int netlink_connect(struct socket *sock, struct sockaddr *addr,
> >  	    !netlink_allowed(sock, NL_CFG_F_NONROOT_SEND))
> >  		return -EPERM;
> >
> > -	if (!nlk->portid)
> > +	if (!nlk->bound)
>
> Don't we need load_acquire here too?  Is this path holding a lock
> which makes that unnecessary?

Ditto.

---8<---
The commit 1f770c0a09da855a2b51af6d19de97fb955eca85 ("netlink:
Fix autobind race condition that leads to zero port ID") created
some new races that can occur due to inconcsistencies between the
two port IDs.

Tejun is right that a barrier is unavoidable.  Therefore I am
reverting to the original patch that used a boolean to indicate
that a user netlink socket has been bound.

Barriers have been added where necessary to ensure that a valid
portid and the hashed socket is visible.

I have also changed netlink_insert to only return EBUSY if the
socket is bound to a portid different to the requested one.  This
combined with only reading nlk->bound once in netlink_bind fixes
a race where two threads that bind the socket at the same time
with different port IDs may both succeed.

Fixes: 1f770c0a09da ("netlink: Fix autobind race condition that leads to zero port ID")
Reported-by: Tejun Heo <tj@kernel.org>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Nacked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit da314c9923fed553a007785a901fd395b7eb6c19)

BUG= chromium:821607 ,  chromium:849872 
TEST=netlink send/recv repeatedly, on many threads - watch for timeouts;
     similar to this test code:
     https://gist.github.com/stevenschlansker/6ad46c5ccb22bc4f3473

Change-Id: I4baab91ca840fcb07a0844ac9f48dcc71fddd509
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1091506
Reviewed-by: Grant Grundler <grundler@chromium.org>
(cherry picked from commit 32a47cf484dc30229a88cb0746076be99799bc3e)
Reviewed-on: https://chromium-review.googlesource.com/1104946

[modify] https://crrev.com/76eb986d89481d1fe9c9319eeeb77bab7d4afccb/net/netlink/af_netlink.c
[modify] https://crrev.com/76eb986d89481d1fe9c9319eeeb77bab7d4afccb/net/netlink/af_netlink.h

Labels: -Merge-Approved-68 Merge-Merged-68
Status: Fixed (was: Assigned)
Merged to M-68.

Sign in to add a comment