New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 736668 link

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

chromeos-3.18 built with USE="lockdebug" triggers rcu warnings

Project Member Reported by sonnyrao@chromium.org, Jun 25 2017

Issue description


If I build 3.18 for caroline with USE=lockdebug then I consistently see this warning during boot:


[    9.256422] ===============================
[    9.256428] [ INFO: suspicious RCU usage. ]
[    9.256435] 3.18.0 #57 Not tainted
[    9.256440] -------------------------------
[    9.256446] /mnt/host/source/src/third_party/kernel/v3.18/net/ipv4/tcp_ipv4.c:905 suspicious rcu_dereference_check() usage!
[    9.256457] 
               other info that might help us debug this:

[    9.256467] 
               rcu_scheduler_active = 1, debug_locks = 0
[    9.256475] no locks held by Chrome_IOThread/1875.
[    9.256480] 
               stack backtrace:
[    9.256489] CPU: 2 PID: 1875 Comm: Chrome_IOThread Not tainted 3.18.0 #57
[    9.256497] Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.263.0 01/26/2017
[    9.256506]  0000000000000000 00000000802e81a7 ffff8801732ffa08 ffffffff87aeb1a8
[    9.256521]  0000000000000000 ffff880178335870 ffff8801732ffa38 ffffffff874a4503
[    9.256534]  0000000000000000 ffff8801645489c0 0000000000000002 ffff8801645489c0
[    9.256547] Call Trace:
[    9.256558]  [<ffffffff87aeb1a8>] dump_stack+0x4e/0x71
[    9.256570]  [<ffffffff874a4503>] lockdep_rcu_suspicious+0x107/0x110
[    9.256581]  [<ffffffff87a5eb02>] tcp_md5_do_lookup+0x66/0xde
[    9.256590]  [<ffffffff87a5eb8d>] tcp_v4_md5_lookup+0x13/0x15
[    9.256599]  [<ffffffff87a596f9>] tcp_established_options+0x2d/0xad
[    9.256608]  [<ffffffff87a5aeec>] tcp_transmit_skb+0x1ba/0x7b7
[    9.256618]  [<ffffffff8755c13a>] ? ksize+0x18/0x79
[    9.256627]  [<ffffffff87a5ca04>] tcp_send_ack+0xce/0xd3
[    9.256637]  [<ffffffff87a52a76>] __tcp_ack_snd_check+0x49/0x7e
[    9.256646]  [<ffffffff87a58745>] tcp_rcv_established+0x2f9/0x4c5
[    9.256656]  [<ffffffff87a614d7>] tcp_v4_do_rcv+0xb4/0x2a7
[    9.256666]  [<ffffffff879de040>] sk_backlog_rcv+0x1a/0x2f
[    9.256674]  [<ffffffff879de9b2>] release_sock+0xba/0x157
[    9.256684]  [<ffffffff87a5064f>] tcp_recvmsg+0x7d0/0xa34
[    9.256693]  [<ffffffff87a76136>] inet_recvmsg+0x6e/0x9b
[    9.256702]  [<ffffffff879d8123>] __sock_recvmsg_nosec+0x29/0x2b
[    9.256711]  [<ffffffff879d8dd7>] sock_aio_read.part.3+0xd3/0xf9
[    9.256721]  [<ffffffff879d8e20>] sock_aio_read+0x23/0x25
[    9.256730]  [<ffffffff875698ce>] do_sync_read+0x67/0x9e

It looks like we backported these:
1e1d04e678cf net: introduce lockdep_is_held and update various places to use it
1b8e6a01e19f tcp: md5: fix lockdep annotation

but didn't pick this up:
61881cfb5ad8 sock: fix lockdep annotation in release_sock

and so we are seeing a spruious warning

I'll put a CL up to backport the annotation fix

 
it looks like that same situation where we picked up those two but not the release_sock fix applies to 3.14 as well so I will double check if I can see it on Samus as well
3.14 does need 
61881cfb5ad8 sock: fix lockdep annotation in release_sock 
and
5e91f6ce4c58 sock: relax WARN_ON() in sock_owned_by_user()

so I've put up CLs for those

however it seems to hit several possibly legitimate problems with USE="lockdebug" 
so I'll open a separate bug for those
Summary: chromeos-3.18 built with USE="lockdebug" triggers rcu warnings (was: chromeos-3.18 built with USE="lockdebug" triggers rcu warning on tcp_md5_do_lookup())
I see another splat on 3.18 after boot:

[ 1337.198366] ===============================
[ 1337.198371] [ INFO: suspicious RCU usage. ]
[ 1337.198378] 3.18.0 #59 Not tainted
[ 1337.198383] -------------------------------
[ 1337.198390] /mnt/host/source/src/third_party/kernel/v3.18/net/ipv6/tcp_ipv6.c:507 suspicious rcu_dereference_check() usage!
[ 1337.198401] 
               other info that might help us debug this:

[ 1337.198411] 
               rcu_scheduler_active = 1, debug_locks = 0
[ 1337.198419] 1 lock held by sshd/4046:
[ 1337.198424]  #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff82a4d5d2>] inet_csk_accept+0x35/0x27d
[ 1337.198447] 
               stack backtrace:
[ 1337.198456] CPU: 2 PID: 4046 Comm: sshd Not tainted 3.18.0 #59
[ 1337.198463] Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.263.0 01/26/2017
[ 1337.198472]  0000000000000000 0000000047910521 ffff88006415fb28 ffffffff82aeb1c8
[ 1337.198486]  0000000000000000 ffff880064033510 ffff88006415fb58 ffffffff824a4503
[ 1337.198499]  0000000000000000 ffff88005bcd8000 ffff88002dc76f00 ffff88006415fc00
[ 1337.198511] Call Trace:
[ 1337.198521]  [<ffffffff82aeb1c8>] dump_stack+0x4e/0x71
[ 1337.198532]  [<ffffffff824a4503>] lockdep_rcu_suspicious+0x107/0x110
[ 1337.198541]  [<ffffffff82acca9c>] tcp_v6_send_synack+0x10c/0x13a
[ 1337.198550]  [<ffffffff82a55598>] tcp_conn_request+0x581/0x619
[ 1337.198559]  [<ffffffff82acd083>] tcp_v6_conn_request+0x55/0x73
[ 1337.198567]  [<ffffffff82a58ba5>] tcp_rcv_state_process+0x7b/0xa69
[ 1337.198577]  [<ffffffff82ade55f>] ? __inet6_lookup_established+0x18d/0x1a3
[ 1337.198586]  [<ffffffff82ace533>] tcp_v6_do_rcv+0x2b8/0x414
[ 1337.198596]  [<ffffffff829de040>] sk_backlog_rcv+0x1a/0x2f
[ 1337.198604]  [<ffffffff829de99d>] release_sock+0xa5/0x160
[ 1337.198613]  [<ffffffff82a4d7c2>] inet_csk_accept+0x225/0x27d
[ 1337.198622]  [<ffffffff82a76775>] inet_accept+0x3c/0xf5
[ 1337.198630]  [<ffffffff829d9add>] SYSC_accept4+0x122/0x1e5
[ 1337.198640]  [<ffffffff82af1e41>] ? sysret_check+0x1b/0x56
[ 1337.198650]  [<ffffffff824a55b3>] ? trace_hardirqs_on_caller+0x183/0x19f
[ 1337.198660]  [<ffffffff829dc06d>] SyS_accept+0x10/0x12
[ 1337.198670]  [<ffffffff82af1e1c>] system_call_fastpath+0x1c/0x21

should be fixed by: 
3e4006f0b86a ipv6: tcp: add rcu locking in tcp_v6_send_synack()

Cc: abhishekbh@google.com
Project Member

Comment 5 by bugdroid1@chromium.org, Jun 27 2017

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/9a0dc29ddc7e3435b980277bd08d3b9eb1f98c11

commit 9a0dc29ddc7e3435b980277bd08d3b9eb1f98c11
Author: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue Jun 27 22:51:56 2017

UPSTREAM: sock: fix lockdep annotation in release_sock

During release_sock we use callbacks to finish the processing
of outstanding skbs on the socket. We actually are still locked,
sk_locked.owned == 1, but we already told lockdep that the mutex
is released. This could lead to false positives in lockdep for
lockdep_sock_is_held (we don't hold the slock spinlock during processing
the outstanding skbs).

I took over this patch from Eric Dumazet and tested it.

BUG=chromium:736668
TEST=build kernel with USE="lockdebug" and boot samus

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 61881cfb5ad80c1d0a46ca6d08b7e271892b2ff6)
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: Ifdee959ccb945c2eab999a36a88b2b28f77f7c7b
Reviewed-on: https://chromium-review.googlesource.com/547566
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>

[modify] https://crrev.com/9a0dc29ddc7e3435b980277bd08d3b9eb1f98c11/net/core/sock.c
[modify] https://crrev.com/9a0dc29ddc7e3435b980277bd08d3b9eb1f98c11/include/net/sock.h

Project Member

Comment 6 by bugdroid1@chromium.org, Jun 27 2017

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/9a0dc29ddc7e3435b980277bd08d3b9eb1f98c11

commit 9a0dc29ddc7e3435b980277bd08d3b9eb1f98c11
Author: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue Jun 27 22:51:56 2017

UPSTREAM: sock: fix lockdep annotation in release_sock

During release_sock we use callbacks to finish the processing
of outstanding skbs on the socket. We actually are still locked,
sk_locked.owned == 1, but we already told lockdep that the mutex
is released. This could lead to false positives in lockdep for
lockdep_sock_is_held (we don't hold the slock spinlock during processing
the outstanding skbs).

I took over this patch from Eric Dumazet and tested it.

BUG=chromium:736668
TEST=build kernel with USE="lockdebug" and boot samus

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 61881cfb5ad80c1d0a46ca6d08b7e271892b2ff6)
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: Ifdee959ccb945c2eab999a36a88b2b28f77f7c7b
Reviewed-on: https://chromium-review.googlesource.com/547566
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>

[modify] https://crrev.com/9a0dc29ddc7e3435b980277bd08d3b9eb1f98c11/net/core/sock.c
[modify] https://crrev.com/9a0dc29ddc7e3435b980277bd08d3b9eb1f98c11/include/net/sock.h

Project Member

Comment 7 by bugdroid1@chromium.org, Jun 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/53d228bcd1a58379f3ff839ea5be9d725aa9ecc9

commit 53d228bcd1a58379f3ff839ea5be9d725aa9ecc9
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Jun 27 22:51:57 2017

UPSTREAM: sock: relax WARN_ON() in sock_owned_by_user()

Valdis reported tons of stack dumps caused by WARN_ON() in
sock_owned_by_user()

This test needs to be relaxed if/when lockdep disables itself.

Note that other lockdep_sock_is_held() callers are all from
rcu_dereference_protected() sections which already are disabled
if/when lockdep has been disabled.

BUG=chromium:736668
TEST=build kernel with USE="lockdebug" and boot samus

Fixes: fafc4e1ea1a4 ("sock: tigthen lockdep checks for sock_owned_by_user")
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5e91f6ce4c584d231763437a3ea3aded8e672363)
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: I6d2bdc2dbf96d40ba7d0699b7d1ed5559bcbbc80
Reviewed-on: https://chromium-review.googlesource.com/547567
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>

[modify] https://crrev.com/53d228bcd1a58379f3ff839ea5be9d725aa9ecc9/include/net/sock.h

Project Member

Comment 8 by bugdroid1@chromium.org, Jun 27 2017

Labels: merge-merged-chromeos-3.18
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/5bb07eaa90f61982225ac03db3e3c926bb41f505

commit 5bb07eaa90f61982225ac03db3e3c926bb41f505
Author: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue Jun 27 22:52:08 2017

UPSTREAM: sock: fix lockdep annotation in release_sock

During release_sock we use callbacks to finish the processing
of outstanding skbs on the socket. We actually are still locked,
sk_locked.owned == 1, but we already told lockdep that the mutex
is released. This could lead to false positives in lockdep for
lockdep_sock_is_held (we don't hold the slock spinlock during processing
the outstanding skbs).

I took over this patch from Eric Dumazet and tested it.

BUG=chromium:736668
TEST=build kernel with USE="lockdebug" and boot caroline

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 61881cfb5ad80c1d0a46ca6d08b7e271892b2ff6)
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: Ifdee959ccb945c2eab999a36a88b2b28f77f7c7b
Reviewed-on: https://chromium-review.googlesource.com/547664
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Kevin Cernekee <cernekee@chromium.org>

[modify] https://crrev.com/5bb07eaa90f61982225ac03db3e3c926bb41f505/net/core/sock.c
[modify] https://crrev.com/5bb07eaa90f61982225ac03db3e3c926bb41f505/include/net/sock.h

Project Member

Comment 9 by bugdroid1@chromium.org, Jun 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e8b9786ad2367583519ab34288670be9758f5742

commit e8b9786ad2367583519ab34288670be9758f5742
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Jun 27 22:52:11 2017

BACKPORT: ipv6: tcp: add rcu locking in tcp_v6_send_synack()

When first SYNACK is sent, we already hold rcu_read_lock(), but this
is not true if a SYNACK is retransmitted, as a timer (soft) interrupt
does not hold rcu_read_lock()

BUG=chromium:736668
TEST=build/boot caroline with USE="lockdebug" and no more splat

Fixes: 45f6fad84cc30 ("ipv6: add complete rcu protection around np->opt")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3e4006f0b86a5ae5eb0e8215f9a9e1db24506977)
 Conflicts:
	net/ipv6/tcp_ipv6.c
[SR: trivial conflict from missing
     dc6ef6be5215 tcp: do not set queue_mapping on SYNACK]
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

Change-Id: Ic50a29f0744c13d74be0a8e9d5eee2c9c99307ed
Reviewed-on: https://chromium-review.googlesource.com/549112
Commit-Ready: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>

[modify] https://crrev.com/e8b9786ad2367583519ab34288670be9758f5742/net/ipv6/tcp_ipv6.c

This fixes quite a few splats but I'm seeing lockdep get disabled by something in i915:

[    0.519077] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    0.519079] [drm] Driver supports precise vblank timestamp query.
[    0.527187] INFO: trying to register non-static key.
[    0.527196] the code is fine but needs lockdep annotation.
[    0.527197] turning off the locking correctness validator.
[    0.527201] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.18.0 #61
[    0.527203] Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.263.0 01/26/2017
[    0.527205]  0000000000000000 00000000f8ccad55 ffff88017a12b518 ffffffff88aebc58
[    0.527210]  0000000000000000 0000000000000000 ffff88017a12b598 ffffffff884a37c1
[    0.527214]  0000000000000005 00000000884a51dc ffff88017a12b618 ffffffff884a5f19
[    0.527217] Call Trace:
[    0.527224]  [<ffffffff88aebc58>] dump_stack+0x4e/0x71
[    0.527249]  [<ffffffff884a37c1>] register_lock_class+0x185/0x3ac
[    0.527253]  [<ffffffff884a5f19>] ? __lock_acquire+0x67d/0x1003
[    0.527257]  [<ffffffff884a59f4>] __lock_acquire+0x158/0x1003
[    0.527260]  [<ffffffff884a6c75>] lock_acquire+0x10e/0x195
[    0.527264]  [<ffffffff88af0e33>] ? rt_mutex_slowlock+0x5d/0x1ef
[    0.527268]  [<ffffffff88af1bc8>] _raw_spin_lock+0x34/0x44
[    0.527271]  [<ffffffff88af0e33>] ? rt_mutex_slowlock+0x5d/0x1ef
[    0.527274]  [<ffffffff88af0e33>] rt_mutex_slowlock+0x5d/0x1ef
[    0.527278]  [<ffffffff88af0ff7>] rt_mutex_lock+0x32/0x36
[    0.527282]  [<ffffffff8890c6e6>] i2c_lock_adapter+0x32/0x36
[    0.527285]  [<ffffffff8890e213>] i2c_transfer+0x55/0x7e
[    0.527289]  [<ffffffff8877ca5a>] drm_do_probe_ddc_edid+0xc9/0x13b
[    0.527292]  [<ffffffff8877caf8>] drm_probe_ddc+0x2c/0x47
[    0.527295]  [<ffffffff8877d681>] drm_get_edid+0x24/0x357
[    0.527300]  [<ffffffff88811fa4>] intel_dp_init_connector+0x8bc/0xc28
[    0.527303]  [<ffffffff888099bd>] intel_ddi_init+0x1e8/0x267
[    0.527307]  [<ffffffff887f4e8b>] intel_modeset_init+0x891/0x115d
[    0.527310]  [<ffffffff88825cc3>] i915_driver_load+0xf50/0x10a6
[    0.527314]  [<ffffffff884a5419>] ? mark_held_locks+0x5d/0x74
[    0.527318]  [<ffffffff88af1e44>] ? _raw_spin_unlock_irqrestore+0x3f/0x57
[    0.527321]  [<ffffffff887701c0>] drm_dev_register+0x87/0xc9
[    0.527324]  [<ffffffff8877220a>] drm_get_pci_dev+0x102/0x1bc
[    0.527328]  [<ffffffff88789a43>] i915_pci_probe+0x4f/0x51
[    0.527332]  [<ffffffff886d57c7>] pci_device_probe+0x6c/0xcc
[    0.527335]  [<ffffffff8882fa79>] driver_probe_device+0x109/0x2b3
[    0.527338]  [<ffffffff8882fced>] __driver_attach+0x62/0x85
[    0.527341]  [<ffffffff8882fc8b>] ? __device_attach_driver+0x68/0x68
[    0.527343]  [<ffffffff8882eb0a>] bus_for_each_dev+0x8f/0xb2
[    0.527360]  [<ffffffff8882f45f>] driver_attach+0x1e/0x20
[    0.527362]  [<ffffffff8882f177>] bus_add_driver+0xee/0x1e6
[    0.527365]  [<ffffffff888305eb>] driver_register+0x8f/0xcc
[    0.527367]  [<ffffffff886d4ac2>] __pci_register_driver+0x61/0x65
Components: OS>Kernel
Status: Untriaged (was: Unconfirmed)
Status: Assigned (was: Untriaged)

Sign in to add a comment