New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 774279 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner: ----
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

kernel 4.4: boot-time WARN() - "nf_unregister_net_hook: hook not found!"

Project Member Reported by diand...@chromium.org, Oct 12 2017

Issue description

Forked from b/35586952

Basically at boot time, sometimes you see:

[    2.446125] nf_unregister_net_hook: hook not found! 
[    2.446156] ------------[ cut here ]------------ 
[    2.446161] WARNING: at /mnt/host/source/src/third_party/kernel/v4.4/net/netfilter/core.c:143 
[    2.446165] Modules linked in: ip6table_filter asix usbnet mii joydev 
[    2.446184] 
[    2.446190] CPU: 5 PID: 119 Comm: kworker/u12:2 Not tainted 4.4.35 #752 
[    2.446194] Hardware name: Google Kevin (DT) 
[    2.446210] Workqueue: netns cleanup_net 
[    2.446216] task: ffffffc0ee0a1a00 ti: ffffffc0ee0b0000 task.ti: ffffffc0ee0b0000 
[    2.446224] PC is at nf_unregister_net_hook+0xdc/0x128 
[    2.446229] LR is at nf_unregister_net_hook+0xdc/0x128 
[    2.446233] pc : [<ffffffc000839af4>] lr : [<ffffffc000839af4>] pstate: 60000145 
[    2.446236] sp : ffffffc0ee0b3c20 
[    2.446240] x29: ffffffc0ee0b3c20 x28: ffffffc00112ca60 
[    2.446247] x27: 00000000ffffffff x26: ffffffc00112c990 
[    2.446254] x25: ffffffc0ee0b3d28 x24: ffffffc00112c900 
[    2.446261] x23: ffffffc0eac48000 x22: ffffffc001131000 
[    2.446268] x21: ffffffc0eacda700 x20: ffffffc0eac48e88 
[    2.446274] x19: ffffffc0eac48e88 x18: 0000000000000000 
[    2.446280] x17: 0000000000000000 x16: ffffffc0002ad618 
[    2.446286] x15: 0000000000000000 x14: 00000000b4c53f81 
[    2.446293] x13: 00000000ffa33544 x12: 00000000b4c5ff68 
[    2.446299] x11: 00000000ebcceeb0 x10: 00000000000008e0 
[    2.446306] x9 : ffffffc0ee0b39d0 x8 : ffffffc00043c2d0 
[    2.446313] x7 : ffffffc0010ce1c0 x6 : 0000000000000037 
[    2.446319] x5 : 0000000000000000 x4 : 0000000000000000 
[    2.446325] x3 : 0000000000000000 x2 : cb88537fdc8ba623 
[    2.446331] x1 : cb88537fdc8ba623 x0 : 0000000000000027 
[    2.446338] 
... 
[    2.448889] ---[ end trace 608506de33611ac7 ]--- 
[    2.448893] Call trace: 
[    2.448900] [<ffffffc000839af4>] nf_unregister_net_hook+0xdc/0x128 
[    2.448905] [<ffffffc000839ecc>] netfilter_net_exit+0x40/0x6c 
[    2.448910] [<ffffffc0007f9f0c>] ops_exit_list+0x5c/0x74 
[    2.448914] [<ffffffc0007fb51c>] cleanup_net+0x198/0x274 
[    2.448921] [<ffffffc000239264>] process_one_work+0x240/0x424 
[    2.448925] [<ffffffc000239d00>] worker_thread+0x2fc/0x424 
[    2.448929] [<ffffffc00023f070>] kthread+0x10c/0x114 
[    2.448935] [<ffffffc000203dd0>] ret_from_fork+0x10/0x40 
[    2.449419] nf_unregister_net_hook: hook not found! 
[    2.449446] ------------[ cut here ]------------ 
[    2.449450] WARNING: at /mnt/host/source/src/third_party/kernel/v4.4/net/netfilter/core.c:143 
... [more of this WARNING] ...

---

I added some debugging based on prior investigation that Brian did.  It looks as if this has to do with "net" devices going away as we're registering, somehow.

I added these printouts in:

        pr_info("DOUG: %s: hook fn: %pF (reg=%#x) added to net: %#x\n", __func__, reg->hook, reg, net);
...
        pr_info("DOUG: %s: hook fn: %pF (reg=%#x) deleting from net: %#x\n", __func__, reg->hook, reg, net);

And now I see:

...
[   12.164760] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6800) added to net: 0x13e1480
[   12.179058] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6800) added to net: 0xd9a38000
[   12.189286] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6800) added to net: 0xd9a3e000
[   12.208711] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6800) added to net: 0xd9a3a000
[   12.218973] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6838) added to net: 0x13e1480
[   12.232299] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6838) added to net: 0xd9a3e000
[   12.242409] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6838) added to net: 0xd9a3a000
[   12.255228] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6870) added to net: 0x13e1480
[   12.265232] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6870) added to net: 0xd9a3e000
[   12.275925] DOUG: nf_register_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6870) added to net: 0xd9a3a000
...
[   12.684040] DOUG: nf_unregister_net_hook: hook fn: selinux_ipv4_postroute+0x0/0x34 (reg=0x137cd50) deleting from net: 0xd9a38000
[   12.698319] DOUG: nf_unregister_net_hook: hook fn: selinux_ipv4_forward+0x0/0x34 (reg=0x137cd88) deleting from net: 0xd9a38000
[   12.714693] DOUG: nf_unregister_net_hook: hook fn: selinux_ipv4_output+0x0/0x18 (reg=0x137cdc0) deleting from net: 0xd9a38000
[   12.726314] DOUG: nf_unregister_net_hook: hook fn: selinux_ipv6_postroute+0x0/0x34 (reg=0x137cdf8) deleting from net: 0xd9a38000
[   12.741187] DOUG: nf_unregister_net_hook: hook fn: selinux_ipv6_forward+0x0/0x34 (reg=0x137ce30) deleting from net: 0xd9a38000
[   12.762596] DOUG: nf_unregister_net_hook: hook fn: ipv4_conntrack_in+0x0/0x38 (reg=0x1334f88) deleting from net: 0xd9a38000
[   12.775639] DOUG: nf_unregister_net_hook: hook fn: ipv4_conntrack_local+0x0/0x64 (reg=0x1334fc0) deleting from net: 0xd9a38000
[   12.788638] DOUG: nf_unregister_net_hook: hook fn: ipv4_helper+0x0/0xe4 (reg=0x1334ff8) deleting from net: 0xd9a38000
[   12.800807] DOUG: nf_unregister_net_hook: hook fn: ipv4_confirm+0x0/0x11c (reg=0x1335030) deleting from net: 0xd9a38000
[   12.811766] DOUG: nf_unregister_net_hook: hook fn: ipv4_helper+0x0/0xe4 (reg=0x1335068) deleting from net: 0xd9a38000
[   12.825132] DOUG: nf_unregister_net_hook: hook fn: ipv4_confirm+0x0/0x11c (reg=0x13350a0) deleting from net: 0xd9a38000
[   12.836018] DOUG: nf_unregister_net_hook: hook fn: ipv4_conntrack_defrag+0x0/0x114 (reg=0x13eddb0) deleting from net: 0xd9a38000
[   12.848159] DOUG: nf_unregister_net_hook: hook fn: ipv4_conntrack_defrag+0x0/0x114 (reg=0x13edde8) deleting from net: 0xd9a38000
[   12.860626] DOUG: nf_unregister_net_hook: hook fn: iptable_filter_hook+0x0/0x70 (reg=0xe8779900) deleting from net: 0xd9a38000
[   12.872119] DOUG: nf_unregister_net_hook: hook fn: iptable_filter_hook+0x0/0x70 (reg=0xe8779938) deleting from net: 0xd9a38000
[   12.883692] DOUG: nf_unregister_net_hook: hook fn: iptable_filter_hook+0x0/0x70 (reg=0xe8779970) deleting from net: 0xd9a38000
[   12.895287] DOUG: nf_unregister_net_hook: hook fn: iptable_mangle_hook+0x0/0x120 (reg=0xe8659500) deleting from net: 0xd9a38000
[   12.906848] DOUG: nf_unregister_net_hook: hook fn: iptable_mangle_hook+0x0/0x120 (reg=0xe8659538) deleting from net: 0xd9a38000
[   12.918578] DOUG: nf_unregister_net_hook: hook fn: iptable_mangle_hook+0x0/0x120 (reg=0xe8659570) deleting from net: 0xd9a38000
[   12.930168] DOUG: nf_unregister_net_hook: hook fn: iptable_mangle_hook+0x0/0x120 (reg=0xe86595a8) deleting from net: 0xd9a38000
[   12.941813] DOUG: nf_unregister_net_hook: hook fn: iptable_mangle_hook+0x0/0x120 (reg=0xe86595e0) deleting from net: 0xd9a38000
[   12.953471] DOUG: nf_unregister_net_hook: hook fn: iptable_raw_hook+0x0/0x70 (reg=0xe88fd200) deleting from net: 0xd9a38000
[   12.964778] DOUG: nf_unregister_net_hook: hook fn: iptable_raw_hook+0x0/0x70 (reg=0xe88fd238) deleting from net: 0xd9a38000
[   12.976052] DOUG: nf_unregister_net_hook: hook fn: iptable_security_hook+0x0/0x70 (reg=0xe877a300) deleting from net: 0xd9a38000
[   12.987686] DOUG: nf_unregister_net_hook: hook fn: iptable_security_hook+0x0/0x70 (reg=0xe877a338) deleting from net: 0xd9a38000
[   12.999373] DOUG: nf_unregister_net_hook: hook fn: iptable_security_hook+0x0/0x70 (reg=0xe877a370) deleting from net: 0xd9a38000
[   13.011135] DOUG: nf_unregister_net_hook: hook fn: ip6table_raw_hook+0x0/0x38 (reg=0xe8955800) deleting from net: 0xd9a38000
[   13.022528] DOUG: nf_unregister_net_hook: hook fn: ip6table_raw_hook+0x0/0x38 (reg=0xe8955838) deleting from net: 0xd9a38000
[   13.033873] DOUG: nf_unregister_net_hook: hook fn: ipv6_conntrack_in+0x0/0x38 (reg=0x1336508) deleting from net: 0xd9a38000
[   13.045098] DOUG: nf_unregister_net_hook: hook fn: ipv6_conntrack_local+0x0/0x60 (reg=0x1336540) deleting from net: 0xd9a38000
[   13.056607] DOUG: nf_unregister_net_hook: hook fn: ipv6_helper+0x0/0x130 (reg=0x1336578) deleting from net: 0xd9a38000
[   13.068225] DOUG: nf_unregister_net_hook: hook fn: ipv6_confirm+0x0/0x184 (reg=0x13365b0) deleting from net: 0xd9a38000
[   13.079151] DOUG: nf_unregister_net_hook: hook fn: ipv6_helper+0x0/0x130 (reg=0x13365e8) deleting from net: 0xd9a38000
[   13.089924] DOUG: nf_unregister_net_hook: hook fn: ipv6_confirm+0x0/0x184 (reg=0x1336620) deleting from net: 0xd9a38000
[   13.100856] DOUG: nf_unregister_net_hook: hook fn: ipv6_defrag+0x0/0x14c (reg=0x13f2f68) deleting from net: 0xd9a38000
[   13.112090] DOUG: nf_unregister_net_hook: hook fn: ipv6_defrag+0x0/0x14c (reg=0x13f2fa0) deleting from net: 0xd9a38000
[   13.122948] DOUG: nf_unregister_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6800) deleting from net: 0xd9a38000
[   13.133550] DOUG: nf_unregister_net_hook: hook fn: 0xffffffbffc0f9000 (reg=0xd85f6838) deleting from net: 0xd9a38000
<error>

---

As you can see, when we registered ip6table_filter (AKA 0xffffffbffc0f9000) we walked through one "reg" at a time and added to each "net".  It appears that as we were doing that one of the "net"s went away.

Presumably we did something like this:
* Called nf_register_hooks() for ip6table_filter and started walking through it's 3 "reg" entries.

* Called nf_register_hook() for ip6table_filter entry 0.
* Grabbed rtnl_lock() so nobody could add/remove "net" entries.
* Registered for each "net" entry (all 4)
* Add to the nf_hook_list
* Released the lock.

* Now a "net" starts getting destroyed and gets removed from the list so for_each_net() can't find it anymore.

* Called nf_register_hook() for ip6table_filter entry 1.
* Grabbed rtnl_lock() so nobody could add/remove "net" entries.
* Registered for each "net" entry (all 3)
* Add to the nf_hook_list
* Released the lock.

* Called nf_register_hook() for ip6table_filter entry 2.
* Grabbed rtnl_lock() so nobody could add/remove "net" entries.
* Registered for each "net" entry (all 3)
* Add to the nf_hook_list
* Released the lock.

* Now we start really destroying the "net".  We iterating through all hooks in "nf_hook_list" and try to delete the net.

...and (surprise, surprise) the net we're deleting doesn't have all the hooks.


Note that the above analysis works because the code to handle deleting the happens in cleanup_net() at:

        /* Run all of the network namespace exit methods */
        list_for_each_entry_reverse(ops, &pernet_list, list)
                ops_exit_list(ops, &net_exit_list);

That's _after_ we've dropped the rtnl_lock.

---

It seems like perhaps upstream knows this already, as per this commit:

commit cf56c2f892a8a1870a8358114ad896772da7543a
Author:     Florian Westphal <fw@strlen.de>
AuthorDate: Thu Jul 6 23:17:44 2017 +0200
Commit:     Pablo Neira Ayuso <pablo@netfilter.org>
CommitDate: Mon Jul 17 17:01:10 2017 +0200

    netfilter: remove old pre-netns era hook api
    
    no more users in the tree, remove this.
    
    The old api is racy wrt. module removal, all users have been converted
    to the netns-aware api.
    
    The old api pretended we still have global hooks but that has not been
    true for a long time.
    
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

---

There are _a lot_ of netfilter changes between 4.4 and that commit upstream.  

---

Presumably our options are:

A) Ignore this and just let it keep spewing

B) Downgrade to just a "dev_warn"

C) Pick a whole lot of patches.


I'm inclined to do B for now.
 
Cc: jeffy.c...@rock-chips.com
* https://chromium-review.googlesource.com/#/c/chromiumos/third_party/kernel/+/716675
  CHROMIUM: netfilter: Downgrade netfilter warning that we're ignoring        

Status: Fixed (was: Untriaged)
Project Member

Comment 3 by bugdroid1@chromium.org, Oct 15 2017

Labels: merge-merged-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/c4bb556ad70e1946c9a8fdca9dcc13c9f2b6fe08

commit c4bb556ad70e1946c9a8fdca9dcc13c9f2b6fe08
Author: Douglas Anderson <dianders@chromium.org>
Date: Fri Oct 13 21:45:10 2017

CHROMIUM: netfilter: Downgrade netfilter warning that we're ignoring

Probably nobody is going to fix the warning talked about in
chromium:774279 and (as far as we know) it's not causing any serious
problems.  Let's downgrade it to just a normal warning printout
instead of a splat of a stackdump.  This way we don't get in the habit
of ignoring stackdumps that show up at bootup.

From the best I can tell, making a better fix would require us to pick
back a whole lot of patches and change APIs for all the netfilter
components.  Someone could do that if they were interested, but it
would take a bunch of careful work.

BUG= chromium:774279 
TEST=No more splat at boot

Change-Id: I7fa6357ff2c2866ce748bf3d08dbed6e5b3f1342
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/716675
Reviewed-by: Kevin Cernekee <cernekee@chromium.org>

[modify] https://crrev.com/c4bb556ad70e1946c9a8fdca9dcc13c9f2b6fe08/net/netfilter/core.c

Comment 4 by dchan@chromium.org, Jan 22 2018

Status: archived (was: Fixed)

Comment 5 by dchan@chromium.org, Jan 23 2018

Status: Fixed (was: Archived)

Sign in to add a comment