binder: "BUG: sleeping function called from invalid context"
Issue description

Observed once.

[58638.944378] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.12/mm/memory.c:1320
[58638.957699] in_atomic(): 1, irqs_disabled(): 0, pid: 51, name: kswapd0
[58638.965016] CPU: 1 PID: 51 Comm: kswapd0 Not tainted 4.12.13 #8
[58638.971646] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.41.0 07/17/2017
[58638.979819] Call Trace:
[58638.982562] dump_stack+0x4d/0x63
[58638.986272] ___might_sleep+0x192/0x1a9
[58638.990565] unmap_page_range+0x6db/0x733
[58638.995065] unmap_single_vma+0xad/0xb9
[58638.999364] zap_page_range+0x162/0x185
[58639.003662] ? do_raw_spin_unlock+0xc7/0xd1
[58639.008343] binder_alloc_free_page+0x19c/0x3c3
[58639.013414] __list_lru_walk_one.isra.11+0xb3/0x198
[58639.018872] ? binder_shrink_count+0x19/0x19
[58639.023650] list_lru_walk_node+0xe/0x10
[58639.028038] binder_shrink_scan+0x4c/0x65
[58639.032523] shrink_slab.part.58+0x2b8/0x42b
[58639.037300] shrink_node+0xdd/0x2cf
[58639.041203] balance_pgdat+0x19e/0x2a5
[58639.045398] kswapd+0x450/0x5b7
[58639.048915] ? wake_up_atomic_t+0x2c/0x2c
[58639.053402] ? balance_pgdat+0x2a5/0x2a5
[58639.057790] kthread+0x221/0x231
[58639.061401] ? kthread_flush_work+0x147/0x147
[58639.066276] ret_from_fork+0x22/0x30
,
Sep 20 2017
Also:

[ 3546.342638] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.12/kernel/fork.c:927
[ 3546.355705] in_atomic(): 1, irqs_disabled(): 0, pid: 50, name: kswapd0
[ 3546.363048] CPU: 2 PID: 50 Comm: kswapd0 Not tainted 4.12.13 #9
[ 3546.369670] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.41.0 07/17/2017
[ 3546.377842] Call Trace:
[ 3546.380576] dump_stack+0x4d/0x63
[ 3546.384296] ___might_sleep+0x192/0x1a9
[ 3546.388591] __might_sleep+0xe1/0xed
[ 3546.392801] ? up_write+0x16/0x35
[ 3546.396508] mmput+0x20/0x33
[ 3546.399721] binder_alloc_free_page+0x256/0x3de
[ 3546.404801] __list_lru_walk_one.isra.11+0xb3/0x198
[ 3546.410254] ? binder_shrink_count+0x19/0x19
[ 3546.415029] list_lru_walk_node+0xe/0x10
[ 3546.419414] binder_shrink_scan+0x4c/0x65
[ 3546.423896] shrink_slab.part.58+0x2b8/0x42b
[ 3546.428670] shrink_node+0xdd/0x2cf
[ 3546.432571] balance_pgdat+0x19e/0x2a5
[ 3546.436753] kswapd+0x450/0x5b7
[ 3546.440257] ? wake_up_atomic_t+0x2c/0x2c
[ 3546.444729] ? balance_pgdat+0x2a5/0x2a5
[ 3546.449114] kthread+0x221/0x231
[ 3546.452714] ? kthread_flush_work+0x147/0x147
[ 3546.457587] ret_from_fork+0x22/0x30

mmput must not be called in atomic context either.
,
Sep 20 2017
There is a fix for this in progress: https://android-review.googlesource.com/#/c/kernel/common/+/478862/
,
Oct 16 2017
,
Oct 16 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/53936ea322c4104fef59c9fad0ba4ba386a93700

commit 53936ea322c4104fef59c9fad0ba4ba386a93700
Author: Sherry Yang <sherryy@android.com>
Date: Mon Oct 16 23:11:41 2017

BACKPORT: android: binder: drop lru lock in isolate callback

Drop the global lru lock in isolate callback before calling zap_page_range which calls cond_resched, and re-acquire the global lru lock before returning. Also change return code to LRU_REMOVED_RETRY.

Use mmput_async when fail to acquire mmap sem in an atomic context.

Fix "BUG: sleeping function called from invalid context" errors when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.

Also restore mmput_async, which was initially introduced in ec8d7c14e ("mm, oom_reaper: do not mmput synchronously from the oom reaper context"), and was removed in 212925802 ("mm: oom: let oom_reap_task and exit_mmap run concurrently").

BUG= chromium:767096
TEST=Build and run

Change-Id: I3648ba112b32c348dbbc3aebe99d24bf70386395
Link: http://lkml.kernel.org/r/20170914182231.90908-1-sherryy@android.com
Fixes: f2517eb76f1f2 ("android: binder: Add global lru shrinker to binder")
Signed-off-by: Sherry Yang <sherryy@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Kyle Yan <kyan@codeaurora.org>
Acked-by: Arve Hjønnevåg <arve@android.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Martijn Coenen <maco@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Riley Andrews <riandrews@android.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hoeun Ryu <hoeun.ryu@gmail.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
[backport: The restored functions were never removed in v4.12]
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit a1b2289cef92)
Reviewed-on: https://chromium-review.googlesource.com/675570
Reviewed-by: Dylan Reid <dgreid@chromium.org>

[modify] https://crrev.com/53936ea322c4104fef59c9fad0ba4ba386a93700/drivers/android/binder_alloc.c
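The locking pattern described in the commit message above can be illustrated with a small userspace model. This is only a sketch, not the kernel code: a pthread mutex stands in for the lru spinlock, and `struct lru`, `lru_walk()`, `isolate()` and `may_sleep_work()` are hypothetical names chosen for the example. Only the `LRU_REMOVED_RETRY` and `LRU_SKIP` status names are borrowed from the kernel. The `held` flag is a single-threaded bookkeeping aid for the demonstration, not a real lock-state query.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Status names borrowed from the kernel's list_lru; everything else is hypothetical. */
enum lru_status { LRU_REMOVED_RETRY, LRU_SKIP };

struct item {
    struct item *next;
    bool busy;              /* busy items are skipped, not reclaimed */
};

struct lru {
    pthread_mutex_t lock;   /* stands in for the lru spinlock */
    bool held;              /* demo-only: tracks whether the lock is held */
    struct item *head;
    int violations;         /* times may_sleep_work() ran with the lock held */
};

static void lru_lock(struct lru *l)   { pthread_mutex_lock(&l->lock); l->held = true; }
static void lru_unlock(struct lru *l) { l->held = false; pthread_mutex_unlock(&l->lock); }

/* Stand-in for a call that may sleep, such as zap_page_range(). */
static void may_sleep_work(struct lru *l)
{
    if (l->held)
        l->violations++;
}

/* Called with the lock held; must also return with the lock held. */
static enum lru_status isolate(struct lru *l, struct item **link)
{
    struct item *it = *link;

    assert(l->held);
    if (it->busy)
        return LRU_SKIP;

    *link = it->next;       /* unlink while the list is still protected */
    it->next = NULL;

    lru_unlock(l);          /* drop the lock before anything that may sleep */
    may_sleep_work(l);
    lru_lock(l);            /* re-acquire before returning to the walker */
    return LRU_REMOVED_RETRY;
}

/* Walks the list; restarts after every removal because the lock was dropped. */
static size_t lru_walk(struct lru *l)
{
    size_t removed = 0;
    struct item **link;

    lru_lock(l);
restart:
    for (link = &l->head; *link != NULL; link = &(*link)->next) {
        if (isolate(l, link) == LRU_REMOVED_RETRY) {
            removed++;
            goto restart;   /* the list may have changed while unlocked */
        }
    }
    lru_unlock(l);
    return removed;
}
```

The walker restarts from the head after each removal because its cursor cannot be trusted once the callback has released the lock. That is the reason the return code matters in addition to the unlock itself.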
,
Oct 16 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/237e41d476f57c6ba8600357c7d5f8727b49687a

commit 237e41d476f57c6ba8600357c7d5f8727b49687a
Author: Sherry Yang <sherryy@android.com>
Date: Mon Oct 16 23:11:42 2017

FROMLIST: android: binder: Remove unused vma argument

(from https://patchwork.kernel.org/patch/9954123/)

The vma argument in update_page_range is no longer used after 74310e06 ("android: binder: Move buffer out of area shared with user space"), since mmap_handler no longer calls update_page_range with a vma.

Test: ran binderLibTest, throughputtest, interfacetest and mempressure w/lockdep
Bug: b:36007193, chromium:767096

Change-Id: Ibd6f24c11750f8f7e6ed56e40dd18c08e02ace25
Acked-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Sherry Yang <sherryy@android.com>
(cherry picked from commit edd2131714af4ece5cb61afd27e2ce7fc9e0906a)
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/718484
Reviewed-by: Dylan Reid <dgreid@chromium.org>

[modify] https://crrev.com/237e41d476f57c6ba8600357c7d5f8727b49687a/drivers/android/binder_alloc.c
,
Nov 17 2017
Fixed in chromeos-4.14. WontFix in chromeos-4.12.
Comment 1 by groeck@chromium.org, Sep 20 2017

Call sequence:
zap_page_range -> unmap_page_range -> zap_p4d_range -> zap_pud_range -> cond_resched -> ___might_sleep

And:
binder_shrink_scan -> list_lru_walk_node -> __list_lru_walk_one -> spin_lock() -> binder_alloc_free_page() [ with spinlock active ]

The context as well as the spinlock passed to binder_alloc_free_page() suggests that the spinlock should be released prior to calling zap_page_range(), but I don't know the code well enough to be sure.
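
To make the two call sequences above concrete, here is a rough userspace model of why the check fires. It assumes the usual kernel behaviour that taking a spinlock raises the preempt count and that the might-sleep debug check complains when that count is non-zero, which matches `in_atomic(): 1` in the logs. Every `model_*` function and both counters are hypothetical stand-ins for the example, not kernel APIs.

```c
#include <assert.h>
#include <stdio.h>

static int preempt_count;   /* model of the preempt counter */
static int bug_reports;     /* number of simulated BUG splats */

/* Taking a spinlock makes the context atomic. */
static void model_spin_lock(void)
{
    preempt_count++;
}

static void model_spin_unlock(void)
{
    assert(preempt_count > 0);
    preempt_count--;
}

/* Model of the debug check reached through cond_resched(). */
static void model_might_sleep(const char *where)
{
    if (preempt_count > 0) {
        bug_reports++;
        fprintf(stderr,
                "BUG: sleeping function called from invalid context (%s)\n",
                where);
    }
}

/* zap_page_range -> ... -> cond_resched -> ___might_sleep */
static void model_zap_page_range(void)
{
    model_might_sleep("zap_page_range");
}

/* The isolate callback as it behaved before the fix: lock still held. */
static void model_free_page(void)
{
    model_zap_page_range();
}

/* binder_shrink_scan -> ... -> spin_lock() -> binder_alloc_free_page() */
static int model_shrink_scan(void)
{
    bug_reports = 0;
    model_spin_lock();
    model_free_page();      /* runs in atomic context, so the check fires */
    model_spin_unlock();
    return bug_reports;
}
```

Calling `model_shrink_scan()` reports one violation. Releasing the lock around `model_zap_page_range()` inside `model_free_page()` would make the count zero at the time of the check, which is the change the comment suggests.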