
Issue 667899


Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




Debuggerd in disk-sleep state during engrave_tombstone

Project Member Reported by cernekee@chromium.org, Nov 22 2016

Issue description

We are seeing lots of crashes similar to https://code.google.com/p/android/issues/detail?id=69107 on minnie in the hung_tasks bucket:

https://goto.google.com/gzjul

75c1c77700000000 is one such report.

The backtrace looks like:

<3>[16188.280328] Freezing of tasks failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
<6>[16188.280774] debuggerd D c06e2f4c 0 3622 3211 0x00000001
<5>[16188.281024] [<c06e2f4c>] (__schedule) from [<c06e32a8>] (schedule+0xa4/0xa8)
<5>[16188.281171] [<c06e32a8>] (schedule) from [<c012d114>] (ptrace_trapping_sleep_fn+0x18/0x20)
<5>[16188.281316] [<c012d114>] (ptrace_trapping_sleep_fn) from [<c06e383c>] (__wait_on_bit+0x64/0xb4)
<5>[16188.281461] [<c06e383c>] (__wait_on_bit) from [<c06e391c>] (out_of_line_wait_on_bit+0x90/0xb4)
<5>[16188.281596] [<c06e391c>] (out_of_line_wait_on_bit) from [<c012dad8>] (SyS_ptrace+0x2c4/0x508)
<5>[16188.281733] [<c012dad8>] (SyS_ptrace) from [<c0106460>] (ret_fast_syscall+0x0/0x30)


The upstream commit below may be a partial fix. It is a very simple change. Do you think it is worth applying?

commit 7c3b00e06d731a28fc3d17ed02ba250642b15b81
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Wed Jan 20 14:59:55 2016 -0800

    ptrace: make wait_on_bit(JOBCTL_TRAPPING_BIT) in ptrace_attach() killable
    
    ptrace_attach() can hang waiting for STOPPED -> TRACED transition if the
    tracee gets frozen in between, change wait_on_bit() to use TASK_KILLABLE.
    
    This doesn't really solve the problem(s) and we probably need to fix the
    freezer.  In particular, note that this means that pm freezer will fail if
    it races attach-to-stopped-task.
    
    And otoh perhaps we can just remove JOBCTL_TRAPPING_BIT altogether, it is
    not clear if we really need to hide this transition from debugger, WNOHANG
    after PTRACE_ATTACH can fail anyway if it races with SIGCONT.


Another option is to add a crash bucket that matches "Freezing of tasks failed after" or maybe "SyS_ptrace".
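
As a rough illustration of the bucketing idea, the matching could look something like the sketch below. This is not the crash server's actual logic: the function name `is_ptrace_freeze_hang` and the two constants are hypothetical, and the strings are simply copied from the log excerpt above. The sketch requires both strings, with the `SyS_ptrace` frame appearing after the freezer message, which is stricter than matching either one alone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature strings, copied from the log excerpt in this report. */
static const char *const kFreezeFailed = "Freezing of tasks failed after";
static const char *const kPtraceFrame = "SyS_ptrace";

/*
 * Returns true if the kernel log text contains the freezer failure message
 * and, somewhere after it, a stack frame mentioning SyS_ptrace.
 * (Hypothetical helper; requiring both strings is an assumption made here
 * to avoid bucketing unrelated freezer failures.)
 */
bool is_ptrace_freeze_hang(const char *log)
{
	const char *hit;

	if (log == NULL)
		return false;

	/* Find the freezer failure line first. */
	hit = strstr(log, kFreezeFailed);
	if (hit == NULL)
		return false;

	/* Only count a SyS_ptrace frame that follows the failure line. */
	return strstr(hit, kPtraceFrame) != NULL;
}
```

A caller would pass the whole console log of a report as one NUL-terminated string. Matching on "Freezing of tasks failed after" alone would catch more reports, but would also pull in freezer failures that have nothing to do with ptrace.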
 

Comment 1 by dgreid@chromium.org, Nov 22 2016

Seems like it would be worth backporting to me. We'll pick it up in 4.9 eventually anyway, so we might as well have all the kernels behave the same.
Project Member

Comment 2 by bugdroid1@chromium.org, Nov 23 2016

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/fa8a4fe6a7bc677b6bde6721007eb242d6c000f0

commit fa8a4fe6a7bc677b6bde6721007eb242d6c000f0
Author: Oleg Nesterov <oleg@redhat.com>
Date: Wed Jan 20 22:59:55 2016

BACKPORT: ptrace: make wait_on_bit(JOBCTL_TRAPPING_BIT) in ptrace_attach() killable

ptrace_attach() can hang waiting for STOPPED -> TRACED transition if the
tracee gets frozen in between, change wait_on_bit() to use TASK_KILLABLE.

This doesn't really solve the problem(s) and we probably need to fix the
freezer.  In particular, note that this means that pm freezer will fail if
it races attach-to-stopped-task.

And otoh perhaps we can just remove JOBCTL_TRAPPING_BIT altogether, it is
not clear if we really need to hide this transition from debugger, WNOHANG
after PTRACE_ATTACH can fail anyway if it races with SIGCONT.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Roland McGrath <roland@hack.frob.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Pedro Alves <palves@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 7c3b00e06d731a28fc3d17ed02ba250642b15b81)
(backport: fixed up conflict due to wait_on_bit function signature change)

BUG=chromium:667899
TEST=suspend and resume a couple of times on minnie

Change-Id: I85b0da4456623891e174a642417c5f40253f5c36
Reviewed-on: https://chromium-review.googlesource.com/413690
Commit-Ready: Kevin Cernekee <cernekee@chromium.org>
Tested-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Dylan Reid <dgreid@chromium.org>

[modify] https://crrev.com/fa8a4fe6a7bc677b6bde6721007eb242d6c000f0/kernel/ptrace.c

Comment 3 by yoshi@chromium.org, Jan 18 2018

Cc: -yoshiat@chromium.org
