hung_task in zap_pid_ns_processes when running stress test |
||||||||||||||
Issue descriptionfeedback crash report: https://feedback.corp.google.com/#/Report/59423512539 repro step is simple: 1. on chromeos 9523.0.0, running telemetry perf benchmark, memory stress test ~/chromium/src/tools/perf$ ./run_benchmark --browser=cros-chrome --remote=DUT_IP tab_switching.typical_25 --also-run-disabled-tests --tabset-repeat=5 2. about 20% chance caused a hung task timeout partial crash log: <3>[ 480.133578] INFO: task TaskSchedulerRe:4997 blocked for more than 120 seconds. <3>[ 480.133591] Tainted: G W 3.18.0 #5 <3>[ 480.133596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. <6>[ 480.133605] TaskSchedulerRe D ffff880173456998 0 4997 4766 0x00000102 <4>[ 480.133617] ffff880126c9bc78 0000000000000046 0000000000000000 ffff880065f31240 <4>[ 480.133628] ffff880126c9bfd8 ffff880173456460 0000000000013500 ffff880173456460 <4>[ 480.133640] ffff8800fffffff6 0000000000000002 ffff880075cf19e0 ffff880173456460 <4>[ 480.133653] Call Trace: <4>[ 480.133668] [<ffffffff8a49b572>] schedule+0x69/0x6b <4>[ 480.133677] [<ffffffff89ed38eb>] zap_pid_ns_processes+0x133/0x16f <4>[ 480.133688] [<ffffffff89e64faa>] do_exit+0x51f/0x964 <4>[ 480.133697] [<ffffffff89e662b5>] do_group_exit+0x43/0xb1 <4>[ 480.133706] [<ffffffff89e7056d>] get_signal+0x573/0x5a5 <4>[ 480.133715] [<ffffffff89e02005>] do_signal+0x37/0x5f8 <4>[ 480.133723] [<ffffffff89ebfd21>] ? do_futex+0xc6/0x7de <4>[ 480.133732] [<ffffffff89edf567>] ? __seccomp_phase1_filter+0x2f/0x1bc <4>[ 480.133742] [<ffffffff89e025f2>] do_notify_resume+0x2c/0x61 <4>[ 480.133750] [<ffffffff8a49eb13>] int_signal+0x12/0x17 also reproducible on kevin, cave. similar issue: https://bugs.chromium.org/p/chromium/issues/detail?id=660628 ---- the attached c program can trigger zap_pid_ns_processes hung_task (5 in 5 tests) on cave, kevin. it create a process with a new pid namespace, create a thread, and then ptrace the 2 thread in the process. kernel would not autoreap a ptraced thread's pid, and when the last thread in the pid namespace exit, it wait for other thread to reap pid indefinitely. the c program can also cause hung_task (2 in 5 tests) in virtual box ubuntu 17.04 with kernel 4.10.0.
,
May 11 2017
#1: This is great, thanks!
,
May 11 2017
Reproduces every time with v4.10.2 upstream kernel after changing the sleep in main() to sleep 2 seconds.
,
May 11 2017
Also v4.11.0: [<ffffffff811c385e>] zap_pid_ns_processes+0x1ee/0x2a0 [<ffffffff810c1ba4>] do_exit+0x10d4/0x1330 [<ffffffff810c1ee6>] do_group_exit+0x86/0x130 [<ffffffff810d4347>] get_signal+0x367/0x8a0 [<ffffffff81046e73>] do_signal+0x83/0xb90 [<ffffffff81004475>] exit_to_usermode_loop+0x75/0xc0 [<ffffffff810055b6>] syscall_return_slowpath+0xc6/0xd0 [<ffffffff81ced488>] entry_SYSCALL_64_fastpath+0xab/0xad [<ffffffffffffffff>] 0xffffffffffffffff
,
May 11 2017
Question from upstream (Eric Biederman): > My question is does the test case cover the real life cause of this? > I assume it is simplified from what you are seeing on chromeOS. > > More specifically the question is: Is it a process doing the > PTRACE_TRACEME? A processing created after setns? Or something else. @vovoy, can you respond ?
,
May 11 2017
,
May 11 2017
,
May 11 2017
The testcase is really simple -- it just opens 100 tabs. I think that's a very high number of tabs, but not a unreasonable test that shouldn't fall over or cause kernel panics.
,
May 11 2017
#8: Eric did not question the validity of the benchmark. His question was if the C code used to reproduce the problem matches the testcase. In other words, he is asking if chrome uses PTRACE_TRACEME.
,
May 11 2017
re #9 -- oh I see, sorry I'm not sure but FWIW I grepped the chrome source and it seems like it doesn't use PTRACE_TRACEME except in some unit tests
,
May 11 2017
#10: Ok, back to @vovoy. We'll have to understand how to get from our test case to the test code. It won't help if Eric finds a fix specifically for the PTRACE_TRACEME problem but the real issue is still there.
,
May 12 2017
The test case is a simplification of what I see in the ChromeOs tab switching stress test. when tab switching caused hung_task, I observe the hung task is a new pid ns chrome with many ptraced threads. every task_struct's ptrace field was set to 1 in the hung chrome process. When tab switching caused hung_task, perf record shows a large number of ptrace syscall (about 18000 ptrace syscall), but I didn't check the argument of ptrace. PTRACE_TRACEME is just a convenient way to ptrace every thread.
,
May 12 2017
#12: Are the ptrace syscalls a result of the test environment (eg a result of running perf), or does chrome itself initiate them as part of its normal operation ?
,
May 12 2017
the hung task was ptrace by another chrome process. I will check what happened.
,
May 12 2017
The analysis to check what happened in zap_pid_ns_processes hung_task 1. add kernel printk logs ( https://chromium-review.googlesource.com/c/503910/ ) 2. run tab switching benchmark 3. # dmesg > dmesg.txt 4. dmesg custom log shows: [ 174.158733] ptrace: cur_pid: 9971, req: 16, pid: 10054, addr: 0, data: 0 [ 174.164142] ptrace: cur_pid: 9971, req: 12, pid: 10054, addr: 0, data: 128472863565416 16: PTRACE_ATTACH, 12: PTRACE_GETREGS 5. the thread calling ptrace is BrowserBlocking # cat /proc/9971/status Name: BrowserBlocking State: S (sleeping) Tgid: 9929 Ngid: 0 Pid: 9971 ... # cat /proc/9929/status Name: chrome State: S (sleeping) Tgid: 9929 Ngid: 0 Pid: 9929 ... 6. search chromium, the only file including both PTRACE_ATTACH and PTRACE_GETREGS is breakpad minidump ( https://cs.chromium.org/chromium/src/breakpad/src/client/linux/minidump_writer/linux_ptrace_dumper.cc?rcl=47e51739fd00badbceba5bc26b8abc8bbd530989&l=85 )
,
May 12 2017
,
May 12 2017
remove internal device name
,
May 12 2017
@vovoy: Can you repeat the test with the remedy suggested by Eric (Use TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE in zap_pid_ns_processes) ?
,
May 12 2017
What's happening in tab switching benchmark: 1. A chrome renderer process was created with CLONE_NEWPID 2. The process crashed 3. Chrome breakpad service calls ptrace(PTRACE_ATTACH, ..) to attach to every threads of the crashed process to dump info 4. When breakpad detach the crashed process, the crashed process stuck in zap_pid_ns_processes()
,
May 12 2017
re #18: OK, I will try the benchmark with TASK_INTERRUPTIBLE in zap_pid_ns_processes.
,
May 12 2017
the patch works:
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index 77403c157157..971e7bc6939b 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -277,7 +277,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
* if reparented.
*/
for (;;) {
- set_current_state(TASK_UNINTERRUPTIBLE);
+ set_current_state(TASK_INTERRUPTIBLE);
if (pid_ns->nr_hashed == init_pids)
break;
schedule();
When running tab switching, the new pid, ptraced process still stuck in zap_pid_ns_processes().
But it's in S (Sleeping) state instead of D (disk sleep) and does not trigger hung_task.
When chrome is closed, the process will be freed.
,
May 12 2017
@vovoy: Thanks a lot!
,
May 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/8d3c47f143459e41cb73ce3b5dbb73f622e8e099 commit 8d3c47f143459e41cb73ce3b5dbb73f622e8e099 Author: Eric W. Biederman <ebiederm@xmission.com> Date: Sat May 13 01:46:17 2017 FROMLIST: pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes The code can potentially sleep for an indefinite amount of time in zap_pid_ns_processes triggering the hung task timeout, and increasing the system average. This is undesirable. Sleep with a task state of TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these undesirable side effects. Apparently under heavy load this has been allowing Chrome to trigger the hung time task timeout error and cause ChromeOS to reboot. BUG= chromium:721298 chromium:660628 TEST=Run test application attached to chromium:721298 Change-Id: I80547dfae36ff1db3a4a508838e50a85efa981ca Reported-by: Vovo Yang <vovoy@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 6347e9009104 ("pidns: guarantee that the pidns init will be the last pidns process reaped") Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Guenter Roeck <groeck@chromium.org> (am from https://patchwork.kernel.org/patch/9724743/) Reviewed-on: https://chromium-review.googlesource.com/505193 [modify] https://crrev.com/8d3c47f143459e41cb73ce3b5dbb73f622e8e099/kernel/pid_namespace.c
,
May 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/0af594bde2f90bbe6a738643c84af07aee190937 commit 0af594bde2f90bbe6a738643c84af07aee190937 Author: Eric W. Biederman <ebiederm@xmission.com> Date: Sat May 13 01:46:16 2017 FROMLIST: pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes The code can potentially sleep for an indefinite amount of time in zap_pid_ns_processes triggering the hung task timeout, and increasing the system average. This is undesirable. Sleep with a task state of TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these undesirable side effects. Apparently under heavy load this has been allowing Chrome to trigger the hung time task timeout error and cause ChromeOS to reboot. BUG= chromium:721298 chromium:660628 TEST=Run test application attached to chromium:721298 Change-Id: I80547dfae36ff1db3a4a508838e50a85efa981ca Reported-by: Vovo Yang <vovoy@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 6347e9009104 ("pidns: guarantee that the pidns init will be the last pidns process reaped") Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Guenter Roeck <groeck@chromium.org> (am from https://patchwork.kernel.org/patch/9724743/) Reviewed-on: https://chromium-review.googlesource.com/505194 [modify] https://crrev.com/0af594bde2f90bbe6a738643c84af07aee190937/kernel/pid_namespace.c
,
May 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/ff5ffb70d281fa7addd2ee2de6d6fa8c7c4f6747 commit ff5ffb70d281fa7addd2ee2de6d6fa8c7c4f6747 Author: Eric W. Biederman <ebiederm@xmission.com> Date: Sat May 13 05:49:03 2017 FROMLIST: pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes The code can potentially sleep for an indefinite amount of time in zap_pid_ns_processes triggering the hung task timeout, and increasing the system average. This is undesirable. Sleep with a task state of TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these undesirable side effects. Apparently under heavy load this has been allowing Chrome to trigger the hung time task timeout error and cause ChromeOS to reboot. BUG= chromium:721298 chromium:660628 TEST=Run test application attached to chromium:721298 Change-Id: I80547dfae36ff1db3a4a508838e50a85efa981ca Reported-by: Vovo Yang <vovoy@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 6347e9009104 ("pidns: guarantee that the pidns init will be the last pidns process reaped") Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Guenter Roeck <groeck@chromium.org> (am from https://patchwork.kernel.org/patch/9724743/) Reviewed-on: https://chromium-review.googlesource.com/505149 Reviewed-by: Douglas Anderson <dianders@chromium.org> [modify] https://crrev.com/ff5ffb70d281fa7addd2ee2de6d6fa8c7c4f6747/kernel/pid_namespace.c
,
May 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/9497031b53a70b69526a4c276a3974f901e1cce1 commit 9497031b53a70b69526a4c276a3974f901e1cce1 Author: Eric W. Biederman <ebiederm@xmission.com> Date: Sat May 13 08:04:44 2017 FROMLIST: pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes The code can potentially sleep for an indefinite amount of time in zap_pid_ns_processes triggering the hung task timeout, and increasing the system average. This is undesirable. Sleep with a task state of TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these undesirable side effects. Apparently under heavy load this has been allowing Chrome to trigger the hung time task timeout error and cause ChromeOS to reboot. BUG= chromium:721298 chromium:660628 TEST=Run test application attached to chromium:721298 Change-Id: I80547dfae36ff1db3a4a508838e50a85efa981ca Reported-by: Vovo Yang <vovoy@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 6347e9009104 ("pidns: guarantee that the pidns init will be the last pidns process reaped") Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Guenter Roeck <groeck@chromium.org> (am from https://patchwork.kernel.org/patch/9724743/) Reviewed-on: https://chromium-review.googlesource.com/505706 [modify] https://crrev.com/9497031b53a70b69526a4c276a3974f901e1cce1/kernel/pid_namespace.c
,
May 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/10f175e80cdb29c2c7549fb9dbef4bd06c9db517 commit 10f175e80cdb29c2c7549fb9dbef4bd06c9db517 Author: Eric W. Biederman <ebiederm@xmission.com> Date: Sat May 13 08:04:45 2017 FROMLIST: pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes The code can potentially sleep for an indefinite amount of time in zap_pid_ns_processes triggering the hung task timeout, and increasing the system average. This is undesirable. Sleep with a task state of TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these undesirable side effects. Apparently under heavy load this has been allowing Chrome to trigger the hung time task timeout error and cause ChromeOS to reboot. BUG= chromium:721298 chromium:660628 TEST=Run test application attached to chromium:721298 Change-Id: I80547dfae36ff1db3a4a508838e50a85efa981ca Reported-by: Vovo Yang <vovoy@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 6347e9009104 ("pidns: guarantee that the pidns init will be the last pidns process reaped") Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Guenter Roeck <groeck@chromium.org> (am from https://patchwork.kernel.org/patch/9724743/) Reviewed-on: https://chromium-review.googlesource.com/505727 [modify] https://crrev.com/10f175e80cdb29c2c7549fb9dbef4bd06c9db517/kernel/pid_namespace.c
,
May 13 2017
The applied fix gets rid of the hung tasks, but leaves a zombie process behind. Eric suggests the following: "I suspect what is happening is that the thread breakpad is attached to is killed by zap_pid_ns_processes. At which point PTRACE_DETACH won't work because the thread has been killed and waitpid(thread_pid,...) needs to be used instead. Certainly that is what is happening in the reproducer. So I suspect breakpad just needs a waitpid whend PTRACE_DETACH fails or possible just a general loop that listens for SIGCHLD and calls some variant of wait." A test running the reproducer without CLONE_NEWPID leaves the child process in Zombie state, suggesting the above makes sense. I'll file a bug against Chrome Monday.
,
May 13 2017
,
May 15 2017
I experimented according to Eric's suggestion. Since traced children doesn't auto-reap, signal(SIGCHLD, SIG_IGN) doesn't work in this case. (ref: https://github.com/torvalds/linux/commit/7ed0175a462c4c30f6df6fac1cccac058f997739) An example pstree when I run tab switching: 8979 is the process stuck in zap_pid_ns_processes() 8952 is the process breakpad belongs to. ---chrome(8892)-+-chrome(8898)-+-chrome(8911)-+-chrome(8979)-+-{chrome}(8982) | | |-... | | `-{chrome}(9076) | `--other chrome renderer processes... `-{chrome}(8952)(breakpad) As I know, only the parent process can call waitpid() to free a zombie child. (when I test, tracer calling waitpid() return the child pid but doesn't free the zombie, see attach) I think the parent chrome process (8911 in the above example) has to call waitpid() to avoid traced CLONE_NEWPID children sticking in zap_pid_ns_processes().
,
May 16 2017
Any idea why 8911 is the parent of the non-reaped process ? That must be one of 8982, 9076, ... I would have assumed that the parent is 8979. Do you know if 8911 and/or 8979 call signal(SIGCHLD, SIG_IGN) ?
,
May 17 2017
I missed some detail. 8979, 8982, 9076 are threads of the same process that stuck in zap_pid_ns_processes() (pstree -p) when I run tab switching test, no process calls signal(SIGCHLD, *)
,
May 23 2017
Upstream commit 3fd3722621662 ("pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes()") may be related.
,
Aug 1 2017
,
Jan 22 2018
|
||||||||||||||
►
Sign in to add a comment |
||||||||||||||
Comment 1 Deleted