
Issue 819228

Starred by 6 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Bug

Blocking:
issue 817314




Consider using posix_spawn() on Linux

Project Member Reported by lizeb@chromium.org, Mar 6 2018

Issue description

In https://crbug.com/817314, one possibility is that fork() is slow in long-running browser processes on Linux and, as a consequence, blocks a large part of the browser across processes.

In this instance, we want to start an external process, so posix_spawn() would be more efficient.

We should consider extending support of posix_spawn() to Linux (it's already enabled on OS X).
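
For reference, a minimal sketch of launching an external process with posix_spawn rather than fork() + exec. The helper name spawn_and_wait is hypothetical and is not Chromium code. It only illustrates the call pattern under the assumption that no pre-exec work is needed in the child.

```c
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawns argv[0] (looked up on PATH) and waits for it.
 * Returns the child's exit code, or -1 on failure. */
int spawn_and_wait(char *const argv[]) {
  assert(argv != NULL && argv[0] != NULL);

  pid_t pid;
  /* posix_spawnp() returns an error number directly instead of setting errno. */
  int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
  if (err != 0) {
    errno = err;
    return -1;
  }

  int status;
  while (waitpid(pid, &status, 0) == -1) {
    if (errno != EINTR) return -1;  /* Retry only when interrupted by a signal. */
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with an argument vector such as {"true", NULL}, the helper returns the exit code of the spawned program. The two NULL arguments are the file actions and spawn attributes, which is where fd remapping or signal mask changes would go.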
 

Comment 1 by lizeb@chromium.org, Mar 6 2018

Forking seems slow on Linux with a lot of VMAs.
Chrome has a lot of VMAs, for instance my browser process currently has ~1k VMAs, and a single renderer (with GMail) ~4k.
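
One way to get such a count is to count the lines of /proc/<pid>/maps, since each line describes one mapping. A minimal sketch follows; count_vmas is a hypothetical helper name, and this is not necessarily how the numbers above were obtained.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of lines in /proc/<pid>/maps
 * (one line per mapping), or -1 if the file cannot be opened. */
long count_vmas(pid_t pid) {
  char path[64];
  int n = snprintf(path, sizeof(path), "/proc/%ld/maps", (long)pid);
  assert(n > 0 && (size_t)n < sizeof(path));

  FILE *f = fopen(path, "r");
  if (f == NULL) return -1;

  long lines = 0;
  int c;
  while ((c = fgetc(f)) != EOF) {
    if (c == '\n') lines++;
  }
  fclose(f);
  return lines;
}
```

For example, count_vmas(getpid()) reports the count for the calling process. Reading another user's process may fail with a permission error, in which case the helper returns -1.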


The attached C program allocates N VMAs and reports how long fork() took. The first argument is the number of VMAs. This was run on a z620 with gLinux. Each VMA is touched once, but no processing is done after fork(), and the child calls _exit() right away.

$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 1 1 
76.58us to fork().
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 10 1
109.18us to fork().
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 100 1
988.77us to fork().
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 1000 1
12296.03us to fork().
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 10000 1
84918.44us to fork().


84ms to fork() with 10k VMAs.


Now, if we use vfork() instead:
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 1 1
153.92us to fork().
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 10 1
128.37us to fork().
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 100 1
196.63us to fork().
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 1000 1
108.06us to fork().
$ clang slow_fork.c -o slow_fork -O3 && ./slow_fork 10000 1
204.32us to fork().

So the culprit really is the mm copy and CoW machinery in the kernel.
slow_fork.c
Components: Internals>Core
Two things to note about using posix_spawn:

1) There's a PreExecDelegate option that is currently needed for part of the Linux sandbox setup: https://cs.chromium.org/chromium/src/sandbox/linux/services/namespace_sandbox.cc?type=cs&q=preexecdelegate&sq=package:chromium&l=41

2) The only case on Mac where we don't use posix_spawn is when LaunchOptions.current_directory is set. There's no posix_spawn option to cd, so in the fork path that happens pre-exec.
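
To illustrate point 2, here is a rough sketch of the fork() path doing the directory change between fork() and exec. launch_in_dir is a hypothetical name, not the Chromium implementation. A pre-exec hook like the one in point 1 would presumably run at the same place in the child.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (looked up on PATH) with `dir` as its
 * working directory and waits for it. Returns the child's exit code
 * (127 if chdir() or exec fails in the child), or -1 on fork/wait failure. */
int launch_in_dir(const char *dir, char *const argv[]) {
  assert(dir != NULL && argv != NULL && argv[0] != NULL);

  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    /* Child: keep the work between fork() and exec minimal. */
    if (chdir(dir) != 0) _exit(127);
    execvp(argv[0], argv);
    _exit(127); /* Only reached if exec failed. */
  }

  int status;
  while (waitpid(pid, &status, 0) == -1) {
    if (errno != EINTR) return -1;  /* Retry only when interrupted by a signal. */
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The chdir() call is the step that the plain posix_spawn() interface discussed here cannot express, which is why this case still goes through fork().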

Comment 3 Deleted

Hmm. I guess that means (to my surprise) it does not matter that we use a zygote? i.e. all of our forks() are coming from a process which itself was fork()ed from the browser very early. I wouldn't expect future additions to the browser process VMAs to affect the zygote process, but then I'm fairly ignorant about kernel behavior.

Comment 5 by lizeb@chromium.org, Mar 6 2018

There seem to be a few places from which base::LaunchProcess is called. AFAICT, not all of them use the zygote, and in these cases, I would expect the VMA count of the parent process to matter.
For the zygote case, it should not.

For instance, https://cs.chromium.org/chromium/src/chrome/browser/extensions/api/messaging/native_process_launcher_posix.cc?type=cs&sq=package:chromium&l=53
https://cs.chromium.org/chromium/src/chrome/browser/service_process/service_process_control.cc?type=cs&sq=package:chromium&l=386
https://cs.chromium.org/chromium/src/services/service_manager/runner/host/service_process_launcher.cc?type=cs&sq=package:chromium&l=182

@rockot: Does the service_manager one go through the zygote? Doesn't seem to be the case, but I'm not really familiar with this code.

@rsesek: I think that the jank we see is for processes that don't start from the zygote. In some of these cases we just want to launch an external process, so posix_spawn() may be usable. I will try to use the OS X code path (with the same restrictions) and see whether that helps.

Comment 6 by lizeb@chromium.org, Mar 7 2018

This is not restricted to gLinux. The following is from another, non-Google machine running Ubuntu 17.10:

$ uname -a
Linux REDACTED 4.13.0-25-generic #29-Ubuntu SMP Mon Jan 8 21:14:41 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ cat /proc/cpuinfo | grep Xeon | head -1
model name	: Intel(R) Xeon(R) CPU E3-1240 V2 @ 3.40GHz
$ gcc -O3 slow_malloc.c -o slow_malloc && ./slow_malloc 1 1
49.02us to fork().
$ gcc -O3 slow_malloc.c -o slow_malloc && ./slow_malloc 10 1
51.50us to fork().
$ gcc -O3 slow_malloc.c -o slow_malloc && ./slow_malloc 100 1
767.92us to fork().
$ gcc -O3 slow_malloc.c -o slow_malloc && ./slow_malloc 1000 1
7722.15us to fork().
$ gcc -O3 slow_malloc.c -o slow_malloc && ./slow_malloc 10000 1
50978.53us to fork().

Comment 7 by olka@chromium.org, Mar 7 2018

Is ChromeOS also affected?
Should it be a P1/release blocker?
Components: Internals>Media>Audio
Labels: -Pri-2 Pri-1
Should definitely be P1. There are quite a few reports internally at b/73927266
Project Member

Comment 9 by bugdroid1@chromium.org, Mar 15 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/03a809d90193d7142b5e3e8e523d7bc30136a964

commit 03a809d90193d7142b5e3e8e523d7bc30136a964
Author: Benoit Lize <lizeb@chromium.org>
Date: Thu Mar 15 11:53:24 2018

base/linux: Add a histogram for fork()ing time.

Bug: 819228
Change-Id: I5813d2c77c04d7dd00303a874ba6bd2a09acf946
Reviewed-on: https://chromium-review.googlesource.com/952967
Reviewed-by: Robert Kaplow <rkaplow@chromium.org>
Reviewed-by: Lei Zhang <thestig@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#543347}
[modify] https://crrev.com/03a809d90193d7142b5e3e8e523d7bc30136a964/base/process/launch_posix.cc
[modify] https://crrev.com/03a809d90193d7142b5e3e8e523d7bc30136a964/tools/metrics/histograms/histograms.xml

Things are definitely slow in the long tail. 99%ile on Chrome OS is >100 ms.

Any updates here?
Labels: M-70
Cc: thestig@chromium.org thomasanderson@chromium.org
 Issue 869690  has been merged into this issue.
I don't know if lizeb@ is working on this. @thomasanderson was this something you were working on? (per merged bug)
Labels: -Pri-1 Pri-2
Dropping to P2 since this ship sailed a long time ago.
> @thomasanderson was this something you were working on?

Not at the moment, but feel free to assign to me if no one else is working on it.
Cc: -thomasanderson@chromium.org lizeb@chromium.org
Owner: thomasanderson@chromium.org
