Sampling now is expensive on Android
Issue description

The SequenceManager is slower than expected on Android, so I profiled SequenceManagerPerfTest.RunTenThousandImmediateTasks_OneQueue/3 on a Pixel 2. It turns out that ~26% of the time is spent in LazyNow, which is very surprising. To see the call graph, run:

    pprof -http localhost:1234 now_is_slow

In the call stack we can see the symbol __clock_gettime taking up a lot of time, so it seems the vDSO set up by [1] isn't being used. This is strange and I'm not sure how to diagnose why.

[1] https://android.googlesource.com/platform/bionic/+/master/libc/bionic/vdso.cpp#78
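For readers unfamiliar with the pattern, here is a minimal sketch of a lazily evaluated timestamp, which is why a large share of time under LazyNow is unexpected: the clock should be read at most once per instance. This is not Chromium's base::LazyNow; the class name LazyNowSketch, the injectable now_fn parameter, and the use of std::chrono::steady_clock (which typically ends up in clock_gettime() on Linux and Android) are assumptions made for illustration.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical sketch of a lazily evaluated timestamp.
// Not Chromium's base::LazyNow; names and structure are illustrative only.
class LazyNowSketch {
 public:
  using Clock = std::chrono::steady_clock;
  using NowFn = std::function<Clock::time_point()>;

  // now_fn is injectable so callers (and tests) can count clock reads.
  explicit LazyNowSketch(NowFn now_fn = [] { return Clock::now(); })
      : now_fn_(std::move(now_fn)) {}

  // Reads the clock on the first call; later calls return the cached value.
  Clock::time_point Now() {
    if (!has_value_) {
      cached_ = now_fn_();
      has_value_ = true;
    }
    return cached_;
  }

  // True once the clock has been sampled.
  bool HasValue() const { return has_value_; }

 private:
  NowFn now_fn_;
  Clock::time_point cached_{};
  bool has_value_ = false;
};
```

With this shape, the cost of an instance is one clock read no matter how often Now() is called, so a high sample share points at the clock read itself being slow rather than at the caching logic.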
Oct 30
Is your target arm or arm64?
Oct 30
Looking at the Android code, it seems the vDSO only works for 64-bit builds; see [1]. However, building base_perftests with target_cpu = "arm64" doesn't help.

[1] https://cs.corp.google.com/aosp-lollipop/bionic/libc/bionic/vdso.cpp

    #if defined(__aarch64__)
    #define VDSO_CLOCK_GETTIME_SYMBOL "__kernel_clock_gettime"
    #define VDSO_GETTIMEOFDAY_SYMBOL "__kernel_gettimeofday"
    #elif defined(__x86_64__)
    #define VDSO_CLOCK_GETTIME_SYMBOL "__vdso_clock_gettime"
    #define VDSO_GETTIMEOFDAY_SYMBOL "__vdso_gettimeofday"
    #endif
Oct 30
CLOCK_MONOTONIC vs CLOCK_MONOTONIC_RAW doesn't seem to make any difference either.
Oct 30
Looks like the compat vDSO for arm32 apps on an arm64 kernel hasn't yet landed in the upstream kernel [1], but Android kernels might have that functionality patched in [2][3]. It might make sense for Chrome to call the vDSO directly if it's there.

(Choice quote from above: "This patch series' above has been applied to the latest Pixel phones and resulted in a 0.4% battery improvement.")

[1] https://lkml.org/lkml/2018/6/18/1000
[2] https://www.spinics.net/lists/arm-kernel/msg539060.html
[3] https://blog.linuxplumbersconf.org/2016/ocw/system/presentations/3711/original/LPC_vDSO.pdf
Oct 30
I just checked that the Chrome browser process gets a vDSO mapping on the Pixel 3, but I'm not sure whether the clock is in it:

    /proc/19934 # cat maps|grep vdso
    f6f5b000-f6f5d000 r-xp 00000000 00:00 0 [vdso]
Oct 31
Alright, I did some digging here and I think this is no longer an issue:

1) Bionic has had support for vdso32 on arm64 since ~N and Linux 4.1: b/19198045 (although see also b/20045882).

2) I made a small test program that calls clock_gettime() in a loop and checked that it gets the vDSO treatment in both arm32 and arm64 mode:

    ARM32
    =====
    Overhead  Command            Pid    Tid    Shared Object                    Symbol
    90.74%    ./clock_gettime32  28618  28618  [vdso]                           __kernel_clock_gettime
    4.48%     ./clock_gettime32  28618  28618  /system/lib/libc.so              clock_gettime
    0.82%     ./clock_gettime32  28618  28618  /data/local/tmp/clock_gettime32  clock_gettime32[+60a]

    ARM64
    =====
    Overhead  Command          Pid   Tid   Shared Object                  Symbol
    80.79%    ./clock_gettime  8349  8349  [vdso]                         __kernel_clock_gettime
    8.07%     ./clock_gettime  8349  8349  /data/local/tmp/clock_gettime  main
    6.25%     ./clock_gettime  8349  8349  /system/lib64/libc.so          clock_gettime

I tested this on a Pixel 2 running Android ~P.

3) I re-ran the benchmark where alexclarke@ saw this problem, and clock_gettime() basically doesn't show up. See the attached pprof profile.

I think what happened is that the device we were originally using didn't have vdso32 enabled for some reason. As far as I can tell it should be there on modern devices, and Chrome should be able to use it automatically thanks to bionic.
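The test program itself isn't attached, so the following is only a sketch of what a clock_gettime() loop of that kind might look like, not the code that produced the numbers above. The function name AverageClockGettimeNs and its parameters are assumptions. It takes the clock id as an argument so that CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW can be compared with the same loop.

```cpp
#include <time.h>

#include <cstdint>

// Calls clock_gettime(clock_id, ...) `iterations` times and returns the
// average wall-clock cost per call in nanoseconds, measured with
// CLOCK_MONOTONIC. Returns 0.0 when iterations <= 0.
// Hypothetical sketch; not the test program referred to in the thread.
double AverageClockGettimeNs(clockid_t clock_id, int64_t iterations) {
  if (iterations <= 0) {
    return 0.0;
  }
  timespec start{};
  timespec end{};
  timespec scratch{};
  clock_gettime(CLOCK_MONOTONIC, &start);
  for (int64_t i = 0; i < iterations; ++i) {
    // External libc call, so the compiler cannot elide the loop body.
    clock_gettime(clock_id, &scratch);
  }
  clock_gettime(CLOCK_MONOTONIC, &end);
  const double elapsed_ns =
      static_cast<double>(end.tv_sec - start.tv_sec) * 1e9 +
      static_cast<double>(end.tv_nsec - start.tv_nsec);
  return elapsed_ns / static_cast<double>(iterations);
}
```

Calling this from a small main() with a large iteration count and running the binary under a sampling profiler should show whether the samples land in [vdso] or in a syscall path. The absolute per-call figure depends on the device and kernel, so it is only meaningful when comparing builds on the same hardware.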
Comment 1 by skyos...@chromium.org, Oct 30