New issue
Advanced search Search tips

Issue 776702 link

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug

Blocking:
issue 758566



Sign in to add a comment

Lower the cost of function instrumentation in Chrome on Android

Project Member Reported by lizeb@chromium.org, Oct 20 2017

Issue description

Current cost is .6-1us per function call, including inlined functions.
See https://docs.google.com/document/d/14md0BA_khTqdfGHS_IpAofuvaNsG__akAA5LBM8MgOk/edit#
for details.

Lowering this cost is required to build better profiles and shrink PSS on Android.
 

Comment 1 by pasko@chromium.org, Oct 20 2017

can we make a public version of the doc?
Project Member

Comment 2 by bugdroid1@chromium.org, Oct 25 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f4a81f61f3ddddfb36701a674d1d0ace4bd38e3c

commit f4a81f61f3ddddfb36701a674d1d0ace4bd38e3c
Author: Benoit Lize <lizeb@chromium.org>
Date: Wed Oct 25 08:28:33 2017

android: Experimental lightweight function instrumentation.

Function instrumentation is used to create an orderfile with functions
reached during startup. The high cost of the instrumentation lowers its
usefulness.
This is a new experimental instrumentation with lower overhead. It is
meant to be removed or to replace the current instrumentation.

This instrumentation essentially replaces a per-thread hash table lookup
per function call (plus clock_gettime()) with a shared bitfield of
reached locations.

It introduces a new GN argument, as the switch cannot be made after
compile time (since the instrumentation is inserted at that stage). This
code is not used in shipped builds (and neither is the current
instrumentation).

Co-authors: {mattcary, pasko}

Bug:  776702 
Change-Id: I891891d69b342bb1e97e3f2b84e99f28cf0326e9
Reviewed-on: https://chromium-review.googlesource.com/730745
Reviewed-by: Egor Pasko <pasko@chromium.org>
Reviewed-by: agrieve <agrieve@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#511405}
[modify] https://crrev.com/f4a81f61f3ddddfb36701a674d1d0ace4bd38e3c/build/config/android/config.gni
[modify] https://crrev.com/f4a81f61f3ddddfb36701a674d1d0ace4bd38e3c/tools/cygprofile/BUILD.gn
[add] https://crrev.com/f4a81f61f3ddddfb36701a674d1d0ace4bd38e3c/tools/cygprofile/lightweight_cygprofile.cc

Project Member

Comment 3 by bugdroid1@chromium.org, Oct 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/52d1aee985358a292a45a97e27c0e2cb3b9f77fa

commit 52d1aee985358a292a45a97e27c0e2cb3b9f77fa
Author: Benoit Lize <lizeb@chromium.org>
Date: Fri Oct 27 09:30:37 2017

android: Process dumps for the experimental instrumentation.

From a list of dumps (one per process) from an instrumented build,
writes the list of reached symbols in a regular build.

Bug:  776702 
Change-Id: I878be5e174ffa96c50d24107fa26ce7359632164
Reviewed-on: https://chromium-review.googlesource.com/738131
Commit-Queue: Benoit L <lizeb@chromium.org>
Reviewed-by: Egor Pasko <pasko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#512133}
[add] https://crrev.com/52d1aee985358a292a45a97e27c0e2cb3b9f77fa/tools/cygprofile/process_profiles.py
[add] https://crrev.com/52d1aee985358a292a45a97e27c0e2cb3b9f77fa/tools/cygprofile/process_profiles_unittest.py

Project Member

Comment 4 by bugdroid1@chromium.org, Oct 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b2a97cd6c9f86e2f1fc93966bd83539e193dd94c

commit b2a97cd6c9f86e2f1fc93966bd83539e193dd94c
Author: Benoit Lize <lizeb@chromium.org>
Date: Fri Oct 27 13:19:34 2017

android: Remove unnecessary instrumentation calls.

The -finstrument-function compiler flag adds a call to
__cyg_profile_func_{enter,exit}() at the start/end of each
function. Calls are inserted before inlining, leading to duplicate
ones. Besides, in Chrome the exit instrumentation function is empty.

This scripts finds the unnecessary instrumentation calls in the native
library, and replaces them with NOPs.

Bug:  776702 
Change-Id: Ia187079c1327512f689a313477ce1989a47ef84a
Reviewed-on: https://chromium-review.googlesource.com/732994
Commit-Queue: Benoit L <lizeb@chromium.org>
Reviewed-by: Egor Pasko <pasko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#512164}
[add] https://crrev.com/b2a97cd6c9f86e2f1fc93966bd83539e193dd94c/tools/cygprofile/patch_native_library.py
[add] https://crrev.com/b2a97cd6c9f86e2f1fc93966bd83539e193dd94c/tools/cygprofile/patch_native_library_unittest.py

Project Member

Comment 5 by bugdroid1@chromium.org, Nov 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/10f725e8a6c163e975618ed7f8049f019ac5e976

commit 10f725e8a6c163e975618ed7f8049f019ac5e976
Author: Benoit Lize <lizeb@chromium.org>
Date: Mon Nov 06 14:58:11 2017

android: Improve the lightweight instrumentation.

Adds two changes to the lightweight instrumentation:
- Strict bound checking for return addresses
- Supports mcount() instrumentation (with -pg)

Strict bound checks is possible because the linker looks for exact
matches in the orderfile before considering the globs. If this
assumption doesn't hold true, the code will hit a CHECK(). This also
makes sure that the orderfiles are constructed with the right anchor sections.

mcount() instrumentation is enabled by -pg, and is lighter weight than
the cyg_enter/exit() based one. This changes allows to support both at
the same time.

Bug:  776702 
Change-Id: I3496e23cfa7977a81d26ce296c4cb28c1f70f96d
Reviewed-on: https://chromium-review.googlesource.com/753882
Commit-Queue: Benoit L <lizeb@chromium.org>
Reviewed-by: Egor Pasko <pasko@chromium.org>
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Cr-Commit-Position: refs/heads/master@{#514128}
[modify] https://crrev.com/10f725e8a6c163e975618ed7f8049f019ac5e976/tools/cygprofile/lightweight_cygprofile.cc
[modify] https://crrev.com/10f725e8a6c163e975618ed7f8049f019ac5e976/tools/cygprofile/patch_orderfile.py

Project Member

Comment 6 by bugdroid1@chromium.org, Nov 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a87e5bcebf6bc0b3a0a8fa3adb692f7125d72f82

commit a87e5bcebf6bc0b3a0a8fa3adb692f7125d72f82
Author: Benoit Lize <lizeb@chromium.org>
Date: Tue Nov 07 15:12:57 2017

android: Support lightweight instrumentation in orderfile_generator_backend.py.

This adds a --lightweight-instrumentation flag to orderfile_generator.py
to use the new pipeline. The new step is using process_profiles.py to
output a list of reached offsets symbols, matching the output format
from the current instrumentation.

This allows the rest of the pipeline to remain unchanged.

Bug:  776702 
Change-Id: Iddf8d42112f1d44a7ef36d26328f51d0b3b894cb
Reviewed-on: https://chromium-review.googlesource.com/746665
Commit-Queue: Benoit L <lizeb@chromium.org>
Reviewed-by: Egor Pasko <pasko@chromium.org>
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Cr-Commit-Position: refs/heads/master@{#514470}
[modify] https://crrev.com/a87e5bcebf6bc0b3a0a8fa3adb692f7125d72f82/tools/android/native_lib_memory/extract_symbols.py
[modify] https://crrev.com/a87e5bcebf6bc0b3a0a8fa3adb692f7125d72f82/tools/cygprofile/cyglog_to_orderfile.py
[modify] https://crrev.com/a87e5bcebf6bc0b3a0a8fa3adb692f7125d72f82/tools/cygprofile/lightweight_cygprofile.cc
[modify] https://crrev.com/a87e5bcebf6bc0b3a0a8fa3adb692f7125d72f82/tools/cygprofile/orderfile_generator_backend.py
[modify] https://crrev.com/a87e5bcebf6bc0b3a0a8fa3adb692f7125d72f82/tools/cygprofile/process_profiles.py
[modify] https://crrev.com/a87e5bcebf6bc0b3a0a8fa3adb692f7125d72f82/tools/cygprofile/profile_android_startup.py

Project Member

Comment 7 by bugdroid1@chromium.org, Nov 17 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8330d7f12d5d5d34307b5a100a33a6b4844ccdc3

commit 8330d7f12d5d5d34307b5a100a33a6b4844ccdc3
Author: Benoit Lize <lizeb@chromium.org>
Date: Fri Nov 17 16:25:17 2017

android: Don't handcode CAS in cygprofile.

Cygprofile code does a compare and swap on an atomic value. Use the STL
atomic_fetch_or_explicit() instead of hand-coding a CAS.

Bug:  776702 
Change-Id: I049dbbb5f67255b6556d6beae37f243030561079
Reviewed-on: https://chromium-review.googlesource.com/776682
Reviewed-by: Egor Pasko <pasko@chromium.org>
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#517421}
[modify] https://crrev.com/8330d7f12d5d5d34307b5a100a33a6b4844ccdc3/tools/cygprofile/lightweight_cygprofile.cc

Project Member

Comment 8 by bugdroid1@chromium.org, Nov 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/328af70306c1e9a992bda8b4cd3af88b03c1b9df

commit 328af70306c1e9a992bda8b4cd3af88b03c1b9df
Author: Benoit Lize <lizeb@chromium.org>
Date: Mon Nov 27 10:43:54 2017

android: Record ordering in lightweight_cygprofile.cc.

lightweight_cygprofile.cc doesn't take ordering into account, as it uses
a simple bitfield to record which locations have been reached in the
executable code. This is detrimental to startup performance, so this CL
adds ordering to it.

Changes:
- Adds an ordered list of reached locations "behind" the bitfield. When
  a new element is inserted in the bitfield, add it to the list of reached locations.
- Updates the processing to keep ordering, and deal with the new output
  format.

This increases memory usage by ~4MB per process, as the reached location
list is preallocated, but only for profiling builds. This would
drastically shrink once post-inlining instrumentation is available in
clang, as Chrome has <<1M functions reached per process.

Bug:  776702 
Change-Id: Ib6f0fd2fd83d91167073453bac47f79050249248
Reviewed-on: https://chromium-review.googlesource.com/782220
Reviewed-by: Egor Pasko <pasko@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#519269}
[modify] https://crrev.com/328af70306c1e9a992bda8b4cd3af88b03c1b9df/tools/cygprofile/lightweight_cygprofile.cc
[modify] https://crrev.com/328af70306c1e9a992bda8b4cd3af88b03c1b9df/tools/cygprofile/process_profiles.py
[modify] https://crrev.com/328af70306c1e9a992bda8b4cd3af88b03c1b9df/tools/cygprofile/process_profiles_unittest.py

Project Member

Comment 9 by bugdroid1@chromium.org, Nov 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/417487eaa0c0937760b2ccb33c964f1317b346d6

commit 417487eaa0c0937760b2ccb33c964f1317b346d6
Author: Benoit Lize <lizeb@chromium.org>
Date: Mon Nov 27 16:03:20 2017

tools/cygprofile: Add unit and performance tests to lightweight_cygprofile.

This adds unit tests to lightweight_cygprofile, and performance tests,
similar to the ones for cygprofile.cc. Using the same tests as the ones
in 72aed26f38a0f5a7be558eb44ba06d7739d77e72 on the same device (Nexus
5X):

cygprofile.cc (from 72aed26f38a0f5a7be558eb44ba06d7739d77e72):
[ RUN      ] CygprofilePerfTest.CreateEntries_10_10000
*RESULT AddEntryCostPerCall: = 638.81 ns
[ RUN      ] CygprofilePerfTest.CreateEntries_100_10000
*RESULT AddEntryCostPerCall: = 242.751 ns
[ RUN      ] CygprofilePerfTest.CreateEntries_10_100000
*RESULT AddEntryCostPerCall: = 694.71 ns
[ RUN      ] CygprofilePerfTest.CreateEntries_100_1000000
*RESULT AddEntryCostPerCall: = 558.0389 ns

lightweight_cygprofile.cc:
[----------] 8 tests from CygprofilePerfTest
[ RUN      ] CygprofilePerfTest.RecordAddress_10_10000
*RESULT AddEntryCostPerCall_10_10000_1: = 5.9 ns
[       OK ] CygprofilePerfTest.RecordAddress_10_10000 (0 ms)
[ RUN      ] CygprofilePerfTest.RecordAddress_100_10000
*RESULT AddEntryCostPerCall_100_10000_1: = 4.746 ns
[       OK ] CygprofilePerfTest.RecordAddress_100_10000 (5 ms)
[ RUN      ] CygprofilePerfTest.RecordAddress_10_100000
*RESULT AddEntryCostPerCall_10_100000_1: = 5.975 ns
[       OK ] CygprofilePerfTest.RecordAddress_10_100000 (6 ms)
[ RUN      ] CygprofilePerfTest.RecordAddress_100_100000
*RESULT AddEntryCostPerCall_100_100000_1: = 5.0782 ns
[       OK ] CygprofilePerfTest.RecordAddress_100_100000 (51 ms)

Increasing the number of threads from 1 to 6:
[ RUN      ] CygprofilePerfTest.RecordAddress_1000_100000_2
*RESULT AddEntryCostPerCall_1000_100000_2: = 2.45617 ns
[       OK ] CygprofilePerfTest.RecordAddress_1000_100000_2 (246 ms)
[ RUN      ] CygprofilePerfTest.RecordAddress_1000_100000_3
*RESULT AddEntryCostPerCall_1000_100000_3: = 2.39735 ns
[       OK ] CygprofilePerfTest.RecordAddress_1000_100000_3 (240 ms)
[ RUN      ] CygprofilePerfTest.RecordAddress_1000_100000_4
*RESULT AddEntryCostPerCall_1000_100000_4: = 2.3979 ns
[       OK ] CygprofilePerfTest.RecordAddress_1000_100000_4 (240 ms)
[ RUN      ] CygprofilePerfTest.RecordAddress_1000_100000_6
*RESULT AddEntryCostPerCall_1000_100000_6: = 3.88468 ns
[       OK ] CygprofilePerfTest.RecordAddress_1000_100000_6 (390 ms)
[----------] 8 tests from CygprofilePerfTest (1181 ms total)

Overall, lightweight_cygprofile.cc is ~100x faster, even with
contention (~600ns to ~5ns per call).

Bug:  776702 
Change-Id: I31fa5844a625443b252c18eebd4799be6d045141
Reviewed-on: https://chromium-review.googlesource.com/788953
Reviewed-by: agrieve <agrieve@chromium.org>
Reviewed-by: Egor Pasko <pasko@chromium.org>
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#519311}
[modify] https://crrev.com/417487eaa0c0937760b2ccb33c964f1317b346d6/base/android/library_loader/anchor_functions.h
[modify] https://crrev.com/417487eaa0c0937760b2ccb33c964f1317b346d6/tools/cygprofile/BUILD.gn
[add] https://crrev.com/417487eaa0c0937760b2ccb33c964f1317b346d6/tools/cygprofile/delayed_dumper.cc
[modify] https://crrev.com/417487eaa0c0937760b2ccb33c964f1317b346d6/tools/cygprofile/lightweight_cygprofile.cc
[add] https://crrev.com/417487eaa0c0937760b2ccb33c964f1317b346d6/tools/cygprofile/lightweight_cygprofile.h
[add] https://crrev.com/417487eaa0c0937760b2ccb33c964f1317b346d6/tools/cygprofile/lightweight_cygprofile_perftest.cc

Project Member

Comment 10 by bugdroid1@chromium.org, Nov 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e24c2fd0c9054d0d971978a981278c64f0c4a086

commit e24c2fd0c9054d0d971978a981278c64f0c4a086
Author: Benoit Lize <lizeb@chromium.org>
Date: Wed Nov 29 09:57:28 2017

tools/cygprofile: Order dumps by timestamp.

Add the timestamp to the filenames, and process them in increasing process
creation time order. This changes the ordering so that symbols accessed by the
browser process are at the beginning.

Bug:  776702 
Change-Id: I5e6b9eb61364ed194186900afa64fdd459e65fbe
Reviewed-on: https://chromium-review.googlesource.com/790991
Commit-Queue: Benoit L <lizeb@chromium.org>
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Reviewed-by: Egor Pasko <pasko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#520057}
[modify] https://crrev.com/e24c2fd0c9054d0d971978a981278c64f0c4a086/tools/cygprofile/delayed_dumper.cc
[modify] https://crrev.com/e24c2fd0c9054d0d971978a981278c64f0c4a086/tools/cygprofile/process_profiles.py
[modify] https://crrev.com/e24c2fd0c9054d0d971978a981278c64f0c4a086/tools/cygprofile/process_profiles_unittest.py

Project Member

Comment 11 by bugdroid1@chromium.org, Nov 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/073963a73a7dacee64c61a1410c94e4662d554fc

commit 073963a73a7dacee64c61a1410c94e4662d554fc
Author: Benoît Lizé <lizeb@chromium.org>
Date: Thu Nov 30 18:58:55 2017

android: use "after inlining" function instrumentation.

85023f0253e73844e0bdf00f134ddbeeae057aed rolled clang past r318199
adding an option to insert instrumentation calls after inlining. This
drastically decreases the instrumented build size:

before:
-rwxr-x--- 1 lizeb eng 257M Nov 29 14:20 libchrome.so*

after:
-rwxr-x--- 1 lizeb eng 67M Nov 29 14:49 libchrome.so*

This should reduce compilation time on the orderfile bot.

Bug:  776702 
Change-Id: I37549e5f6611286ff09238d428b05458f8f2397c
Reviewed-on: https://chromium-review.googlesource.com/796478
Reviewed-by: agrieve <agrieve@chromium.org>
Reviewed-by: Egor Pasko <pasko@chromium.org>
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Commit-Queue: Egor Pasko <pasko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#520626}
[modify] https://crrev.com/073963a73a7dacee64c61a1410c94e4662d554fc/build/config/android/BUILD.gn

Project Member

Comment 12 by bugdroid1@chromium.org, Dec 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/96236669dbf109020e839ec68f0be2b0223d8855

commit 96236669dbf109020e839ec68f0be2b0223d8855
Author: Benoit Lize <lizeb@chromium.org>
Date: Fri Dec 08 17:49:14 2017

tools/cygprofile: Tool to compare orderfiles.

This allows to see whether a given orderfile roll significantly changed
the orderfile, and also to check orderfile stability (as a set of symbols).

For instance, a typical roll would be:
$ python tools/cygprofile/compare_orderfiles.py --from-commit
bb04f2f86c5f1880d193d06279e23fb700bd6fb8
[...]
Symbols count:
        first:  190265
        second: 190276
New symbols = 33
Removed symbols = 22

Whereas for a more significant one (see crbug.com/792828):
$ python tools/cygprofile/compare_orderfiles.py --from-commit
4c1e7a85c8bac01731203990fd4b66a7b5b19dd9
[...]
Symbols count:
        first:  428730
        second: 190074
New symbols = 16952
Removed symbols = 255608

Bug: 792828,758566, 776702 
Change-Id: I9a3676874218ba651670136774214184c616a792
Reviewed-on: https://chromium-review.googlesource.com/817555
Reviewed-by: Egor Pasko <pasko@chromium.org>
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#522815}
[add] https://crrev.com/96236669dbf109020e839ec68f0be2b0223d8855/tools/cygprofile/compare_orderfiles.py

Project Member

Comment 13 by bugdroid1@chromium.org, Dec 14 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1c99c3dbef26ab8b853524b889bfd5dec815a2dc

commit 1c99c3dbef26ab8b853524b889bfd5dec815a2dc
Author: Benoit Lize <lizeb@chromium.org>
Date: Thu Dec 14 18:40:21 2017

tools/cygprofile: use compact (bare) instrumentation.

https://chromium-review.googlesource.com/c/chromium/src/+/823219 rolled
clang past the revision including -finstrument-function-entry-bare. This
reduces the instrumentation size and runtime overhead.

Use this new instrumentation when possible.

Size of libchrome.so:
Before: 67MB
After: 55MB
No Instrumentation: 47MB

Bug:  776702 
Change-Id: I74bea84e7e6b0fa98c841ce0f2c492b6a7a7e31f
Reviewed-on: https://chromium-review.googlesource.com/826623
Reviewed-by: agrieve <agrieve@chromium.org>
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#524124}
[modify] https://crrev.com/1c99c3dbef26ab8b853524b889bfd5dec815a2dc/build/config/android/BUILD.gn
[modify] https://crrev.com/1c99c3dbef26ab8b853524b889bfd5dec815a2dc/tools/cygprofile/lightweight_cygprofile.cc

Project Member

Comment 14 by bugdroid1@chromium.org, Jan 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/69ba2b2764cac402ad10ec4d35094b1f12f02c25

commit 69ba2b2764cac402ad10ec4d35094b1f12f02c25
Author: Benoit Lize <lizeb@chromium.org>
Date: Thu Jan 18 17:40:18 2018

tools/cygprofile: Enable lightweight instrumentation by default.

Bug:  776702 
Change-Id: I417ce9ebf6f5117b5062531d8d57b65a64a4d69b
Reviewed-on: https://chromium-review.googlesource.com/873640
Reviewed-by: Matthew Cary <mattcary@chromium.org>
Reviewed-by: Egor Pasko <pasko@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#530186}
[modify] https://crrev.com/69ba2b2764cac402ad10ec4d35094b1f12f02c25/tools/cygprofile/orderfile_generator_backend.py

Comment 15 by lizeb@chromium.org, Mar 14 2018

Status: Fixed (was: Started)

Sign in to add a comment