Reduce noise on startup.mobile benchmark
Issue description

The noise (stdev over 10-15 startups) on messageloop_start_time alone ranges from 30ms (good, this is what we want) to 150ms (bad, unfortunately).

Other observations so far (no robust study to summarize yet, sorry):
* Nexus 5X is a really difficult device for benchmarking: in addition to using the thermally-challenged A57 cores, it has a bad design at the device scale (a heatpipe directs all the heat to the fingerprint sensor).
* [N5X] After >10 startup iterations, the metrics report slower and slower starts (elapsed time is an influencing factor for the metrics).
* [N5X] We already mitigate this by cooling down the device and lowering the CPU/GPU frequency on the device, but there is still an upward trend in the timeline.
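To make the two quantities above concrete, here is a minimal sketch of how the per-cycle noise (sample stdev) and the upward trend (least-squares slope per iteration) could be computed from a list of messageloop_start_time samples. The function names and the sample values are hypothetical and are not taken from Telemetry or from bot data.

```python
import statistics


def startup_noise_ms(samples_ms):
    """Return (mean, sample stdev) of per-startup timings in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two startups to estimate stdev")
    return statistics.mean(samples_ms), statistics.stdev(samples_ms)


def trend_ms_per_run(samples_ms):
    """Least-squares slope of timing vs. iteration index (ms per iteration)."""
    n = len(samples_ms)
    if n < 2:
        raise ValueError("need at least two startups to estimate a trend")
    x_mean = (n - 1) / 2
    y_mean = statistics.mean(samples_ms)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples_ms))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Hypothetical timings for illustration only, not measured data.
samples = [520.0, 545.0, 530.0, 610.0, 525.0, 540.0, 580.0, 535.0, 550.0, 565.0]
mean_ms, stdev_ms = startup_noise_ms(samples)
slope = trend_ms_per_run(samples)
print(f"mean={mean_ms:.1f}ms stdev={stdev_ms:.1f}ms trend={slope:+.2f}ms/run")
```

A positive slope over a cycle would correspond to the "slower and slower start" observation, while the stdev is the noise figure quoted in this bug.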
,
Sep 6
I made a somewhat orthogonal mini-study today to determine whether the non-ideal pagecache clearing done by possible_browser.FlushOsPageCaches() affects our metrics. The answer is: yes, we would have lower noise if we made pagecache flushing more accurate. More details in this NOT_FOR_COMMIT: https://chromium-review.googlesource.com/c/chromium/src/+/1210643
,
Sep 6
,
Sep 10
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/b8534089aad3914d68c3e8e725b13c5466a591e5

commit b8534089aad3914d68c3e8e725b13c5466a591e5
Author: Egor Pasko <pasko@chromium.org>
Date: Mon Sep 10 15:28:10 2018

LibraryLoader: Add residency percentage to trace

This should help triage cases when library prefetching takes too long or too little.

Bug: 881384
Change-Id: I315020758bb702578237fae6de5c776e27c367d4
Reviewed-on: https://chromium-review.googlesource.com/1215802
Reviewed-by: Benoit L <lizeb@chromium.org>
Reviewed-by: agrieve <agrieve@chromium.org>
Commit-Queue: Egor Pasko <pasko@chromium.org>
Cr-Commit-Position: refs/heads/master@{#589920}

[modify] https://crrev.com/b8534089aad3914d68c3e8e725b13c5466a591e5/base/android/java/src/org/chromium/base/TraceEvent.java
[modify] https://crrev.com/b8534089aad3914d68c3e8e725b13c5466a591e5/base/android/java/src/org/chromium/base/library_loader/LibraryLoader.java
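As an illustration of what a "residency percentage" means, the sketch below computes the share of pages that are resident from one boolean per page. This is an assumption about the definition only: the actual change is in Java (LibraryLoader.java), and the function name and input shape here are hypothetical.

```python
def residency_percent(page_resident_flags):
    """Percentage of pages that are resident, given one boolean per page."""
    total = len(page_resident_flags)
    if total == 0:
        return 0.0
    resident = sum(1 for flag in page_resident_flags if flag)
    return 100.0 * resident / total


# Hypothetical example: 9 of 20 pages are already in memory.
flags = [True] * 9 + [False] * 11
print(f"residency: {residency_percent(flags):.0f}%")
```

Under this definition, a coldish start that reports 100% residency before prefetching would suggest the page cache was not actually evicted.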
,
Sep 10
Bots started producing data: https://chromeperf.appspot.com/report?sid=62b401381da7a399cd6ce4b342b11a2194833036acb7e64253a1719afc4351da

Between cycles the noise varies a lot for N5X; I will ask to cool down more often.
,
Sep 10
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/4407996779cf2d9f4e6146fb560fc98dce37e566

commit 4407996779cf2d9f4e6146fb560fc98dce37e566
Author: Egor Pasko <pasko@chromium.org>
Date: Mon Sep 10 20:24:40 2018

startup.mobile: stricter battery temperature requirements

SHERIFFS: This could make some benchmarks run faster on continuous bots because the device gets cooled down. The benchmark touched by this change is _not_ monitored, but the benchmarks running _after_ this one may see a change. This is expected.

Traces from experiment [1] showed that waiting for the battery to cool down to 32C happens every 7 runs, adds a couple of minutes of waiting, and does not remove the upward trend on metrics entirely. During those cool runs the noise was significantly lower than on bots, using the same hardware (N5X). Trying to see if this value makes bots happier.

Cooling down seems unnecessary on N5 and on Go devices, but it would be good to keep an eye on them anyway and see later how to properly make this device-dependent, if there is need.

[1] NOT_FOR_COMMIT: Sleep before/after pagecache flush
https://chromium-review.googlesource.com/c/chromium/src/+/1210643

Bug: 881384
Change-Id: I304437b4231503d7a5d461ee00190bdb8ee5f5cd
Reviewed-on: https://chromium-review.googlesource.com/1217035
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#590039}

[modify] https://crrev.com/4407996779cf2d9f4e6146fb560fc98dce37e566/tools/perf/benchmarks/startup_mobile.py
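The cool-down requirement described in this commit can be pictured as a polling loop with a timeout. The sketch below is not the code in startup_mobile.py: the function name, the poll interval, and the timeout are assumptions, and the temperature reader is injected so the example runs without a device. Only the 32C target comes from the commit message.

```python
import time


def wait_for_cooldown(read_temp_c, target_c=32.0, poll_s=5.0, timeout_s=300.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll read_temp_c() until it is <= target_c.

    Returns True if the target was reached, False if timeout_s elapsed first.
    """
    deadline = clock() + timeout_s
    while True:
        if read_temp_c() <= target_c:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Hypothetical battery readings; a real reader would query the device.
readings = iter([36.0, 34.0, 31.5])
cooled = wait_for_cooldown(lambda: next(readings), sleep=lambda seconds: None)
print("cooled down:", cooled)
```

Returning False on timeout rather than blocking forever keeps a hot device from stalling the whole bot cycle; whether the real benchmark does this is not stated in the thread.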
,
Sep 13
,
Sep 13
The following revision refers to this bug:
https://chromium.googlesource.com/catapult/+/cf71ed1bbf589b0d330519b591f26e96a0d58804

commit cf71ed1bbf589b0d330519b591f26e96a0d58804
Author: Egor Pasko <pasko@chromium.org>
Date: Thu Sep 13 15:23:44 2018

clear_system_cache: update binaries

This should incorporate the commit http://crrev.com/590266 into the binaries. The commit description says:
"clear_system_cache: Sync dirty pages before ClearCacheForFile(s)"

The chromium tree is checked out at:
chromium@9b5309d1e2ef08e0755efb879029c11be8b65f68

Note: OSX binaries would also be nice to update, but I don't have convenient machines to do so. The change is no-op for Windows - no need to update.

I always forget the steps to update the binaries, listing them here for posterity:

================== begin steps to refresh the binaries ==================
// In src/:
shell> cat gn_linux/Release/args.gn
target_os = "linux"
is_debug = false
is_component_build = false
enable_nacl = false
use_goma = true
shell> autoninja -C gn_linux/Release clear_system_cache
[...]
shell> cat gn_android/Release/args.gn
target_os = "android"
target_cpu = "arm"
use_goma = true
is_debug = false
is_component_build = false
shell> autoninja -C gn_android/Release/ clear_system_cache
[...]
shell> cat gn_android_arm64/Release/args.gn
target_os = "android"
target_cpu = "arm64"
use_goma = true
is_debug = false
is_component_build = false
shell> autoninja -C gn_android_arm64/Release/ clear_system_cache
[...]

// In src/third_party/catapult/:
shell> telemetry/bin/update_telemetry_dependency \
    ../../gn_linux/Release/clear_system_cache \
    clear_system_cache linux_x86_64
shell> telemetry/bin/update_telemetry_dependency ../../gn_android/Release/clear_system_cache \
    clear_system_cache android_armeabi-v7a
shell> telemetry/bin/update_telemetry_dependency \
    ../../gn_android_arm64/Release/clear_system_cache \
    clear_system_cache android_arm64-v8a
==================== end steps to refresh the binaries ==================

Bug: chromium:881384
Change-Id: I9e6f51a07d259c6cf3cc62950c88c46948ccfb0c
Reviewed-on: https://chromium-review.googlesource.com/1224513
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>
Commit-Queue: Egor Pasko <pasko@chromium.org>

[modify] https://crrev.com/cf71ed1bbf589b0d330519b591f26e96a0d58804/telemetry/telemetry/internal/binary_dependencies.json
,
Sep 13
Now the downstream bots started uploading data: https://chromeperf.appspot.com/report?sid=fb2fd44a0c3cb14cd0f969027f84a0a50ea696f57c57e5f3dd0dc990b78e89c8

health-plan-clankium-phone (N5):
* GOOD!
* asyncPrefetchLibrariesToMemory takes 400-500ms (wow something realistic finally)
* coldish_stdev: 32ms
* warm_stdev: 50ms

perf-monochrome-n-phone (N5X):
* BROKEN! coldish (100% of the code is resident when starting - need to investigate)
* messageloop_start_time_avg: 530-580ms
* coldish_stdev: 93ms, warm_stdev: 7-8ms

health-plan-clankium-low-end-phone (sprout):
* NOISY!
* warm_stdev: 212ms
* coldish_stdev: 319ms

perf-go-phone-1024 (gobo):
* BROKEN! coldish
* coldish_stdev: 52ms
* warm_stdev: 46ms

perf-go-phone-512 (gobo):
* NOISY!
* asyncPrefetchLibrariesToMemory residency is between 45% and 90%
* coldish_stdev: 274ms
* warm_stdev: 106ms
,
Sep 13
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/7b2b2dca23cca0862f674758c9a3933e685c27d5

commit 7b2b2dca23cca0862f674758c9a3933e685c27d5
Author: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Date: Thu Sep 13 18:48:20 2018

Roll src/third_party/catapult 994bb1389b1d..5f6da8a57db1 (3 commits)

https://chromium.googlesource.com/catapult.git/+log/994bb1389b1d..5f6da8a57db1

git log 994bb1389b1d..5f6da8a57db1 --date=short --no-merges --format='%ad %ae %s'
2018-09-13 chiniforooshan@chromium.org Telemetry: fix a SF stats collector bug
2018-09-13 cbruni@chromium.org Include hash for wprgo archive names to reduce name collisions
2018-09-13 pasko@chromium.org clear_system_cache: update binaries

Created with:
  gclient setdep -r src/third_party/catapult@5f6da8a57db1

The AutoRoll server is located here: https://autoroll.skia.org/r/catapult-autoroll

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG=chromium:881873,chromium:878390,chromium:881384
TBR=sullivan@chromium.org

Change-Id: Ic86872f9c2fef3f27bfc693e9927e56235561f4b
Reviewed-on: https://chromium-review.googlesource.com/1224596
Reviewed-by: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#591088}

[modify] https://crrev.com/7b2b2dca23cca0862f674758c9a3933e685c27d5/DEPS
,
Sep 14
Thanks for the investigation in #9, and keeping up the work to fix these. +ushesh FYI - as these will become the new startup numbers to report in system health.
,
Sep 14
The following revision refers to this bug:
https://chromium.googlesource.com/catapult/+/9f36d9f788130761c3611060bc7a104772761d15

commit 9f36d9f788130761c3611060bc7a104772761d15
Author: Egor Pasko <pasko@chromium.org>
Date: Fri Sep 14 13:24:41 2018

FlushOsPageCaches: wait for dust to settle

In [1] we found extra evidence that the OS pagecache flush is not fully complete after possible_browser.FlushOsPageCaches() returns. A best-effort mitigation is in flight: [2]. This is another one, which is done at a level a bit higher to avoid pausing after each invocation of the clear_system_cache tool.

[1] NOT_FOR_COMMIT: Sleep before/after pagecache flush
https://chromium-review.googlesource.com/c/chromium/src/+/1210643
[2] clear_system_cache: Sync dirty pages before ClearCacheForFile(s)
https://chromium-review.googlesource.com/c/chromium/src/+/1211602

Bug: chromium:811244
Bug: chromium:881384
Change-Id: I46cd699cdbd163fa7ca2470d8cbd580683dea2ad
Reviewed-on: https://chromium-review.googlesource.com/1216346
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>
Commit-Queue: Egor Pasko <pasko@chromium.org>

[modify] https://crrev.com/9f36d9f788130761c3611060bc7a104772761d15/telemetry/telemetry/core/platform.py
[modify] https://crrev.com/9f36d9f788130761c3611060bc7a104772761d15/telemetry/telemetry/internal/backends/chrome/chrome_browser_backend.py
[modify] https://crrev.com/9f36d9f788130761c3611060bc7a104772761d15/telemetry/telemetry/internal/browser/possible_browser.py
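The idea of pausing once at a higher level, rather than after every clear_system_cache invocation, might look roughly like the sketch below. It does not reproduce the Telemetry implementation: the function name, the settle duration, and the callables standing in for cache-clearing steps are all hypothetical.

```python
import time


def flush_then_settle(clear_steps, settle_s=2.0, sleep=time.sleep):
    """Run every cache-clearing step, then pause once for the system to settle.

    clear_steps: callables, each standing in for one tool invocation.
    settle_s: assumed pause length; the real value is not given in this thread.
    """
    for step in clear_steps:
        step()
    sleep(settle_s)


# Stand-in steps that only record the order of operations.
log = []
flush_then_settle(
    [lambda: log.append("clear browser files"),
     lambda: log.append("clear profile files")],
    settle_s=2.0,
    sleep=lambda seconds: log.append(f"settle {seconds}s"),
)
print(log)
```

With N clearing steps this costs one pause instead of N, which is the trade-off the commit message describes.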
,
Sep 14
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/8323f92cb0c81332c00abfe5a3bad012a400fd35

commit 8323f92cb0c81332c00abfe5a3bad012a400fd35
Author: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Date: Fri Sep 14 16:52:21 2018

Roll src/third_party/catapult 3e071665b9f9..9f36d9f78813 (14 commits)

https://chromium.googlesource.com/catapult.git/+log/3e071665b9f9..9f36d9f78813

git log 3e071665b9f9..9f36d9f78813 --date=short --no-merges --format='%ad %ae %s'
2018-09-14 pasko@chromium.org FlushOsPageCaches: wait for dust to settle
2018-09-14 ulan@chromium.org Revert "Output DevTools error messages as warnings while running a story."
2018-09-13 simonhatch@chromium.org Dashboard - Error out on empty or uncompressed uploads
2018-09-13 sadrul@chromium.org rendering: Ignore trace-events for canceled draws.
2018-09-13 vovoy@chromium.org Only download files for filtered stories in story_runner
2018-09-13 chiniforooshan@chromium.org Telemetry: fix a SF stats collector bug
2018-09-13 cbruni@chromium.org Include hash for wprgo archive names to reduce name collisions
2018-09-13 pasko@chromium.org clear_system_cache: update binaries
2018-09-13 anthonyalridge@google.com Create a link to traces from the CFG.
2018-09-13 anthonyalridge@google.com Add clip path at to prevent overlap with x axis label.
2018-09-13 wangge@google.com Add functions to include APK, generate isolate and upload it.
2018-09-13 cbruni@chromium.org Add more helpers in preparation for v8.loading.cluster_telemetry benchmark
2018-09-13 anthonyalridge@google.com Add padding to numeric labels for stacked bar plotter.
2018-09-13 ulan@chromium.org Output DevTools error messages as warnings while running a story.

Created with:
  gclient setdep -r src/third_party/catapult@9f36d9f78813

The AutoRoll server is located here: https://autoroll.skia.org/r/catapult-autoroll

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG=chromium:811244,chromium:881384,chromium:883892,chromium:880432,chromium:883735,chromium:883592,chromium:882291,chromium:881873,chromium:878390,chromium:881384,chromium:866423,chromium:866423,chromium:863390,chromium:883322,chromium:866423,chromium:880432
TBR=sullivan@chromium.org

Change-Id: I934e7e9b842518c75edbeb4a1604c2e06b577861
Reviewed-on: https://chromium-review.googlesource.com/1226380
Reviewed-by: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#591375}

[modify] https://crrev.com/8323f92cb0c81332c00abfe5a3bad012a400fd35/DEPS
,
Sep 17
In #9 I looked at noise in the latest uploaded points. Here are the timelines:
https://chromeperf.appspot.com/report?sid=39315879ca04233d2896f6d5ba68ce41c6bd9c4e7ce2cd849f72342d2881a646

health-plan-clankium-phone (N5):
* apparently it is not as good as I painted in #9: there are huge stdev spikes, and those are something new

perf-monochrome-n-phone (N5X):
* besides being BROKEN, noise on 'coldish' did not improve

perf-go-phone-1024 (gobo):
* stdev went down by 18ms (or it did not), actually not too bad (35-50ms)
* coldish value went up (probably seeing results of better pagecache eviction)

perf-go-phone-512 (gobo):
* nothing changed, still crazy noisy and not useful

I'll take a closer look at:
1. those jumps on N5 (maybe systrace will give a hint?)
2. brokenness with Monochrome/N5X, this may be related to noise (that I am not seeing locally)
,
Sep 28
,
Oct 1
The following revision refers to this bug:
https://chromium.googlesource.com/catapult/+/ad0631532d0cb16d2a72465147a5a8142ad0919a

commit ad0631532d0cb16d2a72465147a5a8142ad0919a
Author: Egor Pasko <pasko@chromium.org>
Date: Mon Oct 01 15:17:05 2018

Avoid running Android background jobs

**PERF SHERIFFS**: this should move metrics on Android. Likely startup, but there could be something else affected, like loading.

Use the newly introduced flag in http://crrev.com/595188 to prevent spurious jobs from being created when benchmarking. Those jobs do little, but lead to Chrome process being cached, which artificially inflates the metrics that are based on process start. This should cut down those occasional stdev of 2 seconds.

The bugs provide more context.

Bug: chromium:890424
Bug: chromium:881384
Change-Id: Ifb599bff0e55041495aa87983262dfd0940df226
Reviewed-on: https://chromium-review.googlesource.com/1254101
Commit-Queue: Egor Pasko <pasko@chromium.org>
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>

[modify] https://crrev.com/ad0631532d0cb16d2a72465147a5a8142ad0919a/telemetry/telemetry/internal/backends/chrome/chrome_startup_args.py
,
Oct 1
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/fc8fe0364d1fa8c20e308567a0d1a4340d27e046

commit fc8fe0364d1fa8c20e308567a0d1a4340d27e046
Author: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Date: Mon Oct 01 22:31:14 2018

Roll src/third_party/catapult ac93684421fa..69f64b270397 (4 commits)

https://chromium.googlesource.com/catapult.git/+log/ac93684421fa..69f64b270397

git log ac93684421fa..69f64b270397 --date=short --no-merges --format='%ad %ae %s'
2018-10-01 oysteine@google.com Allow bindId for separate begin/end slices as well
2018-10-01 simonhatch@chromium.org Dashboard - Create histograms per bot for alert statistics
2018-10-01 cbruni@chromium.org [pinpoint] Redirect to raw results.html on job completion
2018-10-01 pasko@chromium.org Avoid running Android background jobs

Created with:
  gclient setdep -r src/third_party/catapult@69f64b270397

The AutoRoll server is located here: https://autoroll.skia.org/r/catapult-autoroll

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG=chromium:890025,chromium:876233,chromium:890424,chromium:881384
TBR=sullivan@chromium.org

Change-Id: I3fa5e360f1c7c98152f5078c58d9d41e34b66ebc
Reviewed-on: https://chromium-review.googlesource.com/1254685
Reviewed-by: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#595601}

[modify] https://crrev.com/fc8fe0364d1fa8c20e308567a0d1a4340d27e046/DEPS
,
Oct 3
The following revision refers to this bug:
https://chromium.googlesource.com/catapult/+/bfe2c00467df0479eb6e87f63015d2a680a8b5d1

commit bfe2c00467df0479eb6e87f63015d2a680a8b5d1
Author: Egor Pasko <pasko@chromium.org>
Date: Wed Oct 03 17:13:19 2018

androidStartupMetric: skip the first start

The first browser start has different behavior for both warm and coldish starts because of less-controlled factors arriving from installation of the APK, more system idle time prior to the first run, etc. Analysing these first starts is more complex because of the said conflating factors, and also this scenario is less common in the field. Ignore it for now.

Bug: chromium:881384
Change-Id: I209a2e9eab1ba283837d16e17a5be932fb8bf33b
Reviewed-on: https://chromium-review.googlesource.com/1256882
Reviewed-by: Ben Hayden <benjhayden@chromium.org>
Commit-Queue: Egor Pasko <pasko@chromium.org>

[modify] https://crrev.com/bfe2c00467df0479eb6e87f63015d2a680a8b5d1/tracing/tracing/metrics/android_startup_metric_test.html
[modify] https://crrev.com/bfe2c00467df0479eb6e87f63015d2a680a8b5d1/tracing/tracing/metrics/android_startup_metric.html
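The effect of skipping the first start can be illustrated with a small sketch. The real metric lives in android_startup_metric.html and is written in JavaScript; the Python below only assumes that timings arrive in chronological order and that the first entry is the one to drop. The function name and the numbers are hypothetical.

```python
import statistics


def skip_first_start(timings_ms):
    """Return the timings with the chronologically first start removed."""
    return list(timings_ms[1:])


# Hypothetical values: the first start after APK installation is an outlier.
timings = [910.0, 540.0, 555.0, 530.0, 548.0]
kept = skip_first_start(timings)
print(f"stdev with first start:    {statistics.stdev(timings):.1f}ms")
print(f"stdev without first start: {statistics.stdev(kept):.1f}ms")
```

When the first start is an outlier, as in this made-up data, dropping it lowers the reported stdev without touching the remaining samples.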
,
Oct 3
,
Oct 4
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/8fbe05fb32cf430baa5ff7f13704f6bee7abeab4

commit 8fbe05fb32cf430baa5ff7f13704f6bee7abeab4
Author: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Date: Thu Oct 04 03:52:41 2018

Roll src/third_party/catapult 2dd914402ebc..0543f082e2f4 (7 commits)

https://chromium.googlesource.com/catapult.git/+log/2dd914402ebc..0543f082e2f4

git log 2dd914402ebc..0543f082e2f4 --date=short --no-merges --format='%ad %ae %s'
2018-10-03 khmel@chromium.org Fix flakiness of tests, that require GAIA login.
2018-10-03 kylechar@chromium.org Support macOS 10.14 Mojave.
2018-10-03 zmo@chromium.org Revert "Wire action_runner's Click through DevTools"
2018-10-03 chiniforooshan@chromium.org Telemetry: clean up legacy surface flinger metrics
2018-10-03 pasko@chromium.org androidStartupMetric: skip the first start
2018-10-03 benjhayden@chromium.org Fix flaky cp-toast.autoclose-only-last test.
2018-10-03 zmo@chromium.org Wire action_runner's Click through DevTools

Created with:
  gclient setdep -r src/third_party/catapult@0543f082e2f4

The AutoRoll server is located here: https://autoroll.skia.org/r/catapult-autoroll

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel

BUG=chromium:879353,chromium:890951,chromium:885912,chromium:890757,chromium:881384,chromium:885912
TBR=sullivan@chromium.org

Change-Id: I6e2ae7a29df4dcab18c0cde1e852ee159a3f12cf
Reviewed-on: https://chromium-review.googlesource.com/c/1260146
Reviewed-by: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Commit-Queue: chromium-autoroll <chromium-autoroll@skia-public.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#596495}

[modify] https://crrev.com/8fbe05fb32cf430baa5ff7f13704f6bee7abeab4/DEPS
,
Nov 20
,
Nov 20
,
Jan 16
,
Jan 16
Comment 1 by nednguyen@chromium.org, Sep 6
Components: Speed>Telemetry