New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 814208 link

Starred by 2 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

6.6%-10.2% regression in system_health.webview_startup at 537095:537242

Project Member Reported by alexclarke@chromium.org, Feb 21 2018

Issue description

See the link to graphs below.
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Feb 21 2018

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=814208

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=680363c0dc04f4f1b305d27bbd3f3f4f2d271a822b7e049b4ffe30d43d492637


Bot(s) for this bug's original alert(s):

android-webview-nexus5X
android-webview-nexus6
😿 Pinpoint job stopped with an error.
https://pinpoint-dot-chromeperf.appspot.com/job/16cc175c440000
Last pinpoint has failed so I kicked another one with a narrower range (537155-537184) based on its partial result.
Project Member

Comment 7 by 42576172...@developer.gserviceaccount.com, Mar 23 2018

📍 Couldn't reproduce a difference.
https://pinpoint-dot-chromeperf.appspot.com/job/159487ce440000
Trying a few more bisects to see if we can repro.
Cc: kbr@chromium.org tetsui@chromium.org adenilso...@arm.com garykac@chromium.org tandrii@chromium.org nedngu...@google.com agrieve@chromium.org dcheng@chromium.org bsheedy@chromium.org d...@chromium.org yzshen@chromium.org m...@chromium.org eirage@chromium.org dtapu...@chromium.org dtrainor@chromium.org wittman@chromium.org dpranke@chromium.org steve...@chromium.org cblume@chromium.org elawrence@chromium.org sky@chromium.org torne@google.com qin...@chromium.org tiborg@chromium.org mdjones@chromium.org asvitk...@chromium.org noel@chromium.org
Owner: yzshen@chromium.org
Status: Assigned (was: Untriaged)
📍 Found significant differences after each of 11 commits.
https://pinpoint-dot-chromeperf.appspot.com/job/16cc175c440000

Revert "[test_env.py] Warm up vpython virtualenv cache on swarming task shards." by kbr@chromium.org
https://chromium.googlesource.com/chromium/src/+/7dde857919af2f59fcab10264d696301a15c64da

Truncate the length of downloaded file name on windows by qinmin@chromium.org
https://chromium.googlesource.com/chromium/src/+/c575e74530267932837e074fa4cdb3df4069d497

Add xr.webvr.live.static benchmark by bsheedy@chromium.org
https://chromium.googlesource.com/chromium/src/+/a8698ddb942e380603beb1b9b15e365d5ce18d27

Fix JNI generator handling of generics. by torne@google.com
https://chromium.googlesource.com/chromium/src/+/1257be13887e153bc570c0a3de05e743c651c8a2

deprecate createTouch and createTouchList by eirage@chromium.org
https://chromium.googlesource.com/chromium/src/+/b36cc2b3601062d644b4a861b84e61d60744e335

Revert "Second reland of "Encapsulate Chrome thread profiling setup into a single class"" by mdjones@chromium.org
https://chromium.googlesource.com/chromium/src/+/92ff5ccfaf16d5e5b662be28e5d655ddccef921e

Add UseCounter for Async Clipboard API by garykac@chromium.org
https://chromium.googlesource.com/chromium/src/+/8e5f0d8f579f9d32d72748630a55725e5835a1d7

Remove TrayPopupInkDropStyle from SystemMenuButton by tetsui@chromium.org
https://chromium.googlesource.com/chromium/src/+/9a0978eba8913bac53a82a4684c6e115c2680bbc

Reland "Move grayscale conversion into NavigationEntryScreenshotManager." by miu@chromium.org
https://chromium.googlesource.com/chromium/src/+/bca2fc37330b1fbf8b9f1f54dc899c5c9eaed987

Compute crc32 using ARMv8 specific instruction by adenilson.cavalcanti@arm.com
https://chromium.googlesource.com/chromium/src/+/28c9623083688b3a354c33bf77746f4c51f58826

Pin loading.desktop.network_service perf tests to a specific bot. by yzshen@chromium.org
https://chromium.googlesource.com/chromium/src/+/9be01562892ccbbbe1745e8b9553dc24b3c70625

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions
Cc: -tandrii@chromium.org
Cc: -mdjones@chromium.org -wittman@chromium.org
Unlikely to be "Revert "Second reland of "Encapsulate Chrome thread profiling setup into a single class"" because no effects were seen on previous land.
Cc: droger@chromium.org roc...@chromium.org rouslan@chromium.org mlamouri@chromium.org jam@chromium.org steimel@chromium.org thakis@chromium.org nasko@chromium.org xiaoche...@chromium.org tikuta@google.com
Owner: steimel@chromium.org
📍 Found significant differences after each of 4 commits.
https://pinpoint-dot-chromeperf.appspot.com/job/1347cf5b440000

Wider usage of strip_absolute_paths_from_debug_symbols by tikuta@google.com
https://chromium.googlesource.com/chromium/src/+/01f7e38c05caa21e2277a6acad779f995c6ac89b

Use a dummy LocalInterfaceProvider for spellcheck test code by xiaochengh@chromium.org
https://chromium.googlesource.com/chromium/src/+/3125666425d0bee78d5172da57351d89a89b7d0b

Remove dead code after r527760. by jam@chromium.org
https://chromium.googlesource.com/chromium/src/+/33ccb80793bc1f632c7929fe6a1f89285156bd06

Show audio controls when video tag has only audio track by steimel@chromium.org
https://chromium.googlesource.com/chromium/src/+/c54ab7cbb7e628b3c48452b6c4c8c931f7d02871

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions

Comment 15 by m...@chromium.org, Apr 4 2018

Cc: -m...@chromium.org
Cc: tkent@chromium.org pdfium-c...@skia-buildbots.google.com.iam.gserviceaccount.com jianli@chromium.org
Owner: jianli@chromium.org
📍 Found significant differences after each of 2 commits.
https://pinpoint-dot-chromeperf.appspot.com/job/16dc9527440000

Roll src/third_party/pdfium/ 46f79aaad..625e6fec9 (1 commit) by pdfium-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com
https://chromium.googlesource.com/chromium/src/+/1c7a6ab994d205d84cab9cd153b16b0edec9efaf

Add CRLF to the beginning of boundary following binary MHTML part by jianli@chromium.org
https://chromium.googlesource.com/chromium/src/+/52c5105d39b7d838daabddf79f0aa4d225bfec27

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions

Comment 17 by noel@chromium.org, Apr 4 2018

Cc: -adenilso...@arm.com cavalcantii@chromium.org
Owner: ----
I don't think my patch "Add CRLF to the beginning of boundary following binary MHTML part" can cause the regression since it only kicks in when a MHTML file is being loaded and parsed.

Comment 19 by torne@chromium.org, Apr 16 2018

Cc: changwan@chromium.org
Components: Mobile>WebView
Status: Available (was: Assigned)
Yeah doesn't look like pinpoint is doing a good job here. WebView also doesn't have anything to do with pdfium.

Changwan, is this something you might be able to look at? I'm not familiar with this infrastructure and don't know how to track down where it regressed.
Cc: perezju@chromium.org
There is nothing much we can do other than bisecting this. cc'ing perezju@ in case he has some insight.

Comment 21 by torne@chromium.org, Apr 17 2018

Components: -Mobile>WebView Mobile>WebView>Perf
Cc: -noel@chromium.org -droger@chromium.org -elawrence@chromium.org -kbr@chromium.org -jianli@chromium.org -jam@chromium.org -tkent@chromium.org -agrieve@chromium.org -qin...@chromium.org -xiaoche...@chromium.org -rouslan@chromium.org -roc...@chromium.org -d...@chromium.org -tetsui@chromium.org -tikuta@google.com -yzshen@chromium.org -dtapu...@chromium.org -cblume@chromium.org -bsheedy@chromium.org -tiborg@chromium.org -dpranke@chromium.org -eirage@chromium.org -mlamouri@chromium.org -steimel@chromium.org -dcheng@chromium.org -sky@chromium.org -nedngu...@google.com -dtrainor@chromium.org -pdfium-c...@skia-buildbots.google.com.iam.gserviceaccount.com -asvitk...@chromium.org -thakis@chromium.org -steve...@chromium.org -garykac@chromium.org -nasko@chromium.org simonhatch@chromium.org
Owner: cavalcantii@chromium.org
Status: Assigned (was: Available)
-folks incorrectly blamed by pinpoint

From the range this seems to be the same as issue 814027, and squinting at the job results in #11 appears to confirm this. Thus assigning to cavalcantii.

Note: this one might be easier to test/reproduce since it does not depend on the internal repo.

+simonhatch FYI as an example where pinpoint did a poor job, but legacy bisect was able to find a culprit (after a few tries!).
Issue 814027 has been merged into this issue.
Cc: noel@chromium.org
Just to confirm: is this issue reproducible *only* in a Nexus 5?
Also: anyone familiar with what is being tested could provide a quick overview on how the test works and what is being measured?
Thanks for the investigation, perezju@!

cavalcantii@, this was observed both on Nexus 6 and on Nexus 5X:

https://chromeperf.appspot.com/group_report?sid=680363c0dc04f4f1b305d27bbd3f3f4f2d271a822b7e049b4ffe30d43d492637


What is being tested:
from https://cs.chromium.org/chromium/src/third_party/catapult/tracing/tracing/metrics/system_health/webview_startup_metric.html?q=webview_url_load_cpu_time&sq=package:chromium&l=26

This seems to track CPU time of WebViewBlankUrlLoadInterval event, which involves navigation time:

https://00e9e64bac2c13138b644d46317737e56a16178f1050468a75-apidata.googleusercontent.com/download/storage/v1/b/chrome-telemetry-output/o/load_chrome_blank_2018-02-27_00-29-14_23911.html?qk=AD5uMEu8enaDBJnsPnlm7qUohDriR9LHABkgR4VaKiQA8bPGFjv6A6WxZuZy5LMB_r4gxiFSzL_EcXKfyBNMdfoo8k5MEV71SMCy7-V1-PZDjUce2UTw5lTzk2n330leLArX-kwAti5pTl9Kq4Bo07I42tCz4ROVBEpMQ2Ppzq_hqvUDaAkIDE2undacOU3Dy5tvg5SEskgFZ-m3rPyrWMzEWWEt6YeSMC8ZLKo5bzRuPicWafEiHoGBG4WKW5y2HOAFsfpxRkFW07Vq7NpfWw6JsMs8uTcUsVNzHHQnVASfAUedlOYhZ8atNIrdzdb7HjemXqVMSbaqclJhaxWD9PNFChlMjgHfBvN1v0kFqEky5F87-0w_4ciR0ldVAzg_bRoHbonnoxxg9R2vi6yz7BrjQMkfl8xR7a-jfnQBKev_a_QT-hyKcxBUXJBabv1Uzs4nzm7-j5emR4hbaKPYPAVgadhNI8dAlTUK0jGXUi11m7lpdaDUbiLm1htMGQuD4yeoNefhWQhZXB6ku2YvFIOnNATosxcTx-oVtva4j0JxFePklK38Ofayn3onEXgNmT5luwO6bPV6IzMJeqgV4ny2_QiY92kzpfYloIpvVad6qwawpxa-baphjt9y3gfoGA2GbILrFiVwnmali5LzlRUnWwTGvTgCXViadsB_8_f9RLW2EdHXWUT-5dRjBEpSkBSPCHqX4EJcfAGbH6bId_apHAV45Xdv7bnbELgwCY62Gwohp1efuyzXzwe89op67AoM0fw7rKFraHf6Qtbp_CbGsx8DLsR_AlfFOQ9Np0jwvZm8n4Y6bs8

(You can also click on any vertext from the graphs above, and click through 'View trace from this run' link.)

This is somewhat subtle, though - tobiasjs@ and torne@ could you explain the significance of this metric?

Clicking on 'View trace from this run' returns:
Forbidden
Error 403
Re 28, could you try again with a Google account?
I'm afraid I can't, as I'm an external contributor.

I got WebViewInstrumentation.apk to build with ToT and installed in a Nexus 6, but it seems that the device has to be rooted to run the benchmark? i.e. "AssertionError: Android device must be rooted to run Telemetry"

Link: https://gist.github.com/Adenilson/92ae375e5fe227ae869602e1a19e3a95

perezju@, do you know a way to grant perf trace access to an external contributor?

According to https://github.com/catapult-project/catapult/blob/master/telemetry/docs/run_benchmarks_locally.md,

yes, you need root privilege.
@Changwan: there are quite a few ways to root an Android device. Is there a recommended approach used by telemetry?

I'm trying to reproduce correctly the device configuration to be able to reproduce the bug.
Our test devices are just OEM-unlocked and flashed with userdebug builds of android (the only supported/legit method). Google don't provide prebuilt userdebug images publicly, however, so you'd have to build your own from AOSP or find one somewhere :|

How confident are we that this is related to cavalcantii's change? This is going to be a nontrivial amount of pain for an external contributor to repro and test.
Cc: cblume@chromium.org
Labels: Performance-Startup
Re 27, 31, the (unobfuscated) links to some traces on a Nexus 6:

(before)
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/load_chrome_blank_2018-02-15_23-32-57_73826.html

(after)
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/load_chrome_blank_2018-02-16_08-00-07_7580.html

+nednguyen for access to chrome-telemetry-output for an external contributor.

For convenience I've just downloaded those two files and attached them here.

> How confident are we that this is related to cavalcantii's change? This is going to be a nontrivial amount of pain for an external contributor to repro and test.

Pretty confident. It was also confirmed by bisect on internal perf bots where it registered a clear 7% increase in startup time.

noel and +cblume, since you reviewed r537179, could you assist cavalcantii debugging this regression?
before.html
4.6 MB View Download
after.html
4.6 MB View Download
As I've explained in (https://bugs.chromium.org/p/chromium/issues/detail?id=814027), there is indeed a potential for regression on initial startup on ARMv7/ARMv8 given that now we got perform CPU features detection (that is blocked in a pthread_once call).

The ARMv8 crypto extensions are optional on the first release of the architecture, becoming mandatory in the second release (i.e. ARMv8.1). Even though the majority of Android devices featuring an ARMv8 processor released in the last 3 years (2015 onwards) have the crypto extensions, we have to perform the check, there is no way around it.

It certainly doesn't help the fact that Android will either load/parse /proc/cpuinfo or instead do a dload/dlopen to perform this task (i.e. android_getCpuFeatures()).

This startup price is paid back with interest given that using the ARMv8 CRC32 instruction allows zlib dependent code (e.g. cronet + Chromium) to decompress gzip content in average 29% faster (for reference, please see: https://docs.google.com/spreadsheets/d/1l2wSfsLdAsDDVEZTXhq7hlF9074Us6GH3R7Pru2gd30/edit?pli=1#gid=595603569). 

That means the majority of grade A websites (e.g. google, cnn, bbc, etc) that serves compressed content (i.e. content-encoding: gzip) will benefit from the optimization introduced in the original patch.

What is a bit surprising is that this particular test may be triggering somehow the CPU feature detection code. My understanding is that the test should not be doing any network operations or loading any content (please correct me if I'm wrong?). Unless there are associated assets (e.g. PNGs) that are loaded when a WebView is spawned?

My guess is that introducing an assert around the cpu feature detection code (i.e. https://cs.chromium.org/chromium/src/third_party/zlib/arm_features.c?l=34) could pinpoint who is calling it. But as reported by @torne, it requires some special hardware/software combination to be able to run the test.

Maybe an alternative would be to re-frame the problem.

Looking at the bots, it seems that the absolute time detected in the regression is around 1ms (nexus 5x) to 2ms (nexus 6).

An alternate way to verify the theory (i.e. CPU feature detection comes with a startup price) would be to measure the time spent in zlib at startup to execute the detection.

I can collect this data in a few ARM devices and report back.

I've tested a few devices (Nexus4, Nexus6, rock64) and it seems that the initial startup time (i.e reading the CPU features support) is in average 580uSecs (or 0.58ms).

Subsequent calls will take between 0.86uSecs to 4.0uSecs, YMMV thanks to a few factors (e.g. battery, etc).

Assuming a startup time of 600uSecs (or 0.6ms) and an worst case of 3uSecs per call into the CPU features detection code, it would take another 100-150 calls into zlib to reach the 1ms mark observed into the regression. And around 400-500 calls for the 2ms regression.

Keep in mind that this numbers are for executing the code as a standalone app (i.e. zlib_bench), I'm unsure if further delays may be introduced by client code in cronet/WebView and IPC.

It seems that there is something else going on *besides* the CPU features detection, as that *alone* doesn't seem to be enough to explain the 1ms to 2ms observed in the regression (i.e. IIRC, the network code uses zlib's crc32 as checksum for its data cache).

Re-posting the question I made previously maybe someone more familiar with Android's webview could help: when a webview start with a blank page (like what this test does), will it load any resources (either from the network or from the cache)?

Cc: gsennton@chromium.org tobiasjs@chromium.org
+ a few more webview folks, could you help with the questions in #35, #36?
I tried to reproduce the running environment, considering the requirements:
a) Android
b) Root access

For that, I got an ARM board (rock64, https://www.pine64.org/?page_id=7147) that runs a recent version of Android (7.1.2) and is rootable.

Installed the apks (i.e. WebViewInstrumentation/Test, SystemWebView, etc) of an android release build 
(https://gist.github.com/Adenilson/9681f4ebec11fc7c7e52892303adc16c) in the board with no problems.

Next I figured it out a way to run the benchmark script connected to the board using the network (https://groups.google.com/a/chromium.org/forum/?fromgroups#!topic/benchmarking-dev/J379ADZ444w), as this board won't support USB host mode. It seems that this user case (i.e. run the script with ADB through the network) wasn't reported previously in benchmarking-dev.

Finally, I was able to run the script (i.e. ./tools/perf/run_benchmark
run system_health.webview_startup --browser=android-webview-instrumentation) after patching catapult (https://gist.github.com/Adenilson/8d4048b82f7c224d71001366bab60619), as this board doesn't have a battery and battery_utils was misreporting the CPU temperature thus getting stuck on cool down.

I had 4 runs with vanilla ToT and 3 runs with a patched version (https://gist.github.com/Adenilson/e1630cb1573f904910b43003fb0b7e33)
disabling the ARM cpu feature detection.

The results are not clear, I expected an abrupt reduction (6 to 10%) in the metric
StartupMetric_duration but it seems to observe the opposite (please
see attached the screenshot).

Given the very specific requirements to be able to reproduce this bug (i.e. OEM unlocked device with user debug image), I'm making this bug available to someone with the proper resources to resume the investigation.

test_runs.png
217 KB View Download
Owner: ----
Status: Available (was: Assigned)
"USB host mode" -> "USB client mode"

Sign in to add a comment