New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 800448 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner: ----
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

1.5%-20.3% regression in smoothness.tough_animation_cases at 526960:527119

Project Member Reported by mustaq@chromium.org, Jan 9 2018

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=800448

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=61b06bc08a1149e13736745fcb7798ef10c5f8c4853bf43eeb2a2d23472e283f


Bot(s) for this bug's original alert(s):

android-nexus5X
android-nexus7v2
android-webview-nexus6
Cc: bsheedy@chromium.org
Owner: bsheedy@chromium.org
Status: Assigned (was: Untriaged)

=== Auto-CCing suspected CL author bsheedy@chromium.org ===

Hi bsheedy@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : bsheedy
  Commit : 2a59f25dbac3347aa8b4123e8efb682de2a9821c
  Date   : Thu Jan 04 17:48:01 2018
  Subject: Reland "Update Android NDK to r16"

Bisect Details
  Configuration: android_webview_nexus6_aosp_perf_bisect
  Benchmark    : system_health.common_mobile
  Metric       : timeToFirstPaint_avg/load_media/load_media_google_images
  Change       : 15.71% | 343.418166667 -> 397.354166667

Revision             Result                  N
chromium@527028      343.418 +- 22.9206      6      good
chromium@527029      406.956 +- 21.3866      6      bad       <--
chromium@527030      414.749 +- 22.7523      6      bad
chromium@527031      401.835 +- 21.2259      6      bad
chromium@527033      409.783 +- 25.5032      6      bad
chromium@527037      449.355 +- 234.879      6      bad
chromium@527046      405.26 +- 23.7569       6      bad
chromium@527063      397.948 +- 19.1653      6      bad
chromium@527097      397.354 +- 18.3402      6      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-webview --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.media.google.images system_health.common_mobile

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8957847029814032272


For feedback, file a bug with component Speed>Bisection
I wouldn't be surprised if the NDK upgrade caused a performance regression somewhere. However, I'm not quite sure how to handle this. I was the one that landed the NDK upgrade patch, but the regression is almost certainly caused by the NDK itself (code written by the Android team) and not in anything that actually changed src-side (that was largely adding some additional #includes and casts). Does that still leave the responsibility on me to track down the cause, or should it be handed off to someone more knowledgeable?

=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : bsheedy
  Commit : 2a59f25dbac3347aa8b4123e8efb682de2a9821c
  Date   : Thu Jan 04 17:48:01 2018
  Subject: Reland "Update Android NDK to r16"

Bisect Details
  Configuration: android_webview_nexus6_aosp_perf_bisect
  Benchmark    : system_health.common_mobile
  Metric       : timeToFirstPaint_avg/load_media/load_media_google_images
  Change       : 17.39% | 340.392333333 -> 399.598333333

Revision             Result                  N
chromium@527028      340.392 +- 17.7314      6      good
chromium@527029      403.745 +- 45.3856      6      bad       <--
chromium@527030      401.722 +- 17.4271      6      bad
chromium@527031      402.418 +- 19.4476      6      bad
chromium@527033      404.273 +- 22.3251      6      bad
chromium@527037      403.991 +- 17.7767      6      bad
chromium@527046      406.573 +- 28.6072      6      bad
chromium@527063      393.324 +- 26.1654      6      bad
chromium@527097      399.598 +- 19.7842      6      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-webview --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=load.media.google.images system_health.common_mobile

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8957846986702720096


For feedback, file a bug with component Speed>Bisection
Project Member

Comment 7 by 42576172...@developer.gserviceaccount.com, Jan 10 2018

Cc: hjd@google.com
 Issue 800735  has been merged into this issue.

Comment 8 by mustaq@chromium.org, Jan 10 2018

Cc: alancutter@chromium.org nednguyen@chromium.org charliea@chromium.org
cc-ing owners of affected tests.
It seems like this is almost certainly worth investigating: ` ~50ms regression in first meaningful paint isn't anything to sneeze at.

bsheedy@, my guess is that if you can think of a better contact person for the interaction between Chrome and the Android NDK team than yourself, I'd kick the bug over to them. If you don't know of one, then I'd recommend you contact someone on the NDK team and let them know about the performance regression and see what their response is.
Cc: tdres...@chromium.org kouhei@chromium.org
+Kouhei, Tim for the 50ms regression in first meaningful paint
The "get debug trace" button isn't working for me.

Is there another way to get traces with the required categories for computing FMP?
Cc: agrieve@chromium.org
+agrieve who's an owner of the NDK in Chromium, although my guess is that he also won't have a good idea about the underlying cause. Andrew, if that's the case, I'll get in contact with the NDK team.
Oh, thanks, not sure how I missed those.

It's hard to tell much without a trace with more categories. The change is currently attributed to the "other" category.


Yeah, I don't really have any insights either. 

Comment 16 by srhines@google.com, Jan 10 2018

Cc: srhines@google.com
From an email conversation with danalbert@ (NDK contact):
"In the absence of using our compiler, it's really unlikely that an NDK update caused your performance regression. Most of the NDK outside the compilers is just stub libraries.

It's possible that the regression would be coming from the STL or perhaps a change in the headers that is causing some functions to be inlined differently (inlines should only be provided in the case where the function was not available on your minSdkVersion though, so the difference there should be an inline or a compilation failure, not a regression)."

I believe there was a slight change to the Chromium inline tweaks we apply to the NDK, but I'm not familiar with exactly what changed. Andrew, WDYT?

gbiv@ also mentioned that the change to enable AFDO for all Chrome builds went in not too long before the NDK upgrade, so it's possible it might be interacting weird with the newer NDK version. Is there a good way to re-run the test before and after the NDK upgrade, but with clang_use_default_sample_profile=false set in the gn args?
There did look to be a lot of stdc++ differences between r12 and r16 of the ndk.

Given that libchrome.so grew from the ndk roll, I'd guess that the reduced-inlining patch isn't a major culprit (I'm assuming size increase means more inlined functions).

I'd guess the best next step is to try and look at how traces changed (#14). But I have next to no experience with that.
I took a look at the traces linked in #13, and while I admittedly have no idea what I'm looking at, I did see something slightly odd - before the regression, the firstpaint event actually shows up within the first Paint event, but afterwards, it shows up a good 70 ms after the first Paint actually finishes - is this expected to be possible? As someone without any knowledge about what's actually going on under the hood, I would expect that firstpaint should actually be contained within the first paint.
"First Paint" includes time up to the compositor commit - it doesn't just include main thread paint.
I filed  issue 801336  about trace debugging button
FYI the "request debug trace" should work now.
Owner: ----
Status: Available (was: Assigned)
Unassigning since I'm not actively working on this and I'm clueless about how to diagnose the root cause for this. 
Components: Internals>GPU>Metrics
Cc: -mustaq@chromium.org
cc-ing the test owner for opinion: does the regressions in load_media_google_images deserve attention at all?  I assume other matrices in timeToFirstContentfulPaint_avg/load_media/ (if any) are fine so we have no bugs for them.  If this is the case, I can close this bug as WAI.
Status: WontFix (was: Available)
WontFix-ing due to lack of owner response.

Sign in to add a comment