New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 750862 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

10.5% regression in smoothness.simple_mobile_sites at 489986:490179

Project Member Reported by majidvp@google.com, Jul 31 2017

Issue description

See the link to graphs below.
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Jul 31 2017

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=750862

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=212f8c29dd1791d1e3326affe03aac23056daee06544ce49b6f8ab774fbdaf56


Bot(s) for this bug's original alert(s):

android-nexus5
Cc: khushals...@chromium.org
Owner: khushals...@chromium.org

=== Auto-CCing suspected CL author khushalsagar@chromium.org ===

Hi khushalsagar@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Khushal
  Commit : ca2c1cf2d9748d25050bfe7f71ee40463588a6a9
  Date   : Thu Jul 27 23:06:43 2017
  Subject: cc: Split pending tree duration metrics based on main/impl side tree.

Bisect Details
  Configuration: android_nexus5_perf_bisect
  Benchmark    : smoothness.simple_mobile_sites
  Metric       : mean_input_event_latency/http___m.nytimes.com_
  Change       : 12.03% | 14.916 -> 16.7101666667

Revision             Result                   N
chromium@489985      14.916 +- 0.838693       6      good
chromium@490034      14.8377 +- 0.776302      6      good
chromium@490058      15.0652 +- 0.777163      6      good
chromium@490061      15.0363 +- 0.438474      6      good
chromium@490062      15.1995 +- 1.08912       6      good
chromium@490063      16.1807 +- 1.49517       9      bad       <--
chromium@490064      16.8577 +- 0.683697      6      bad
chromium@490070      16.5393 +- 0.780255      6      bad
chromium@490082      16.8185 +- 1.08621       6      bad
chromium@490179      16.7102 +- 0.970654      6      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=http...m.nytimes.com. smoothness.simple_mobile_sites

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8972519715176197280


For feedback, file a bug with component Speed>Bisection
Cc: briander...@chromium.org
+Brian.
Fix in review: https://chromium-review.googlesource.com/c/595112/.

The change above fixed a bug in the ready to activate duration measurements which decreased how often we were prioritizing impl thread latency. As it turns out, this was masking a different error in deciding the activation deadline for whether the critical main frame is fast enough. Just needs a fix for excluding the draw duration from the total interval for the frame when deciding the activation deadline.
Project Member

Comment 6 by bugdroid1@chromium.org, Aug 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d84008441c63814a737ff95047f4ce2c7aa3302c

commit d84008441c63814a737ff95047f4ce2c7aa3302c
Author: Khushal <khushalsagar@chromium.org>
Date: Tue Aug 08 01:40:11 2017

cc: Exclude draw duration in BeginMainFrameToActivate deadline.

For BeginMainFrameToActivate to be fast, the activation deadline should
exclude the draw duration from the interval for this frame. The main
frame can be considered fast only if we can draw the activated tree
within the deadline.

Bug:  750862 
Cq-Include-Trybots: master.tryserver.blink:linux_trusty_blink_rel
Change-Id: I902affc0f3ba33437259a6735e95bb479a78b6c0
Reviewed-on: https://chromium-review.googlesource.com/595112
Commit-Queue: Khushal <khushalsagar@chromium.org>
Reviewed-by: Brian Anderson <brianderson@chromium.org>
Cr-Commit-Position: refs/heads/master@{#492497}
[modify] https://crrev.com/d84008441c63814a737ff95047f4ce2c7aa3302c/cc/scheduler/scheduler.cc
[modify] https://crrev.com/d84008441c63814a737ff95047f4ce2c7aa3302c/cc/scheduler/scheduler_unittest.cc
[modify] https://crrev.com/d84008441c63814a737ff95047f4ce2c7aa3302c/cc/test/scheduler_test_common.h

Status: Fixed (was: Untriaged)

Comment 8 by majidvp@google.com, Aug 8 2017

khushalsagar@ how did you confirm that the issue is actually fixed? Looking at the graph the metric now seems to alternate between the pre-regression value (~14ms) and post-regression value (~16ms). Is this expected?
Cc: enne@chromium.org
Status: Assigned (was: Fixed)
I ran the test locally to confirm but doing repeated runs of it now, I do see that the value oscillates between ~14 to ~17ms.

The change that caused the regression was actually a fix for tracking the duration of pending tree activations. For this case, now the value of commit_to_ready_to_activate is close to ~8.5ms, so depending on the main frame durations, the impl thread may or may not prioritize impl thread latency and trigger the deadline immediately. Earlier this value was ~15ms so the impl frame would always be triggered immediately at the beginning of the deadline.

As far as the current code is concerned, this is working as intended. But this points out one potential case to optimize, should we prioritize impl thread latency if the main frame is just aborting commits? We would already do this if there is no scroll handler on the page. If commits are being aborted, then the page is not using the scroll updates to make any visual changes, so we could treat this similar to a no handler case and go into impl thread priority mode?
Cc: sunn...@chromium.org
 Issue 750861  has been merged into this issue.
Project Member

Comment 12 by 42576172...@developer.gserviceaccount.com, Aug 21 2017

 Issue 750705  has been merged into this issue.
Project Member

Comment 13 by 42576172...@developer.gserviceaccount.com, Aug 22 2017

 Issue 750705  has been merged into this issue.
The following regressions have been duped into this bug:

1) smoothness.top_25_smooth(win): frame_times are up by ~2ms in some cases.

2) smoothness.tough_filters_cases(win): frame_times up from ~46.5 to ~47.75ms.

3) smoothness.tough_animation_cases(win): frame_times up from ~18.6ms to ~18.7ms.

4) smoothness.simple_mobile_sites(android): mean_input_event_latency on nytimes which now oscillates between the pre-regression value(~14.5) and the current value (~16.5).

1) looked like the most concerning, and taking a look at the traces I see a change in the number of impl frames coming from the renderer. Earlier, for 281 vsyncs there are 246 CompositorFrames. After the change, for 280 vsyncs there are 221 CompositorFrames. May be its the scheduler logic for latency recovery kicking in (https://cs.chromium.org/chromium/src/cc/scheduler/scheduler.cc?dr&l=408)? I'm not sure what the best way to address this is, since the logic there hasn't changed.
Status: Fixed (was: Assigned)
There are tradeoffs with Khushal's patches, but they are feeding the scheduler better information than before.

Main thread latency has improved in some cases (see A) at the cost of compositor throughput in others. All the alerts only show up on what appear to be corner cases that were hitting just under 60fps as it was. There don't appear to be any unexpected regressions, so let's close this as Fixed.

A) https://chromeperf.appspot.com/report?sid=6a438c179d89a5acb4174612894ec3083778e270f696d6afb606dfcfebe32311&start_rev=484182&end_rev=496297
Project Member

Comment 16 by 42576172...@developer.gserviceaccount.com, Aug 23 2017

 Issue 758024  has been merged into this issue.

Sign in to add a comment