New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 739303 link

Starred by 7 users

Issue metadata

Status: WontFix
Owner:
Closed: Aug 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocked on:
issue 741714

Blocking:
issue 735802



Sign in to add a comment

1%-22.5% regression in smoothness.gpu_rasterization.tough_path_rendering_cases at 483316:483448

Project Member Reported by rmcilroy@chromium.org, Jul 5 2017

Issue description

See the link to graphs below.
 
Cc: k...@chromium.org
Owner: k...@chromium.org

=== Auto-CCing suspected CL author krb@chromium.org ===

Hi krb@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Kevin Bailey
  Commit : 3dc51db95f327e4f86962c94d6d5455ee75cf433
  Date   : Thu Jun 29 14:43:43 2017
  Subject: [omnibox] Select all when we revert all

Bisect Details
  Configuration: winx64_high_dpi_perf_bisect
  Benchmark    : smoothness.tough_webgl_cases
  Metric       : frame_times/http___kenrussell.github.io_webgl-animometer_Animometer_tests_3d_webgl.html
  Change       : 24.83% | 101.17461779 -> 126.297406433

Revision             Result                  N
chromium@483319      101.175 +- 4.96642      6      good
chromium@483344      101.757 +- 2.34893      6      good
chromium@483356      100.505 +- 3.12129      6      good
chromium@483359      101.183 +- 2.42339      6      good
chromium@483361      106.244 +- 15.8564      6      good
chromium@483362      129.72 +- 5.338         6      bad       <--
chromium@483368      123.032 +- 4.75483      6      bad
chromium@483416      126.297 +- 4.52276      6      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=http...kenrussell.github.io.webgl.animometer.Animometer.tests.3d.webgl.html smoothness.tough_webgl_cases

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8974917928249046496


For feedback, file a bug with component Speed>Bisection

Comment 4 by kbr@chromium.org, Jul 6 2017

Components: -Internals>GPU UI>Browser>Omnibox Internals>GPU>Internals
Talked with krb@ offline -- need to do two runs locally, one with top of tree, and one with 3dc51db95f327e4f86962c94d6d5455ee75cf433 reverted. Let's look at mean_frame_times for the WebGL Animometer case for both. If it looks like there is a real performance regression then maybe the Omnibox is now rapidly triggering browser repaints, or something similar.

Comment 5 by k...@chromium.org, Jul 6 2017

Cc: -k...@chromium.org kbr@chromium.org
You said, "Let's" so I assume you want to be cc'd. Sorry if that's presumptuous.

I suspect this is what you're looking for, from the 2 runs I did this
morning:

  "charts": {
    "mean_frame_time": {
      "summary": {
        "std": 0.0,
        "name": "mean_frame_time",
        "improvement_direction": "down",
        "important": true,
        "values": [
-          30.563   
+          35.171
        ],
        "units": "ms",
        "type": "list_of_scalar_values",
        "description": "Arithmetic mean of frame times."
      },

This is from Linux. I'm still working on the Windows version.

I can't imagine how this change would trigger a lot more redraws, and I added some logging to verify that the paint routines for the elements in question weren't getting explicitly called more. In case I'm looking at the wrong routines, is there something specific Omnibox-related that anyone would like to see numbers on ?

Comment 6 by kbr@chromium.org, Jul 7 2017

Status: Assigned (was: Untriaged)
Fine to CC: me.

Please confirm which number is from which run -- i.e., from top-of-tree, or with 3dc51db95f327e4f86962c94d6d5455ee75cf433 reverted.

Please gather 5 runs of each so we can get an idea locally of how noisy this benchmark is on your machine.


Comment 7 by k...@chromium.org, Jul 7 2017

Both are ToT. 30.563ms is with my change removed via a patch. (Think "before and after".)

I will get the additional samples tomorrow morning.

Comment 8 by kbr@chromium.org, Jul 7 2017

If the per-frame time is reliably 35 ms with your change in the code base and 30 ms without it, then there's clearly been a huge performance regression and you should revert it ASAP while the cause is being investigated.

I suspect that the Omnibox is triggering a lot of UI repaints where it wasn't before.

These flags might help:

/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --user-data-dir=/tmp/t1 --show-paint-rects --ui-show-paint-rects

I'd also suggest gathering traces with and without your patch via about:tracing.

Comment 9 by k...@chromium.org, Jul 7 2017

Here are the numbers. I got 2 sets of 5 for each build:

before {32.286, 34.674, 34.475, 31.52, 31.227} = 32.8544 +- 1.48493
       {32.685, 32.375, 34.221, 36.37, 33.727} = 33.8756 +- 1.41653
after  {36.275 30.792 37.2 31.966 34.517}      = 34.15   +- 2.4494
       {32.901 36.658 30.639 33.516 30.018}    = 32.7464 +- 2.35805

I don't like the SD on the change, but it doesn't look to me like the average has changed. Again, this is Linux; let me get the Windows numbers.

Comment 10 by k...@chromium.org, Jul 7 2017

The Windows numbers are much worse: ~105 vs ~130ms. I tried rearranging a few things but it doesn't get any better. I've no idea why the time is so different when we're just, essentially, moving a call from one place to another.

I also tried getting a trace, but I don't see any Omnibox events, despite checking the box.

I'll just revert it.

Comment 11 by kbr@chromium.org, Jul 7 2017

OK. Thanks for investigating this. Again, I suspect that your change is causing the browser's UI to repaint a lot more often for some reason. If you can dig into why then we can re-land your patch.

Comment 12 by k...@chromium.org, Jul 7 2017

Sure, and thanks for the suggestions. Unfortunately, I'm no closer to understanding why this is happening. Is "mean_frame_time" for a single navigation or per-frame (several times per second) ?

I verified that it is the SelectAll() in OmniboxViewViews::Update() that triggers it, and that it is only called once, for the navigation itself (in case Peter has any theories.)

Comment 13 by kbr@chromium.org, Jul 7 2017

mean_frame_time is per-frame, averaged across the entire run of the benchmark. The increase seen on Windows from 105 ms to 130 ms is huge, and should be easily visible if you gather a trace from about:tracing with type "Javascript and Rendering". I'd suggest you try viewing the benchmark page in your browser build outside the benchmark harness, gathering traces before and after your patch, and see if anything obvious comes up. You ought to be able to easily see the browser, renderer and GPU processes' work on a per-frame basis (start with a clean browser session with no other tabs). In fact if you'd like to attach before-and-after traces here, I and others may be able to help diagnose them.

Comment 14 by k...@chromium.org, Jul 10 2017

To make sure I'm doing this correctly, I went to chrome://tracing, clicked on 'Record', then 'Javascript and rendering', then editted the selections by adding 'omnibox' and 'views'. I then hit 'Start', and flipped to a tab where I had the WebGL test cued up and hit enter. I let it run for about 10 seconds before stopping the recording.

When I zoom in to about 200ms, I don't see any blocks that are 30ms long (except for a couple things that are in the "good" trace too.) I'm not familiar with this data enough to see any differences, or out of place blocks either.

I have 3 traces, but it will only let me upload one at a time. I've attached the latest "bad" trace. If there's a better place to put the others, I'd be happy to upload them there.
trace_unconditional-select-all2.json.gz
4.8 MB Download

Comment 15 by kbr@chromium.org, Jul 10 2017

The 10 MB limit is per comment. Could you please upload a "good" trace to a new comment?

Comment 16 by k...@chromium.org, Jul 10 2017

Here is the "good" trace.
trace_conditional-select-all.json.gz
8.9 MB Download
Project Member

Comment 17 by 42576172...@developer.gserviceaccount.com, Jul 12 2017

Cc: pmeenan@chromium.org
 Issue 741719  has been merged into this issue.

Comment 18 by kbr@chromium.org, Jul 13 2017

Blockedon: 741714
Blocking: 735802
Cc: nedngu...@google.com briander...@chromium.org sunn...@chromium.org
Components: Speed>Metrics
Status: Started (was: Assigned)
sunnyps@ helped me look through not only these traces but also the ones from the original alert, specifically on:
ChromiumPerf/win-high-dpi/smoothness.tough_webgl_cases / mean_frame_time / http___kenrussell.github.io_webgl-animometer_Animometer_tests_3d_webgl.html

Trace before the alert:
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_8-2017-06-28_23-27-58-20273.html

Trace after the alert:
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_8-2017-06-29_15-31-05-14635.html

Sunny pointed out that the traces gathered by hand might not reproduce the original problem. They were so big that the browser had a difficult time loading both simultaneously, anyway.

After much analysis Sunny concluded that the frame_times and mean_frame_time metrics are flawed on Windows. These metrics seem to be looking at both BenchmarkInstrumentation::ImplThreadRenderingStats and BenchmarkInstrumentation::DisplayRenderingStats trace events on this platform. In the "good" trace, the browser UI is occasionally redrawing (presumably caused by the Omnibox), and drawing *more* frames than in the "bad" trace. Other than that, the times to draw WebGL appear unchanged. So krb, your CL is *improving* things by reducing the number of frames drawn! The additional frames aren't useful, but are skewing the frame_times and mean_frame_times metrics. These metrics are apparently computed differently on different platforms -- on Android, they are tied into the SurfaceFlinger -- which is why the frame_times and mean_frame_time metrics regressed only on Windows.

It also explains the dramatic improvement in these metrics on Windows for the IE Chalkboard benchmark, and the regression seen when krb's CL was reverted in 5604cb88fc88d2c3b7886702e8253b21ceea600e .

krb@, let's re-land your original CL and close this regression bug as WontFix. brianderson@ mentioned there was ongoing work to improve these metrics; we should reference that bug here.

A small correction to #18: The reason we don't see this regression on Android is probably because this code doesn't run there. However, benchmarks like this can be affected by spurious frames injected by the UI code on Android too. (I don't know if this happens today for sure.)

Comment 20 by kbr@chromium.org, Jul 13 2017

Thanks Sunny for clarifying. Then it's also somewhat of a mystery why this didn't show up on Linux (where Views is used), though since Mac has different Omnibox code, that may explain why the regression wasn't seen there.

Cc: tdres...@chromium.org
+Tim: fyi

Comment 22 by k...@chromium.org, Jul 13 2017

Ken and Sunny, thank you for tracking this down. It's good to get to the bottom of a mystery.

btw, I can't reland it quite yet, as there's a small change we need to make first.

https://bugs.chromium.org/p/chromium/issues/detail?id=738843

Comment 23 by k...@chromium.org, Aug 7 2017

Status: WontFix (was: Started)
Why is this wontfix? krb?
@24: See comment 18.

Comment 26 by k...@chromium.org, Aug 25 2017

Cc: vmiura@google.com k...@chromium.org
 Issue 759131  has been merged into this issue.

Comment 27 by k...@chromium.org, Sep 22 2017

Cc: primiano@chromium.org
 Issue 753982  has been merged into this issue.

Comment 28 by k...@chromium.org, Oct 10 2017

Cc: kraynov@chromium.org
 Issue 759618  has been merged into this issue.

Sign in to add a comment