New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 883751 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 25
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

11.6% regression in blink_perf.parser at 588253:588276

Project Member Reported by npm@chromium.org, Sep 13

Issue description

See the link to graphs below.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=883751

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=31a0315058934de2c837a6e5e2f016a2850f55d64428a76f00cb77f5ef7347ae


Bot(s) for this bug's original alert(s):

Android Nexus5 Perf

blink_perf.parser - Benchmark documentation link:
  https://bit.ly/blink-perf-benchmarks
Cc: afdo-chr...@skia-buildbots.google.com.iam.gserviceaccount.com
Owner: g...@chromium.org
Status: Assigned (was: Untriaged)
📍 Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/14632b6b640000

Roll AFDO from 70.0.3538.0_rc-r1 to 71.0.3539.0_rc-r1 by afdo-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com
https://chromium.googlesource.com/chromium/src/+/d3977f364f72a1269cb97fa6fb05ebc92b61a52a
819.7 → 723.6 (-96.1)

Assigning to sheriff gbiv@chromium.org because "Roll AFDO from 70.0.3538.0_rc-r1 to 71.0.3539.0_rc-r1" is a roll.

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions

Benchmark documentation link:
  https://bit.ly/blink-perf-benchmarks
It's interesting that this range of changes happens, and the benchmark scores instantly become substantially less noisy. There's also no obvious AFDO-induced noise in this benchmark's history, and no recovery in two weeks.

I know there was some android-y stabilization effort (by reducing CPU frequency to reduce thermal throttling, etc.) going on a few weeks ago. Will look into it and try to repro locally.
Cc: jbroman@chromium.org
+jbroman, mostly for the last paragraph.

No such luck; I got a reliable repro on my phone locally.

Comparing the `perf` reports for both before and after the roll, `blink::SelectorQuery::FindTraverseRootsAndExecute<blink::AllElementsSelectorQueryTrait>` goes from taking 3% of all cycles to taking 9%. The new profiles never mention this function, whereas the old one has a fair number of samples in it. So, clang considers this function cold on Android.

Given that the only user of this function is named `SelectorQuery::QueryAll` and the test is named `query-selector-all-class-deep`, this is likely the source of our regression. :)

Note that clang is only treating said function as cold on Android. On Linux, which uses these same profiles to guide optimizations, it'll be optimized as if we used no profile at all. This is because we ask clang to prefer binary size on Android, whereas we prefer speed on Linux.

The benchmark history here seems quite quiet since this regression: there's a blip of AFDO helping out after 3539 landed, but the 3555 profile roll -- which, given the perf bugs it gave me, I'm assuming is a flake -- stamped that out.

Our general stance is that individual blink microbenchmarks are great indicators of what may be going wrong when a higher-level metric regresses, but they don't provide great signal about how AFDO rolls affect Chrome's performance/snappiness as a whole. I count zero other perf bugs for the 71.0.3539.0_rc-r1 roll, so I'm leaning toward WontFix'ing this. Because a persistent 11% degradation is a lot, I'm tagging a blink_perf.parser owner to be sure that that's an acceptable course of action.
Interesting. We have separate fast paths for querySelectorAll for simple tag and class selectors. I guess that wherever our new profiles are coming from, there weren't many uses of qSA for other types of selectors.

Assuming we're confident that our new profile is representative of our real use cases, I suppose we can live with this.
Status: Fixed (was: Assigned)
Closing this out then. Thanks!

> Assuming we're confident that our new profile is representative of our real use cases, I suppose we can live with this.

At the moment, they're coming from running Chrome on a set of telemetry benchmarks (largely page cyclers, but with some smaller, more targeted benchmarks thrown in, like speeodmeter). We have plans to migrate to field-sourced profiles soonish, which should be highly representative of real use cases :)

Sign in to add a comment