New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 618239 link

Starred by 0 users

Issue metadata

Status: WontFix
Owner:
Closed: Jun 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

2.5% improvement in webrtc_perf_tests at 13014:13014

Project Member Reported by hlundin@chromium.org, Jun 8 2016

Issue description

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=618239

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgICg3NCmuwoM


Bot(s) for this bug's original alert(s):

webrtc-mac-large-tests
Cc: kwiberg@chromium.org peah@chromium.org
Labels: -M-51 M-53
Status: WontFix (was: Assigned)
kwiberg@, peah@, FYI: Slight improvement in APM performance due to "Add clz functions".

Comment 3 by peah@chromium.org, Jun 8 2016

This looks expected, right?

As I see it:
1) The __builtin_clz?? intrinsics are available on mac, right? In that case, those should reduce the complexity.
2) The new code includes short-circuits when the input is 0, which should lower the complexity significantly for this case. If zero values dominates where the code is used, that is likely to give a lower complexity.

kwiberg@: WDYT? Does this make sense?
Yes, __builtin_clz?? is implemented by clang and gcc everywhere. They use supposedly good implementations for each platform; on at least ARM and x86, they're implemented with a single instruction.

I don't think that (2) helps much when we already have (1).

Since the old implementation was significantly longer than a single instruction and IIRC even had branches, I'm not surprised that the new one is faster. I didn't think we used it enough that it would be visible in an entire benchmark run, though.

See also  issue 617123 .

Sign in to add a comment