2.5% improvement in webrtc_perf_tests at 13014:13014
Issue description
This is an improvement that comes from https://chromium.googlesource.com/external/webrtc/+/729b21f97f3d849b1ef2bd61114e4b39d073884d.
Jun 8 2016
kwiberg@, peah@, FYI: Slight improvement in APM performance due to "Add clz functions".
Jun 8 2016
This looks expected, right? As I see it: 1) The __builtin_clz?? intrinsics are available on Mac, right? In that case, they should reduce the complexity. 2) The new code includes short-circuits for when the input is 0, which should lower the complexity significantly for that case. If zero values dominate where the code is used, that is likely to give a lower complexity. kwiberg@: WDYT? Does this make sense?
Jun 8 2016
Yes, __builtin_clz?? is implemented by clang and gcc everywhere. They use supposedly good implementations for each platform; on at least ARM and x86, they're implemented with a single instruction. I don't think that (2) helps much when we already have (1). Since the old implementation was significantly longer than a single instruction and IIRC even had branches, I'm not surprised that the new one is faster. I didn't think we used it enough that it would be visible in an entire benchmark run, though. See also issue 617123.
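For readers following along, here is a rough sketch of the two approaches being compared. This is not the code from the WebRTC change; the function names are hypothetical and the branchy fallback is only an assumption about what a longer, pre-intrinsic implementation might look like. Note that the result of __builtin_clz is undefined for an input of 0, which is why a zero check is needed around it.

```c
#include <stdint.h>

/* Hypothetical portable fallback: a binary search over the bits.
 * Several compares and shifts, in contrast to a single instruction. */
static inline int count_leading_zeros32_portable(uint32_t n) {
  if (n == 0) return 32;  /* short-circuit for zero input */
  int zeros = 0;
  if ((n & 0xFFFF0000u) == 0) { zeros += 16; n <<= 16; }
  if ((n & 0xFF000000u) == 0) { zeros += 8;  n <<= 8;  }
  if ((n & 0xF0000000u) == 0) { zeros += 4;  n <<= 4;  }
  if ((n & 0xC0000000u) == 0) { zeros += 2;  n <<= 2;  }
  if ((n & 0x80000000u) == 0) { zeros += 1; }
  return zeros;
}

/* Hypothetical wrapper: use the compiler intrinsic where available. */
static inline int count_leading_zeros32(uint32_t n) {
#if defined(__GNUC__) || defined(__clang__)
  /* __builtin_clz(0) is undefined, so handle zero explicitly. */
  return n == 0 ? 32 : __builtin_clz(n);
#else
  return count_leading_zeros32_portable(n);
#endif
}
```

With gcc or clang the wrapper compiles down to the intrinsic plus the zero check, while other compilers take the fallback path.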
Comment 1 by hlundin@chromium.org, Jun 8 2016