New issue
Advanced search Search tips

Issue 688610 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

37.9% regression in media_perftests at 447761:447802

Project Member Reported by crouleau@chromium.org, Feb 4 2017

Issue description

37.9% regression in media_perftests audio_bus_to_interleaved int8_t at 447761:447802

Looks like a reversion to an earlier state, but it's worth it to run the bisect to understand better.
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=688610

Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?keys=agxzfmNocm9tZXBlcmZyFAsSB0Fub21hbHkYgIDgqOzbqgsM


Bot(s) for this bug's original alert(s):

chromium-rel-mac11
Cc: davidben@chromium.org
Owner: davidben@chromium.org

=== Auto-CCing suspected CL author davidben@chromium.org ===

Hi davidben@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : davidben
  Commit : c0fce12d818db7e176c4f028b57fcc9b72138b62
  Date   : Thu Feb 02 17:34:56 2017
  Subject: Roll src/third_party/boringssl/src 358baeb9a..c26692cfd

Bisect Details
  Configuration: mac_10_11_perf_bisect
  Benchmark    : media_perftests
  Metric       : audio_bus_to_interleaved/int8_t
  Change       : 38.03% | 20.9654083333 -> 28.9376

Revision             Result                   N
chromium@447760      20.9654 +- 0.489942      6      good
chromium@447781      21.0208 +- 0.311495      6      good
chromium@447787      20.9834 +- 0.233201      6      good
chromium@447788      28.9237 +- 0.261666      6      bad       <--
chromium@447789      29.3864 +- 0.237034      6      bad
chromium@447790      29.1531 +- 0.904292      6      bad
chromium@447792      28.8445 +- 0.18739       6      bad
chromium@447802      28.9376 +- 0.169635      6      bad

To Run This Test
  ./src/out/Release/media_perftests --single-process-tests

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8988631661426820912

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5218940206186496


| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Tests>AutoBisect.  Thank you!
Owner: crouleau@chromium.org
This makes very little sense. The BoringSSL roll would change crypto, and this is something relating to audio. Please reinvestigate.
Cc: dalecur...@chromium.org
Components: Internals>Media
It doesn't make sense to me either. The automated system assigned this to you, not me. The tests are very confident in this: before the roll, the average value for a run was 20.98, after the roll it was 28.92. The +/- values are standard deviations. 

I think that the first step is to manually verify the results. The system may have a bug. If the results are verified, we can manually bisect into the roll and figure out which specific CL was the culprit.

+Dale FYI
Oh, huh, I guess you do link to //crypto in there. Interesting. Though I don't think that revision changed anything you're using, based on a cursory code search.

I'm not familiar enough with this benchmark, so I guess what'd make sense is for a media person to investigate and, if it is indeed crypto, find out what piece it was, and then I can take it from there.
What I find really interesting about this bug is that out of windows, linux, and mac, the regression only affects mac: https://chromeperf.appspot.com/report?sid=f22ddcb96358fd4019b10b95ae04bca8aa9cd35ea51a05495dc98ee1beb7e8a4


I'm not reproducing the results manually, but I think it might be because I'm using dynamically linked debug build instead of a production build. Here's the data I get so far:

for 448157 (ToT), I get
  *RESULT audio_bus_to_interleaved: int8_t= 95.642 ms

for 447787 (6b0575da103f03857ec7ca23bb57a2203f7cbfb5), I get
  *RESULT audio_bus_to_interleaved: int8_t= 95.23225 ms

for 447788 (c0fce12d818db7e176c4f028b57fcc9b72138b62, I get
  *RESULT audio_bus_to_interleaved: int8_t= 95.71485 ms


I'm going to figure out which build flags the bisect used and use those and retry.
bisect results are fairly conclusive, so I'd just write this off as an infrastructure or linking change. 
Based on offline discussion with Dale, we decided that it was worth trying with the same buildflags as the bisect. For some reason I could not find any example output for any of the bisect runs.

Instead, I ran

python tools/mb/mb.py isolate out/Release -m chromium.perf -b "Mac Builder" media_perftests

And it gave me the following output:

Writing """\
is_chrome_branded = true
is_debug = false
is_official_build = true
use_goma = true
""" to /Users/crouleau/chromium/chromium/src/out/Release/args.gn.

So I'm pretty sure those are the correct buildflags. 

Now I'll see if it repros again.
Alright, I got it working with the real build args.

For 447788 (Right after the roll was committed), I got
  *RESULT audio_bus_to_interleaved: int8_t= 23.58015

For 447787 (Right before the roll was committed), I got
  *RESULT audio_bus_to_interleaved: int8_t= 17.43405

This confirms that the regression is real. 

The roll contains 42 commits, and roundUp(log2(42)) == 6, so it should be pretty quick to bisect within the roll. I'm doing this now.



Cc: crouleau@chromium.org
Owner: agl@chromium.org
Assigning to agl@.

TL;DR: The culprit of this regression is https://boringssl.googlesource.com/boringssl/+/3f38d80b2ffc7979b26d4bf8b750dd7088b5cc04


I figured out how to bisect within the roll. The trick is that the roll also contains compiled files created by the roll_boringssl.py. Since roll_boringssl.py does more work than I need, I do it manually instead:

vim DEPS  # edit with the hash for boringssl
gclient sync

cd third_party/boringssl
rm *.gni err_data.c
rm -r linux-* mac-* win-*
python src/util/generate_build_files.py gn

This didn't work right away, but I eventually realized that I was missing "go", which is a dependency of generate_build_files.py

Then I build and run media_perftests.  

For chromium 447788 (Right after the roll was committed), I got
  23.58015 ms

For boring 7cd0a97 (Bogo: Send and receive 0.5-RTT data.), I get
  23.08245 ms

For boring 3f38d80 (Add CFI information to the x86-64 X25519 asm), I get
  22.98395 ms

For boring 8c2480f (Push to error queue in |EVP_PKEY_CTX_ctrl| for wrong keytype.), I get
  17.29265 ms 

For boring 276b7e8 (Move optional message type checks out of ssl_get_message. by David Benjamin), I get
  17.84985 ms

For boring 258508f (Adding V2ClientHello counter. by Steven Valdez), I get 
  17.43105 ms

For chromium 447787 (Right before the roll was committed), I got
  17.43405 ms

To be sure, I double-checked 8c2480f and 3f38d80, and results were consistent.

Comment 13 by agl@chromium.org, Feb 7 2017

Since this is a Mac-only change, I'm unable to repro it I'm afraid. I also note that this test suddenly got faster at 446136 and this is it going back to the exact speed as before. Sadly there isn't more history, but it would make sense that this is an alignment issue and changes in other bits of the code are just perturbing that alignment.

(An interesting test would be to take the .cfi_* directives out of the assembly file changed in 3f38d80. Those directives don't affect the code, but they will still affect alignment before of a change of unwind segment size.)
Cc: jrumm...@chromium.org chfremer@chromium.org
+chfremer@ since most recent SWE to make significant changes to https://cs.chromium.org/chromium/src/media/base/audio_bus.h?l=291&gsn=CopyConvertFromAudioBusToInterleavedTarget

+jrummell also knows about audio_bus.cc

It does seem like this is a metric that we care about optimizing. We are working on things like  issue 619628  to improve its performance. It seems like we should try to figure this out.
Cc: -jrumm...@chromium.org agl@chromium.org
Owner: jrumm...@chromium.org
Assigning to jrummell@, who will bring in his Mac to work on this tomorrow. John is pretty sure this is caused by the test not aligning the audio data, whereas in real scenario the data would be aligned, but he's going to double-check this.
John figured out (with minimal help from me) that
1. The audio data was already aligned
2. The difference in audio_bus_to_interleaved time between using aligned audio data and using forcibly un-aligned audio data is significant, but no where near the 37.9% that we're seeing here. This difference in time taken when using un-aligned versus when using aligned audio data exists both before agl@'s change and after his change.

Unaligned before change: 17.40 ms
Aligned before change:   16.90 ms

Unaligned after change:  23.92 ms
Aligned before change:   22.78 ms

The above results are consistent on repeated runs.
Did you try the suggestion in comment #13?

Also, what is this test actually doing? I'd be surprised if it were calling into X25519 at all.
The suggestion in #13 had no effect on the output values. I have attached the changes that I made for #13 in a patch file.

John could answer better for a description of what the test is doing. The code is at https://cs.chromium.org/chromium/src/media/base/audio_bus_perftest.cc, and the following commands run it:

ninja -C out/Release -j 200 media_perftests
out/Release/media_perftests --single-process-tests --gtest_filter=AudioBusPerfTest.*

Make sure your build args are as follows: 
is_chrome_branded = true
is_debug = false
is_official_build = true
use_goma = true
0001-Removing-.cfi_-directives.patch
4.0 KB Download
Come to think of it, that assembly file isn't even used in Chromium; Chromium does an OPENSSL_SMALL build. The static linker should be dropping it. This doesn't make any sense.
Yeah, it doesn't make any sense to us either. I would encourage you to reproduce it for yourself. It is easy to repro.
Cc: -crouleau@chromium.org jrumm...@chromium.org
Owner: crouleau@chromium.org
Status: WontFix (was: Assigned)
After discussing with Dale, we have decided that this bug is WontFix because we rarely use int8_t audio; mostly we use int16_t audio. Dale is doing more investigation into audio usage now, and he will see if removing this test makes sense.
Cc: tguilbert@chromium.org chcunningham@chromium.org crouleau@chromium.org
 Issue 703237  has been merged into this issue.

Sign in to add a comment