15% regression in media_perftests at 616339:616517
Issue description: See the link to graphs below.
,
Jan 2
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/16c069f8940000
,
Jan 3
📍 Found significant differences after each of 2 commits.
https://pinpoint-dot-chromeperf.appspot.com/job/16c069f8940000

Reland "Reland "allocator: Add Windows _aligned_* shims"" by vtsyrklevich@chromium.org
https://chromium.googlesource.com/chromium/src/+/cecda97aff81d3eed1b3522f82d40b7984558fe8
audio_bus_to_interleaved: 29.98 → 33.5 (+3.517)

Add support for weighted training examples. by liberato@chromium.org
https://chromium.googlesource.com/chromium/src/+/d994d0d082b4c720e328d54ba60470f44154913a
audio_bus_to_interleaved: 33.5 → 34.16 (+0.6547)

Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Jan 3
It seems that most of this perf regression was caused by my change. While working on the _aligned_* shims it was clear that many of the calls came from ffmpeg, so it makes sense that the regression shows up in media_perftests, though I'm puzzled as to why it is so significant. Part of the regression is probably the overhead of the extra indirection introduced by the allocator hook, and part of it may be that the current stubs are less efficient than the UCRT implementations. I'll try to run some perf tests locally tomorrow and see if I can figure anything out.
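For readers unfamiliar with the two suspected costs named above (an extra indirection through a hook, and a hand-written aligned allocator), here is a minimal sketch. It is not the Chromium shim: `AlignedDispatch`, `DefaultAlignedAlloc`, `DefaultAlignedFree`, `ShimAlignedMalloc` and `ShimAlignedFree` are hypothetical names, and the over-allocate-and-round-up scheme is a generic technique assumed here for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical dispatch table. Every call goes through a function pointer,
// which is the "extra indirection" a hook of this kind adds.
struct AlignedDispatch {
  void* (*alloc_fn)(std::size_t size, std::size_t alignment);
  void (*free_fn)(void* ptr);
};

// Generic aligned allocation on top of malloc: over-allocate, round the
// address up, and stash the original pointer just before the aligned block.
void* DefaultAlignedAlloc(std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  std::size_t total = size + alignment - 1 + sizeof(void*);
  if (total < size) return nullptr;  // size_t overflow
  void* raw = std::malloc(total);
  if (raw == nullptr) return nullptr;
  std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
  std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  std::uintptr_t aligned = (start + mask) & ~mask;
  reinterpret_cast<void**>(aligned)[-1] = raw;  // remembered for free
  return reinterpret_cast<void*>(aligned);
}

void DefaultAlignedFree(void* ptr) {
  if (ptr != nullptr) std::free(static_cast<void**>(ptr)[-1]);
}

// The table could be swapped at startup; here it points at the defaults.
AlignedDispatch g_dispatch = {&DefaultAlignedAlloc, &DefaultAlignedFree};

void* ShimAlignedMalloc(std::size_t size, std::size_t alignment) {
  return g_dispatch.alloc_fn(size, alignment);
}

void ShimAlignedFree(void* ptr) {
  g_dispatch.free_fn(ptr);
}
```

In a sketch like this the per-call cost is one indirect call plus a few extra header bytes, which would only matter for code that allocates inside the timed loop.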
,
Jan 8
These results are very odd. I'm able to reproduce the regression locally but don't yet understand it. In AudioBusPerfTest.Interleave there is only a single call to _aligned_malloc (for AudioBus::data_), and it happens outside of the benchmarked code, so the regression is not due to the overhead of the allocator shim or of my _aligned_*alloc reimplementation. It seems like the memory returned by the new allocator shim is somehow less efficient to use. Both allocations have exactly the same alignment (0x50 bytes into a page). I tried to warm up the caches by Zero()ing the AudioBus right before the benchmark code, but that didn't change the results much. Another odd thing is that only the float audio_bus_to_interleaved results regressed, not the int16_t ones.
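The "0x50 bytes into a page" comparison can be reproduced with a small helper like the one below. This is only a sketch: `OffsetInPage` and `IsAligned` are hypothetical names, and the 4096-byte page size is an assumption (it is the usual small-page size on x86 Windows, but it should be queried from the OS in real code).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed page size for illustration; not read from the OS.
constexpr std::size_t kAssumedPageSize = 4096;

// Returns how many bytes past the start of its page the address lies.
std::size_t OffsetInPage(const void* ptr,
                         std::size_t page_size = kAssumedPageSize) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  return reinterpret_cast<std::uintptr_t>(ptr) & (page_size - 1);
}

// True if the address is a multiple of the given power-of-two alignment.
bool IsAligned(const void* ptr, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```

Logging `OffsetInPage(buffer)` for the buffer returned by each build is enough to confirm or rule out a difference in page offset or cache-line alignment between the two allocators.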
,
Jan 11
I wish I had seen this graph earlier: https://chromeperf.appspot.com/group_report?sid=a3092ed13b0d2873195c0af5ba53b3f0e2d81957174a1dfb94a8638b456f46f2

The regression disappeared a week after the allocator change landed. When I try to reproduce it now on the most recent master, the regression is actually reversed: the shim is now faster than the unshimmed version by an equivalent amount. The only interesting change I see in the batch of changes where it sped up is a clang roll. There was a similar perf increase for this benchmark in early November.

I suspected that some compiler layout change could be causing this benchmark to flap between more and less optimal outputs, but looking at the disassembly of the shimmed and unshimmed versions, the code for media::AudioBus::CopyConvertFromAudioBusToInterleavedTarget<float> is identical in both binaries and at the same offsets. Most of the other functions are also the same and at the same offsets. Next I tried to warm up the icache by calling that function a couple of times before running the benchmark, but the perf regression was still there. Adding a 5-second sleep before the benchmark ran seemed to equalize the benchmark times, but that doesn't explain why the difference occurs.

Unfortunately I'm not familiar with Windows perf tools or I'd go further, but this is a very odd regression. Looking at the history of this benchmark, the performance flapping has been going on for a while. Given that the regression is no longer occurring, I'm going to assign this back to tmathmeyer@ to see if he wants to add a sleep() to the perf test or to run this down further.

Also, given that the GPU isn't used at all, this set of graphs makes no sense to me: https://chromeperf.appspot.com/report?sid=30c6590096bc9283dbb7188933ada04f7ec10b4e6a14736f6bcc08612e8c4e24&start_rev=602673&end_rev=620226
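The two experiments described above (calling the function a few times first, then sleeping before the timed section) can be sketched as a small timing helper. This is not the media_perftests harness: `TimeRunsMs` and its parameters are hypothetical, and it only shows one way to separate warm-up and a settle delay from the timed loop.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Runs fn() warmup_runs times untimed, optionally sleeps, then times
// timed_runs calls and returns the mean wall-clock milliseconds per call.
template <typename Fn>
double TimeRunsMs(Fn&& fn,
                  int warmup_runs,
                  int timed_runs,
                  std::chrono::milliseconds settle_delay) {
  assert(timed_runs > 0);
  // Warm-up: touches the code and data so caches are populated.
  for (int i = 0; i < warmup_runs; ++i) fn();
  // Optional pause so that startup activity can settle before timing.
  if (settle_delay.count() > 0) std::this_thread::sleep_for(settle_delay);
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < timed_runs; ++i) fn();
  const auto end = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = end - start;
  return elapsed.count() / timed_runs;
}
```

Comparing runs with `settle_delay` set to zero and to a few seconds, on both binaries, would show whether the delay alone accounts for the difference, though it would not say what is competing with the benchmark during that window.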
Comment 1 by 42576172...@developer.gserviceaccount.com
, Jan 2