Issue metadata
Sign in to add a comment
|
1.9% regression in smoothness.tough_canvas_cases at 541614:541724 |
||||||||||||||||||||
Issue descriptionSee the link to graphs below.
,
Mar 12 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/128152e6440000
,
Mar 12 2018
📍 Found significant differences after each of 2 commits. https://pinpoint-dot-chromeperf.appspot.com/job/128152e6440000 Replace WTF::WrapUnique with base::WrapUnique or std::make_unique by lucmult@chromium.org https://chromium.googlesource.com/chromium/src/+/30054a9187f2eafb16fdbaee21a94098afb10753 make SkJumper stages normal Skia code by mtklein@chromium.org https://skia.googlesource.com/skia/+/22e536e3a1a09405d1c0e6f071717a726d86e8d4 Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Mar 14 2018
I think that "make SkJumper normal" CL had two effects on Mac builds of the stages: - frame pointers used to be omitted, and are probably back (I think we might want to keep these for the sake of better crash/ ?) - we're not building with -ffp-contract=fast anymore, so some incidental FMAs are probably missing As for Windows, there's one definite change and one possible change I'm investigating right now: - definite: we had to drop from sysv_abi to vectorcall, which forces us down a less efficient (narrower) pipeline. I'd like to see if I can get sysv_abi to work on clang-cl one day - possible: the #if guards we're using might actually be forcing code that should be using AVX2 down a mostly-AVX path instead, disabling a few things like FMAs, gathers, and hardware float<->half conversion. I plan to follow up on all four of these points here.
,
Mar 14 2018
,
Mar 15 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/670c468885c8ca351af506db4f44d07b2f4e2acb commit 670c468885c8ca351af506db4f44d07b2f4e2acb Author: Mike Klein <mtklein@chromium.org> Date: Thu Mar 15 00:43:04 2018 Allow HSW stages to detect FMAs opportunistically. We used to build the .S files with -ffp-contract=fast for this, and adding it here to the HSW opts will have the equivalent effect. None of the other x86 targets support FMAs. In many key places we do call _mm256_fmadd_ps explicitly, but in several other stages we have been relying on Clang to sniff out opportunities, especially for fused-multiply-subtracts, which we're just not very good at seeing in our heads. Bug: 821012 Change-Id: I2767cbdb7151d60ea7553e83f51b28ff9340eb94 Reviewed-on: https://chromium-review.googlesource.com/963141 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Mike Klein <mtklein@chromium.org> Cr-Commit-Position: refs/heads/master@{#543262} [modify] https://crrev.com/670c468885c8ca351af506db4f44d07b2f4e2acb/skia/BUILD.gn
,
Mar 19 2018
That CL doesn't seem to have moved the needle on the Mac bots. But in the meantime I've noticed that we've already thrown these bots under the performance bus by turning off hardware compositing. I guess we can now definitively attribute those regressions to the extra frame pointers, but I'm especially not inclined to try to win that performance back if we're already in limp home mode. That does leave the Windows regressions, which I'm still working on.
,
Mar 21 2018
The concern about #ifdefs not being quite right doesn't look like an issue after all. We're passing /arch:AVX2, which seems to be equivalent to -march=haswell, enabling all the features and #defines we're looking for.
,
Mar 22 2018
I have not been able to compile using sysv_abi with clang-cl. (Though, oddly, it seems to work fine when cross compiling for Windows from Mac.) clang-cl crashes. :/ I'm going to continue working on improving this codebase, but sadly we've ruled out all the current ideas I have about recovering performance here.
,
Mar 27 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/811d1a139265b7d8c94fc1690184de2351d9230f commit 811d1a139265b7d8c94fc1690184de2351d9230f Author: Mike Klein <mtklein@chromium.org> Date: Tue Mar 27 00:23:29 2018 Revert "Allow HSW stages to detect FMAs opportunistically." This reverts commit 670c468885c8ca351af506db4f44d07b2f4e2acb. Reason for revert: Mysteriously, this has caused some regressions on a bunch of bots that do not support the Haswell instruction set. Reverting just to see if that's real (and because this wasn't really a noticeable win anywhere when landing the first time). Bug: 825647 , 825586 , 825646 , 825104 Original change's description: > Allow HSW stages to detect FMAs opportunistically. > > We used to build the .S files with -ffp-contract=fast for this, and > adding it here to the HSW opts will have the equivalent effect. None of > the other x86 targets support FMAs. > > In many key places we do call _mm256_fmadd_ps explicitly, but in several > other stages we have been relying on Clang to sniff out opportunities, > especially for fused-multiply-subtracts, which we're just not very good > at seeing in our heads. > > Bug: 821012 > Change-Id: I2767cbdb7151d60ea7553e83f51b28ff9340eb94 > Reviewed-on: https://chromium-review.googlesource.com/963141 > Reviewed-by: Florin Malita <fmalita@chromium.org> > Commit-Queue: Mike Klein <mtklein@chromium.org> > Cr-Commit-Position: refs/heads/master@{#543262} TBR=fmalita@chromium.org,herb@chromium.org,mtklein@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 821012 Change-Id: I61c0627270853b8acdcf6b20fca17ba0fd8bba55 Reviewed-on: https://chromium-review.googlesource.com/980293 Commit-Queue: Mike Klein <mtklein@chromium.org> Reviewed-by: Mike Klein <mtklein@chromium.org> Cr-Commit-Position: refs/heads/master@{#545897} [modify] https://crrev.com/811d1a139265b7d8c94fc1690184de2351d9230f/skia/BUILD.gn |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Mar 12 2018