AVX2 is not detected correctly on Windows. |
|||||
Issue descriptionThe __cpuid() from clang-cl/msvc does not return the information necessary to check for AVX2 support. You must use __cpuidex(). https://docs.microsoft.com/en-us/cpp/intrinsics/cpuid-cpuidex Found after discussion with libaom and dav1d teams on why our AVX2 rates are so low on Windows... especially for a feature from 2013. The raw cpuid opcode from other platforms works correctly AFAICT.
,
Nov 2
😮
,
Nov 2
I can't find anyone using this code, so I think this just affects the UMA.
,
Nov 2
😑
,
Nov 2
libyuv, libaom, skia, and libvpx all use the correct __cpuidex, ffmpeg always uses the raw cpuid opcode. So everyone but base/cpu.cc seems to be doing it correct and no one calls has_avx2 per code search.
,
Nov 2
CrOS is looking suspicious too, so maybe something is going on there too: https://uma.googleplex.com/p/chrome/timeline_v2/?sid=99837e62288561a691d5b78b4e6d4fd4 Mac, Linux both look good though.
,
Nov 2
+dgreid who might know about avx2 status on Chromebooks or know who to route to; maybe it's disabled at OS layer?
,
Nov 2
Hmm, looks correct on my x86 Chromebook, so maybe x86 devices are rare or it's disabled on many.
,
Nov 3
I wouldn't be surprised if the avx2 usage across all of chrome OS is low. It's not supported in Braswell or BayTrail which are the most common devices by Far.
,
Nov 3
"The __cpuid() from clang-cl/msvc does not return" means that clang-cl and msvc both have this behavior, yes? Do you have a small code snippet that has different behavior on clang-cl and clang?
,
Nov 5
Actually, it looks like this might be a clang-cl bug. Here are the results for a toy program: ASUS ZenBook: $ cpuid-test-clang-cl.exe __cpuid(7) = 00000000 00000000 00000000 00000000 __cpuidex(7,0) = 00000000 021c27ab 00000000 0c000000 $ cpuid-test-cl.exe __cpuid(7) = 00000000 021c27ab 00000000 0c000000 __cpuidex(7,0) = 00000000 021c27ab 00000000 0c000000 Z420 machine: $ cpuid-test-clang-cl.exe __cpuid(7) = 00000000 00000000 00000000 00000000 __cpuidex(7,0) = 00000000 00000281 00000000 00000000 $ cpuid-test-cl.exe __cpuid(7) = 00000000 00000281 00000000 00000000 __cpuidex(7,0) = 00000000 00000281 00000000 00000000 It looks like the clang-cl __cpuid() isn't working the same as msvc's CL (using latest Chrome toolchain versions)
,
Nov 6
rnk: dalecurtis says that clang-cl's __cpuid() doesn't work. Repro program:
#include <cstring>
#include <intrin.h>
#include <stdio.h>
int main(void) {
int cpu_info7[4] = {0};
__cpuid(cpu_info7, 7);
printf("__cpuid(7) = %08x %08x %08x %08x\n", cpu_info7[0], cpu_info7[1],
cpu_info7[2], cpu_info7[3]);
memset(cpu_info7, 0, sizeof(cpu_info7));
__cpuidex(cpu_info7, 7, 0);
printf("__cpuidex(7,0) = %08x %08x %08x %08x\n", cpu_info7[0], cpu_info7[1],
cpu_info7[2], cpu_info7[3]);
return 0;
}
ASUS ZenBook:
$ cpuid-test-clang-cl.exe
__cpuid(7) = 00000000 00000000 00000000 00000000
__cpuidex(7,0) = 00000000 021c27ab 00000000 0c000000
$ cpuid-test-cl.exe
__cpuid(7) = 00000000 021c27ab 00000000 0c000000
__cpuidex(7,0) = 00000000 021c27ab 00000000 0c000000
,
Nov 6
Yeah, looks like we need to zero out ECX. I guess that was some new requirement introduced when AVX came out in 2013. The history is like this: - r186696, 2013-07-19, Roman Divaky commits cpuid.h for John Baldwin, implements __cpuid for GCC compatibility - r199439, 2014-01-16, Hans Wennborg copies the (presumed to be correct) __cpuid implementation over to intrin.h for MSVC compatibility I guess we based our implementation on one that predates AVX. cpuid is such an old instruction, we never suspected the "API" would change and the old implementation would become incorrect. We can get a fix in the next clang roll, but I see you are using __cpuidex as a workaround.
,
Nov 6
Do you recommend landing that workaround or should we just wait for the clang roll? It's not a huge deal. I do see that every other MSVC cpu flags test in Chromium is already using cpuidex, whether or not they were aware of this issue -- so it seems a fine patch to land.
,
Nov 6
I was thinking would be best to land the __cpuidex change separately so that it could be reverted independently from the clang roll if it causes breakage. However, there's kind of no going back once we roll in the fix, since going back to __cpuid will still enable AVX2. I think we're going to try to make a roll this week, so let's just wait for that. If there's breakage, we'll need to come up with a new patch to turn off AVX2 until the issues are debugged.
,
Nov 6
This got fixed upstream with r346265: $ clang-cl -O1 t.cpp -Fot.obj -Fet.exe && ./t.exe __cpuid(7) = 00000000 021cbfbb 00000000 00000000 __cpuidex(7,0) = 00000000 021cbfbb 00000000 00000000 The chromium.clang waterfall isn't healthy right now (https://ci.chromium.org/p/chromium/g/chromium.clang/console), so let's wait until tomorrow and see if we can make packages.
,
Nov 14
Fixed, thanks! https://uma.googleplex.com/p/chrome/timeline_v2/?sid=3bdbe56b4a520b8bdd2e230667f6dd90 |
|||||
►
Sign in to add a comment |
|||||
Comment 1 by dalecur...@chromium.org
, Nov 2