CPUID does not check AVX2 correctly on OSX
Reported by
fnjo...@gmail.com,
Jul 21 2016
|
|||||
Issue description
UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36
Steps to reproduce the problem:
1. Run base::CPU::has_avx2() on a system with AVX2
2.
3.
What is the expected behavior?
What went wrong?
The register ECX is not cleared and thus the call for extended features (EAX=7, ECX=0) is not guaranteed and can frequently fail.
The code for __cpuid() should be replaced with __cpuidex() which takes an additional parameter for the sub-function ID which sets ECX, this should be 0x0 for each CPUID call in the current implementation. For example:
void
__cpuidex (int cpu_info[4], int function_id, int subfunction_id) {
__asm__ volatile (
"mov %%ebx, %%edi\n"
"cpuid\n"
"xchg %%edi, %%ebx\n"
: "=a"(cpu_info[0]), "=D"(cpu_info[1]), "=c"(cpu_info[2]), "=d"(cpu_info[3])
: "a"(function_id), "c"(subfunction_id)
);
}
Did this work before? No
Chrome version: 51.0.2704.103 Channel: stable
OS Version: OS X 10.11.6
Flash Version: Shockwave Flash 22.0 r0
This has been reported for libuv but not directly the Chromium base. Chipset features can be detected alternatively:
$ sysctl -a | grep machdep.cpu.leaf7_features
machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM FPU_CSDS
,
Aug 4 2016
Thanks for the report. You shouldn’t need those funny ebx machinations with a modern compiler, it should be sufficient to write "cpuid\n" with "=b"(cpu_info[1]). Can you submit a patch to cpu.cc to correct this? (Don’t name the function __cpuidex.)
,
Aug 4 2016
__cpuidex is the MSVC intrinsic function that adds ECX support, so following the existing code profile would suggest adding a __cpuidex implementation. This is proposed in the patch fix-avx2-detection.patch. Using #ifdef everywhere instead of compatible intrinsics is proposed in the alternative patch fix-avx2-detection-inline.patch. This is clunky and requires duplication of the CPUID call parameters thus prone to error on future enhancements. Wrapping CPUID with a new API and forwarding to the intrinsic or inline assembler is proposed by the last patch fix-avx2-detection-wrapped-asm.patch. This also brings in changes for _xgetbv which is an Intel recommended intrinsic and also an assembler op. The MSDN documentation page for CPUID for which the existing code was created -- https://msdn.microsoft.com/en-us/library/hskdteyh(v=vs.80).aspx has updated with support for ECX by replacing the singular argument InfoType with arguments function_id and subfunction_id, the patches also reflects this change -- https://msdn.microsoft.com/en-us/library/hskdteyh(v=vs.140).aspx
,
Aug 4 2016
The double-underscore names are reserved for the implementation so we shouldn’t be defining names there. It’s fine for MSVC to offer __cpuidex on its own because it IS “the implementation,” but we shouldn’t be offering anything by that name on our own. I think that this is a good opportunity to clean the existing violators up. I think I’ll use your “wrapped” variant as the starting point, unless you want to own the change and “git cl upload” it for review.
,
Aug 5 2016
Thanks, I was trying to avoid a full git clone so I'll leave the change to you.
,
Jan 3 2017
Issue 667457 has been merged into this issue.
,
Jan 3 2017
,
Jan 3 2017
https://codereview.chromium.org/2611683002/
,
Jan 3 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c0fe8911a5076819ae903e184078245355d0f1b8 commit c0fe8911a5076819ae903e184078245355d0f1b8 Author: mtklein <mtklein@chromium.org> Date: Tue Jan 03 18:33:43 2017 Set ecx to 0 when calling cpuid. This should fix AVX2 detection on Linux and Mac. In order to read the extended feature flags that we want in cpu_info7, we must set eax to 7 and ecx to 0. At head we're setting eax to 7 but leaving ecx unspecified. This means we only detect AVX2 support if we get very lucky! I noticed this when looking at UMA: Window shows a reasonable rate of AVX2, Linux a miniscule one, and Mac zero. That made no sense to me, typing this CL on a Mac that supports AVX2. :) This makes base/cpu agree with SkCpu, which we've been using to detect AVX2 for a good while now, in Chromium and elsewhere. We use __cpuid_count(7,0,...), which boils down to the same thing: eax=7, ecx=0. And... clang format at the presubmit's insistence. BUG= 630077 Review-Url: https://codereview.chromium.org/2611683002 Cr-Commit-Position: refs/heads/master@{#441167} [modify] https://crrev.com/c0fe8911a5076819ae903e184078245355d0f1b8/base/cpu.cc
,
Jan 3 2017
|
|||||
►
Sign in to add a comment |
|||||
Comment 1 by ajha@chromium.org
, Jul 21 2016