New issue
Advanced search Search tips

Issue 630077 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jan 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Linux , Mac
Pri: 2
Type: Bug



Sign in to add a comment

CPUID does not check AVX2 correctly on OSX

Reported by fnjo...@gmail.com, Jul 21 2016

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36

Steps to reproduce the problem:
1.  Run base::CPU::has_avx2() on a system with AVX2
2. 
3. 

What is the expected behavior?

What went wrong?
The register ECX is not cleared and thus the call for extended features (EAX=7, ECX=0) is not guaranteed and can frequently fail.

The code for __cpuid() should be replaced with __cpuidex() which takes an additional parameter for the sub-function ID which sets ECX, this should be 0x0 for each CPUID call in the current implementation.  For example:

void
__cpuidex (int cpu_info[4], int function_id, int subfunction_id) {
  __asm__ volatile (
    "mov %%ebx, %%edi\n"
    "cpuid\n"
    "xchg %%edi, %%ebx\n"
    : "=a"(cpu_info[0]), "=D"(cpu_info[1]), "=c"(cpu_info[2]), "=d"(cpu_info[3])
    : "a"(function_id), "c"(subfunction_id)
  );
}

Did this work before? No 

Chrome version: 51.0.2704.103  Channel: stable
OS Version: OS X 10.11.6
Flash Version: Shockwave Flash 22.0 r0

This has been reported for libuv but not directly the Chromium base.  Chipset features can be detected alternatively:

$ sysctl -a | grep machdep.cpu.leaf7_features
machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM FPU_CSDS
 

Comment 1 by ajha@chromium.org, Jul 21 2016

Labels: Te-NeedsFurtherTriage

Comment 2 by mark@chromium.org, Aug 4 2016

Owner: mark@chromium.org
Status: Assigned (was: Unconfirmed)
Thanks for the report. You shouldn’t need those funny ebx machinations with a modern compiler, it should be sufficient to write "cpuid\n" with "=b"(cpu_info[1]).

Can you submit a patch to cpu.cc to correct this? (Don’t name the function __cpuidex.)

Comment 3 by fnjo...@gmail.com, Aug 4 2016

__cpuidex is the MSVC intrinsic function that adds ECX support, so following the existing code profile would suggest adding a __cpuidex implementation.  This is proposed in the patch fix-avx2-detection.patch.

Using #ifdef everywhere instead of compatible intrinsics is proposed in the alternative patch fix-avx2-detection-inline.patch.  This is clunky and requires duplication of the CPUID call parameters thus prone to error on future enhancements.

Wrapping CPUID with a new API and forwarding to the intrinsic or inline assembler is proposed by the last patch fix-avx2-detection-wrapped-asm.patch.  This also brings in changes for _xgetbv which is an Intel recommended intrinsic and also an assembler op.

The MSDN documentation page for CPUID for which the existing code was created -- https://msdn.microsoft.com/en-us/library/hskdteyh(v=vs.80).aspx has updated with support for ECX by replacing the singular argument InfoType with arguments function_id and subfunction_id, the patches also reflects this change -- https://msdn.microsoft.com/en-us/library/hskdteyh(v=vs.140).aspx



fix-avx2-detection.patch
1.8 KB Download
fix-avx2-detection-inline.patch
3.6 KB Download
fix-avx2-detection-wrapped-asm.patch
4.0 KB Download

Comment 4 by mark@chromium.org, Aug 4 2016

The double-underscore names are reserved for the implementation so we shouldn’t be defining names there. It’s fine for MSVC to offer __cpuidex on its own because it IS “the implementation,” but we shouldn’t be offering anything by that name on our own. I think that this is a good opportunity to clean the existing violators up.

I think I’ll use your “wrapped” variant as the starting point, unless you want to own the change and “git cl upload” it for review.

Comment 5 by fnjo...@gmail.com, Aug 5 2016

Thanks, I was trying to avoid a full git clone so I'll leave the change to you.
 Issue 667457  has been merged into this issue.
Labels: OS-Linux

Comment 8 by mark@chromium.org, Jan 3 2017

Cc: mark@chromium.org
Labels: -TE-NeedsFurtherTriage -Via-Wizard
Owner: mtklein@chromium.org
https://codereview.chromium.org/2611683002/
Project Member

Comment 9 by bugdroid1@chromium.org, Jan 3 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c0fe8911a5076819ae903e184078245355d0f1b8

commit c0fe8911a5076819ae903e184078245355d0f1b8
Author: mtklein <mtklein@chromium.org>
Date: Tue Jan 03 18:33:43 2017

Set ecx to 0 when calling cpuid.

This should fix AVX2 detection on Linux and Mac.

In order to read the extended feature flags that we want in cpu_info7, we must
set eax to 7 and ecx to 0.  At head we're setting eax to 7 but leaving ecx
unspecified.  This means we only detect AVX2 support if we get very lucky!

I noticed this when looking at UMA: Window shows a reasonable rate of AVX2,
Linux a miniscule one, and Mac zero.  That made no sense to me, typing this
CL on a Mac that supports AVX2.  :)

This makes base/cpu agree with SkCpu, which we've been using to detect AVX2 for
a good while now, in Chromium and elsewhere.  We use __cpuid_count(7,0,...),
which boils down to the same thing: eax=7, ecx=0.

And... clang format at the presubmit's insistence.

BUG= 630077 

Review-Url: https://codereview.chromium.org/2611683002
Cr-Commit-Position: refs/heads/master@{#441167}

[modify] https://crrev.com/c0fe8911a5076819ae903e184078245355d0f1b8/base/cpu.cc

Status: Fixed (was: Assigned)

Sign in to add a comment