
Issue 848488

Starred by 9 users

Issue metadata

Status: Fixed
Owner:
Closed: Sep 10
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




Unimplemented CPUID leaf 0x80000006 causes numpy/openblas to spin CPU forever

Reported by nelh...@nelhage.com, May 31 2018

Issue description

UserAgent: Mozilla/5.0 (X11; CrOS x86_64 10718.4.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.4 Safari/537.36
Platform: 10718.4.0 (Official Build) dev-channel eve

Steps to reproduce the problem:
Install numpy inside the Termina environment and attempt to invert a matrix:

$ sudo apt install python-numpy
$ python -c 'import numpy as np; np.linalg.inv(np.identity(3))'

What is the expected behavior?
`numpy` functions properly and the `python -c` returns quickly.

What went wrong?
`np.linalg.inv` hangs forever.

Did this work before? N/A 

Chrome version: 68.0.3440.4  Channel: dev
OS Version: 10718.4.0
Flash Version: 

I traced the hang. The short story is that openblas uses CPUID to look up the L2 cache size on startup, and uses that size to pick a stride for matrix operations:

https://github.com/xianyi/OpenBLAS/blob/1a49fb1c05b19f5bcfed1922cba9b28fed6d68c2/kernel/setparam-ref.c#L654-L671

CrosVM returns all-0s for that CPUID leaf:

[nelhage@penguin:~]$ cpuid -r -l 0x80000006
CPU 0:
   0x80000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
CPU 1:
   0x80000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
CPU 2:
   0x80000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
CPU 3:
   0x80000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000

openblas does not check for this and happily tries to loop over its input with a stride length of 0, resulting in infinite loops in most operations.
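
For illustration, here is a minimal Python sketch of the failure mode and one possible guard. It is not openblas's actual code: the function names, the `row_bytes` and `fallback_kb` parameters, and the 256 KB fallback are hypothetical. The only hardware detail assumed is that leaf 0x80000006 reports the L2 size in KB in bits 31:16 of ECX.

```python
def l2_size_kb_from_ecx(ecx):
    """Extract the L2 size in KB from ECX of CPUID leaf 0x80000006 (bits 31:16)."""
    return (ecx >> 16) & 0xFFFF


def block_rows(l2_kb, row_bytes=1024, fallback_kb=256):
    """Pick a block size (in rows) that fits in L2.

    Hypothetical guard: if the hypervisor reports 0 KB, fall back to an
    assumed default instead of producing a zero stride.
    """
    if l2_kb == 0:
        l2_kb = fallback_kb
    return max(1, (l2_kb * 1024) // row_bytes)


def count_blocks(n_rows, step, max_iters=1000):
    """Walk n_rows in chunks of `step`.

    Returns the number of iterations, or None if the loop makes no progress
    within max_iters (a real unguarded loop would simply spin forever).
    """
    i = 0
    iters = 0
    while i < n_rows:
        if iters >= max_iters:
            return None
        i += step
        iters += 1
    return iters
```

With the all-zero ECX shown above, `l2_size_kb_from_ecx` yields 0, and `count_blocks(10, 0)` never advances. Routing the size through a guard like `block_rows` keeps the stride positive.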
 

Comment 1 by nelh...@nelhage.com, May 31 2018

fwiw it's been a while since I referenced the Intel manuals in too much detail, but I note that CPUID 0x80000000 returns eax=0x80000008, suggesting that 0x80000006 ought to be supported:

[nelhage@penguin:~]$ cpuid -1 -r -l 0x80000000
CPU:
   0x80000000 0x00: eax=0x80000008 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
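
The same check can be scripted. This is a rough sketch that assumes the `cpuid -r` line format shown above; `parse_cpuid_line` and `leaf_advertised` are hypothetical helper names, not part of any tool.

```python
import re


def parse_cpuid_line(line):
    """Parse one register line of `cpuid -r` output into {register: value}."""
    pairs = re.findall(r"\b(e[abcd]x)=0x([0-9a-fA-F]{8})", line)
    return {name: int(value, 16) for name, value in pairs}


def leaf_advertised(max_ext_leaf, leaf):
    """True if `leaf` lies in the extended range reported by leaf 0x80000000 (EAX)."""
    return 0x80000000 <= leaf <= max_ext_leaf


line = "   0x80000000 0x00: eax=0x80000008 ebx=0x00000000 ecx=0x00000000 edx=0x00000000"
regs = parse_cpuid_line(line)
print(leaf_advertised(regs["eax"], 0x80000006))
```

A leaf being inside the advertised range only says the guest is told it exists; it says nothing about whether the hypervisor fills it with meaningful values.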

It's possible that openblas is technically in the wrong here, and I intend to file a bug with them as well, but given that it presumably works in ~every other modern environment, I suspect this is primarily a Crostini bug.

Components: OS>Systems>Containers
Summary: Unimplemented CPUID leaf 0x80000006 causes numpy/openblas to spin CPU forever (was: 10718.4.0 (Official Build) dev-channel eve)
(Amending subject at nelhage's request.)
Cc: dgreid@chromium.org sonnyrao@chromium.org za...@chromium.org smbar...@chromium.org
Labels: Proj-Containers
Probably missing this from cpuid.rs
Related: https://www.spinics.net/lists/kvm/msg169690.html

That seems to have been posted by another Google engineer just before I reported this ticket.
Labels: Hotlist-Crostini-Platform
Owner: dgreid@chromium.org
Status: Assigned (was: Unconfirmed)
Status: Fixed (was: Assigned)
slava fixed this before he left; I just verified that numpy now completes quickly.
