
Issue 800554

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




Increase qemu CPU/Memory

Project Member Reported by ihf@chromium.org, Jan 9 2018

Issue description

Right now we run VMTest using 4 cores. I think we could make things faster; not sure if it would be stable, though. Maybe try it locally first a few times with betty/CTS. As we sometimes run 2 instances (moblab), maybe add a command line option to give the primary and secondary instances different sizes. I don't think we want more than 16 cores, but it would be interesting to chart the speedup of some runs with different numbers of cores.
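For charting, a rough local sweep along these lines would do (a sketch only: the --kvm_smp flag is taken from the later CLs, cros_stop_vm is an assumed helper, and the workload is just one candidate):

  #!/bin/bash
  # Sketch: time one betty/CTS-ish run per core count and record the result.
  for cores in 4 6 8 12 16; do
    ./bin/cros_start_vm --board=betty --kvm_smp="${cores}" \
        --image_path=chromiumos_qemu_image.bin
    { time test_that --iterations 1 --board=betty localhost:9222 \
          cheets_StartAndroid.stress.0 ; } 2> "runtime_${cores}cores.txt"
    ./bin/cros_stop_vm   # assumed helper; stop the VM however your setup does it
  done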
 

Comment 1 by pwang@chromium.org, Jan 18 2018

Status: Started (was: Untriaged)
Summary: Increase qemu CPU/Memory (was: Increase qemu cores from 4 (to 8 or 12))
CL:872102 and CL:872226 increase the qemu cores and memory for the VM.

Project Member Comment 2 by bugdroid1@chromium.org, Jan 18 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crosutils/+/304b658451019a781e5e34e4f3a4143a9856f4c2

commit 304b658451019a781e5e34e4f3a4143a9856f4c2
Author: Po-Hsien Wang <pwang@chromium.org>
Date: Thu Jan 18 20:04:01 2018

cros_vm_lib: Increase qemu memory to 8G

We have set the VM memory to 4G since 2012. Increase the default memory
to 8G, as 6 years have passed.

BUG= chromium:800554 
TEST=./bin/cros_start_vm --board=betty --kvm_ram=16G --image_path=chromiumos_qemu_image.bin
TEST=./bin/cros_start_vm --board=betty --image_path=chromiumos_qemu_image.bin

Change-Id: I36f470c542bdf68b2a7beca861da17302b5b8a22
Reviewed-on: https://chromium-review.googlesource.com/872226
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Commit-Queue: Po-Hsien Wang <pwang@chromium.org>

[modify] https://crrev.com/304b658451019a781e5e34e4f3a4143a9856f4c2/lib/cros_vm_lib.sh
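The effect of the default change, as a standalone sketch (the real logic lives in lib/cros_vm_lib.sh; the names below are illustrative only):

  #!/bin/bash
  # Sketch: default the VM memory to 8G, overridable by the caller (e.g. 16G).
  kvm_ram="${1:-8G}"
  echo "would pass to qemu: -m ${kvm_ram}"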

Comment 3 by ihf@chromium.org, Jan 19 2018

The memory increase dropped the runtime of
time test_that --iterations 1 --board=betty localhost:9222 cheets_StartAndroid.stress.0
from 20 minutes to 14 minutes.

Increasing the CPU count to 8 seems to be a good choice for the above benchmark. Still confirming details; there are a few more parameters we can set.

We should also watch the VMTest runtime at
http://shortn/_Jd0nWCfloD  - pre-cq
http://shortn/_fhMwhqq1La  - paladin
http://shortn/_cBqEVHi1KP  - release
http://shortn/_nkbmRsaTes  - incremental (cts)

Comment 4 by ihf@chromium.org, Jan 20 2018

I did verify that we can go from 14 down to 12 minutes using 8 cores instead of 4. At 6 cores it is in between, while with 12 or 16 we are still at 12 minutes. So 8 looks like a good number for now.

Different CPU feature sets made no difference to the time.

Project Member Comment 5 by bugdroid1@chromium.org, Jan 24 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crosutils/+/741e6d853eee8e70e815de9a6e2e7481db888e1d

commit 741e6d853eee8e70e815de9a6e2e7481db888e1d
Author: Po-Hsien Wang <pwang@chromium.org>
Date: Wed Jan 24 01:41:49 2018

cros_vm_lib: Increase qemu cores to 8

Currently we have 32 cores on the beefy machines, and typically more than
4 cores on bare metal. Increase the default VM core count from 4 to 8
to save some pre-cq time.

BUG= chromium:800554 
TEST=../bin/cros_start_vm --board=betty --kvm_smp=70 --image_path=chromiumos_qemu_image.bin
TEST=../bin/cros_start_vm --board=betty --kvm_smp=8 --image_path=chromiumos_qemu_image.bin

Change-Id: Ib424f87060d29dee7dbc77f23d0221c7253be7b9
Reviewed-on: https://chromium-review.googlesource.com/872102
Commit-Ready: Po-Hsien Wang <pwang@chromium.org>
Tested-by: Po-Hsien Wang <pwang@chromium.org>
Reviewed-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/741e6d853eee8e70e815de9a6e2e7481db888e1d/lib/cros_vm_lib.sh
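Again as a standalone sketch of the new default (illustrative only; the actual plumbing is in lib/cros_vm_lib.sh):

  #!/bin/bash
  # Sketch: default to 8 vCPUs. qemu accepts oversubscription (e.g. --kvm_smp=70),
  # though going far past the host core count is unlikely to help.
  kvm_smp="${1:-8}"
  echo "would pass to qemu: -smp ${kvm_smp}"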

Comment 6 by ihf@chromium.org, Jan 25 2018

Cc: dgarr...@chromium.org
Status: Verified (was: Started)
The 50th percentile runtime was reduced by roughly 9 minutes, from ~3300s to ~2750s.
The 95th percentile declined by 15 minutes, from 3900+s to below 3000s.

Most of the improvement comes from the extra memory, but that is largely because the smoke suite is very light on CPU usage (so far).
Attachment: betty_pre-cq_runtime.png

Comment 7 by ihf@chromium.org, Jan 25 2018

Cc: norvez@chromium.org
FYI

Comment 8 by norvez@chromium.org, Jan 25 2018

Cc: lepton@chromium.org
+lepton@, FYI

Comment 9 by pwang@chromium.org, Jan 25 2018

https://ci.chromium.org/buildbot/chromeos/betty-vmtest-informational/

CTS on betty usually runs ~16 hr but now hits the 20 hr limit. ihf@, has anything changed in CTS?

Comment 10 by lepton@google.com, Jan 25 2018

Has anybody measured the performance impact of the KPTI patch on CTS?

Comment 11 by ihf@chromium.org, Jan 26 2018

The runtime of CTS is not very meaningful, as it still crashes many times and each crash takes a long time. The increase might be caused by different crashes. I have a hard time getting the whole log from logdog, though.

I don't think anybody has measured KPTI. I keep looking for it, though. My take is that in a situation where we have a VM in a VM in a VM, disk access might get slow. I don't have real apples-to-apples numbers, though.

Project Member Comment 12 by bugdroid1@chromium.org, Feb 1 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crosutils/+/93c015d5098ff7205d3c3f2b23f134718aba65e4

commit 93c015d5098ff7205d3c3f2b23f134718aba65e4
Author: Ilja H. Friedel <ihf@chromium.org>
Date: Thu Feb 01 10:41:45 2018

Relax and statically verify QEMU cpu features.

Remove pinetrail code which is obsolete. Also do not use qemu64
CPU by default, which has a very limited feature set. Instead
request by default to use the Haswell CPU feature set, which is
the lowest common denominator on Chrome OS infrastructure.

Adding "check" should warn the user when the requested features
are not available locally.

Furthermore add command line option "kvm_cpu" to allow arbitrary
CPU feature set selection.

BUG= chromium:800554 
TEST=Started VM with different options.

Change-Id: I323131fcb56d092fc7608afc389d3e185fb45562
Reviewed-on: https://chromium-review.googlesource.com/877622
Commit-Ready: Ilja H. Friedel <ihf@chromium.org>
Tested-by: Ilja H. Friedel <ihf@chromium.org>
Reviewed-by: Dominik Behr <dbehr@chromium.org>

[modify] https://crrev.com/93c015d5098ff7205d3c3f2b23f134718aba65e4/lib/cros_vm_lib.sh
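A hedged sketch of the resulting qemu arguments (only the Haswell default, the "check" option, and the kvm_cpu override come from the commit message; everything else is illustrative):

  #!/bin/bash
  # Sketch: request the Haswell feature set by default, let qemu warn ("check")
  # if the host lacks a requested feature, and allow an arbitrary override.
  kvm_cpu="${1:-Haswell}"
  echo "would pass to qemu: -enable-kvm -cpu ${kvm_cpu},check"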

I'm seeing warnings about unsupported host CPU features on the paladins and canaries:

"
warning: host doesn't support requested feature: CPUID.01H:ECX.fma [bit 12]
warning: host doesn't support requested feature: CPUID.01H:ECX.movbe [bit 22]
warning: host doesn't support requested feature: CPUID.01H:ECX.f16c [bit 29]
warning: host doesn't support requested feature: CPUID.01H:ECX.rdrand [bit 30]
warning: host doesn't support requested feature: CPUID.07H:EBX.fsgsbase [bit 0]
warning: host doesn't support requested feature: CPUID.07H:EBX.bmi1 [bit 3]
warning: host doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5]
warning: host doesn't support requested feature: CPUID.07H:EBX.smep [bit 7]
warning: host doesn't support requested feature: CPUID.07H:EBX.bmi2 [bit 8]
warning: host doesn't support requested feature: CPUID.07H:EBX.erms [bit 9]
warning: host doesn't support requested feature: CPUID.80000001H:ECX.abm [bit 5]
"

For example https://luci-milo.appspot.com/buildbot/chromeos/betty-paladin/2082
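A quick way to see which of the warned-about features a given host actually reports (the CPUID names above correspond to these /proc/cpuinfo flag names):

  #!/bin/bash
  # Sketch: check the warned-about features against this host's /proc/cpuinfo.
  for feat in fma movbe f16c rdrand fsgsbase bmi1 avx2 smep bmi2 erms abm; do
    grep -qw "${feat}" /proc/cpuinfo && echo "${feat}: present" || echo "${feat}: MISSING"
  done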
Status: Started (was: Verified)
Did some queries against all the pre-cq builders listed in https://luci-milo.appspot.com/buildbot/chromiumos.tryserver/pre_cq/

GCE builders have two sets of flags:
set(['fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc eagerfpu pni pclmulqdq vmx ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm xsaveopt tpr_shadow flexpriority ept fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms', 
     'fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm xsaveopt invpcid_single tpr_shadow flexpriority ept fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid'])

Bare-metal builders have two sets of flags:
set(['fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid',
     'fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms'])
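For reference, a sketch of how to compute the lowest common denominator across such dumps (the cpuinfo/*.txt paths are hypothetical):

  #!/bin/bash
  # Sketch: intersect the "flags" lines of several /proc/cpuinfo dumps.
  common=""
  for f in cpuinfo/*.txt; do
    flags="$(grep -m1 '^flags' "$f" | cut -d: -f2 | tr ' ' '\n' | sed '/^$/d' | sort -u)"
    if [[ -z "${common}" ]]; then
      common="${flags}"
    else
      common="$(comm -12 <(echo "${common}") <(echo "${flags}"))"
    fi
  done
  echo "Common flags:" ${common}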

Comment 16 by ihf@chromium.org, Feb 2 2018

Looks like we have to shake out lots of slightly different servers (and find the lowest common denominator).
Cc: ihf@chromium.org vapier@chromium.org pwang@chromium.org achuith@chromium.org dbehr@chromium.org
Issue 809782 has been merged into this issue.

Comment 18 by ihf@chromium.org, Feb 7 2018

Achuith's dev machine shows the same output as #13.

Comment 19 by ihf@chromium.org, Feb 7 2018

In other words, Achuith's machine and some builders are still Sandy Bridge, not Haswell. So let's downgrade to that, with a TODO to go back to Haswell once SNB is phased out (see the command sketch below the flags).
Achuith's machine has flags similar to the bare-metal builders I posted above:

flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb kaiser tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts
model		: 45
model name	: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
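A hedged example of what the interim downgrade looks like from the command line (--kvm_cpu comes from CL:877622; whether cros_start_vm forwards it like this is an assumption):

  ./bin/cros_start_vm --board=betty --kvm_cpu=SandyBridge \
      --image_path=chromiumos_qemu_image.bin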
Could you run `lscpu` and paste the output?

Comment 22 Deleted

Labels: -Pri-3 Pri-2
achuith@achuithz620:~$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  2
Core(s) per socket:  8
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               45
Model name:          Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
Stepping:            7
CPU MHz:             2120.306
CPU max MHz:         3800.0000
CPU min MHz:         1200.0000
BogoMIPS:            5785.72
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            20480K
NUMA node0 CPU(s):   0-7,16-23
NUMA node1 CPU(s):   8-15,24-31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb kaiser tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts

Comment 25 by ihf@chromium.org, Feb 9 2018

Cc: wonderfly@google.com
