Issue 685331


Issue metadata

Status: Fixed
Owner:
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug




Crashes in intel_pmu_lbr_read

Project Member Reported by groeck@chromium.org, Jan 25 2017

Issue description

The crash reporter shows a large number of crashes in intel_pmu_lbr_read.

<1>[ 1142.079575] BUG: unable to handle kernel NULL pointer dereference at 0000000000000019
<1>[ 1142.079600] IP: [<ffffffff87c17f08>] intel_pmu_lbr_read+0x112/0x3bd
<4>[ 1142.079620] PGD 0
<4>[ 1142.079627] Oops: 0000 [#1] PREEMPT SMP
<0>[ 1142.080012] gsmi: Log Shutdown Reason 0x03
<4>[ 1142.080012] Modules linked in: ip6t_REJECT nf_reject_ipv6 ip6table_mangle xt_TCPMSS veth uinput snd_soc_sst_cht_bsw_rt5645 memconsole_x86_legacy memconsole snd_hda_codec_hdmi snd_hda_intel snd_intel_sst_acpi snd_soc_sst_acpi snd_hda_codec snd_hwdep snd_hda_core snd_intel_sst_core snd_soc_rt5645 snd_soc_sst_mfld_platform snd_soc_rl6231 ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat rfcomm xt_mark bridge stp llc fuse zram ccm ip6table_filter snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device iwlmvm iwlwifi iwl7000_mac80211 cfg80211 btusb btrtl btbcm btintel bluetooth uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core joydev
<4>[ 1142.080012] CPU: 1 PID: 9596 Comm: sleep Tainted: G W 3.18.0-13733-g6a96b8f #1
<4>[ 1142.080012] Hardware name: GOOGLE Celes, BIOS Google_Celes.7287.92.60 11/06/2016
<4>[ 1142.080012] task: ffff880004dee460 ti: ffff880002ce4000 task.ti: ffff880002ce4000
<4>[ 1142.080012] RIP: 0010:[<ffffffff87c17f08>] [<ffffffff87c17f08>] intel_pmu_lbr_read+0x112/0x3bd
<4>[ 1142.080012] RSP: 0018:ffff88007cb05c10 EFLAGS: 00010006
<4>[ 1142.080012] RAX: 0000000000000000 RBX: ffff88007cb0ab40 RCX: 00000000000001c9
<4>[ 1142.080012] RDX: 0000000000000005 RSI: ffff88007cb0ab40 RDI: 0000000000000340
<4>[ 1142.080012] RBP: ffff88007cb05c70 R08: 0000000000000000 R09: ffffffff885a78ec
<4>[ 1142.080012] R10: 0000000000000007 R11: 0000000000000005 R12: 0000000000000005
<4>[ 1142.080012] R13: 0000000000000008 R14: 00000109e9507f00 R15: 0000000000000001
<4>[ 1142.080012] FS: 0000000000000000(0000) GS:ffff88007cb00000(0000) knlGS:0000000000000000
<4>[ 1142.080012] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[ 1142.080012] CR2: 0000000000000019 CR3: 0000000002a8a000 CR4: 00000000001007e0
<4>[ 1142.080012] Stack:
<4>[ 1142.080012] 000000000000ab40 ffff88007cb05c80 0000000000000007 00000000ffffffff
<4>[ 1142.080012] ffff880000000001 0000000000000005 ffff88007cb0ab40 000000000000ab40
<4>[ 1142.080012] ffff88007cb0ab40 0000000000000000 00000109e9507f34 ffff88007cb0b290
<4>[ 1142.080012] Call Trace:
<4>[ 1142.080012] <NMI>
<4>[ 1142.080012] [<ffffffff87c1acb4>] intel_pmu_handle_irq+0xe3/0x3fd
<4>[ 1142.080012] [<ffffffff87c13e9c>] perf_event_nmi_handler+0x25/0x3e
<4>[ 1142.080012] [<ffffffff87c0a463>] ? native_sched_clock+0x3a/0x3c
<4>[ 1142.080012] [<ffffffff87c13e9c>] ? perf_event_nmi_handler+0x25/0x3e
<4>[ 1142.080012] [<ffffffff87c06010>] nmi_handle+0x6c/0x13a
<4>[ 1142.080012] [<ffffffff87c2d35a>] ? cpumask_clear_cpu.constprop.1+0x11/0x11
<4>[ 1142.080012] [<ffffffff87c061bb>] do_nmi+0xdd/0x31a
<4>[ 1142.080012] [<ffffffff882a7a75>] end_repeat_nmi+0x1a/0x1e
<4>[ 1142.080012] [<ffffffff87d17b82>] ? __do_page_cache_readahead+0xc5/0x23a
<4>[ 1142.080012] [<ffffffff87d17b82>] ? __do_page_cache_readahead+0xc5/0x23a
<4>[ 1142.080012] [<ffffffff87d17b82>] ? __do_page_cache_readahead+0xc5/0x23a
<4>[ 1142.080012] <<EOE>>
<4>[ 1142.080012] [<ffffffff87d0eae1>] ? pagecache_get_page+0x32/0x15d
<4>[ 1142.080012] [<ffffffff87d102a9>] filemap_fault+0x1bc/0x3be
<4>[ 1142.080012] [<ffffffff87d2fa3d>] __do_fault+0x48/0x85
<4>[ 1142.080012] [<ffffffff87d2c08c>] ? vma_interval_tree_augment_rotate+0x4d/0x4d
<4>[ 1142.080012] [<ffffffff87d314d5>] do_cow_fault.isra.83+0x7d/0x155
<4>[ 1142.080012] [<ffffffff87d332b0>] handle_mm_fault+0x227/0x76b
<4>[ 1142.080012] [<ffffffff87c36550>] __do_page_fault+0x29c/0x396
<4>[ 1142.080012] [<ffffffff87d05c19>] ? perf_event_aux+0xe0/0x111
<4>[ 1142.080012] [<ffffffff87d0a509>] ? perf_event_mmap+0x282/0x2a3
<4>[ 1142.080012] [<ffffffff87c3667c>] do_page_fault+0xc/0xe
<4>[ 1142.080012] [<ffffffff882a7702>] page_fault+0x22/0x30
<4>[ 1142.080012] [<ffffffff87e7f478>] ? __clear_user+0x36/0x5b
<4>[ 1142.080012] [<ffffffff87e7f459>] ? __clear_user+0x17/0x5b
<4>[ 1142.080012] [<ffffffff87e7f4cc>] clear_user+0x2f/0x31
<4>[ 1142.080012] [<ffffffff87d95787>] load_elf_binary+0x8ce/0x1602
<4>[ 1142.080012] [<ffffffff87d5722f>] search_binary_handler+0x86/0x191
<4>[ 1142.080012] [<ffffffff87d93bdc>] load_script+0x1ca/0x1ec
<4>[ 1142.080012] [<ffffffff87d57762>] ? copy_strings.isra.18+0x1ed/0x2fd
<4>[ 1142.080012] [<ffffffff87d5722f>] search_binary_handler+0x86/0x191
<4>[ 1142.080012] [<ffffffff87d586a6>] do_execve_common.isra.26+0x412/0x622
<4>[ 1142.080012] [<ffffffff87d588ce>] do_execve+0x18/0x1a
<4>[ 1142.080012] [<ffffffff87d58b55>] SyS_execve+0x2a/0x2e
<4>[ 1142.080012] [<ffffffff882a6279>] stub_execve+0x69/0xa0
<4>[ 1142.080012] Code: 0f b6 c2 89 45 c0 0f 32 41 89 c3 48 8b 83 58 0a 00 00 48 c1 e2 20 4c 09 da 4c 63 7d c0 44 8a 35 5f ab cd 00 48 89 55 c8 49 89 d4 <f6> 40 19 02 8b 05 26 ab cd 00 44 0f 45 6d c8 45 31 d2 89 45 b8
<1>[ 1142.080012] RIP [<ffffffff87c17f08>] intel_pmu_lbr_read+0x112/0x3bd
<4>[ 1142.080012] RSP <ffff88007cb05c10>
<4>[ 1142.080012] CR2: 0000000000000019
<4>[ 1142.080012] ---[ end trace 97765f049d5f2f0d ]---
<1>[ 1142.080428] BUG: unable to handle kernel NULL pointer dereference at 0000000000000019
<1>[ 1142.080428] IP: [<ffffffff87c17f08>] intel_pmu_lbr_read+0x112/0x3bd
<4>[ 1142.080428] PGD 0
<4>[ 1142.080428] Oops: 0000 [#2] PREEMPT SMP

Seen on Quawks (BayTrail), Squawks (BayTrail), Celes (Braswell), Reks (Braswell), Setzer (Braswell), and possibly others.
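
The register dump above is enough to confirm that this is a NULL base pointer plus a small field offset, not a wild pointer. The sketch below is an illustration written for this report, not part of any kernel tooling, and the helper names are hypothetical. It takes the bytes at the faulting RIP from the Code: line and decodes only the one instruction form seen here (opcode F6 /0 ib with an 8-bit displacement).

```python
# Values copied from the oops above.
CODE = ("0f b6 c2 89 45 c0 0f 32 41 89 c3 48 8b 83 58 0a 00 00 48 c1 e2 20 "
        "4c 09 da 4c 63 7d c0 44 8a 35 5f ab cd 00 48 89 55 c8 49 89 d4 "
        "<f6> 40 19 02 8b 05 26 ab cd 00 44 0f 45 6d c8 45 31 d2 89 45 b8")
RAX = 0x0000000000000000
CR2 = 0x0000000000000019


def faulting_bytes(code_line):
    """Return the bytes from the <xx> marker onward (the marker is the faulting RIP)."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_test_disp8(insn):
    """Decode 'test $imm8, disp8(%reg)' (F6 /0 ib, ModRM mod=01) and nothing else."""
    opcode, modrm, disp, imm = insn[0], insn[1], insn[2], insn[3]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0xF6 or reg != 0 or mod != 1 or rm == 4:  # rm == 4 means a SIB byte
        raise ValueError("unsupported instruction form")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return rm, disp, imm


rm, disp, imm = decode_test_disp8(faulting_bytes(CODE))
fault_address = RAX + disp  # rm == 0 selects RAX when there is no REX prefix
print(f"test ${imm:#x}, {disp:#x}(%rax) -> fault at {fault_address:#x}, CR2 {CR2:#x}")
```

With RAX = 0 the decoded displacement equals CR2, so the handler tested a byte 0x19 bytes into a structure through a NULL pointer.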

 

Comment 1 by groeck@chromium.org, Jan 25 2017

Status: Started (was: Assigned)

Comment 2 by groeck@chromium.org, Jan 25 2017

Labels: Kernel-4.4 Kernel-3.18

Comment 3 by groeck@chromium.org, Jan 25 2017

Also seen on Banon (Braswell) and Terra (Braswell).
Project Member

Comment 4 by bugdroid1@chromium.org, Jan 26 2017

Labels: merge-merged-chromeos-3.18
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/3a12e288cc4d6d661c9e3126262d50be2e6c4567

commit 3a12e288cc4d6d661c9e3126262d50be2e6c4567
Author: Kan Liang <kan.liang@intel.com>
Date: Mon Aug 17 12:37:31 2015

UPSTREAM: perf/x86/intel: Fix LBR callstack issue caused by FREEZE_LBRS_ON_PMI

This patch fixes an issue which was introduced by commit
1a78d93750bb5f61abdc59a91fc3bd06a214542a ("perf/x86/intel: Streamline
LBR MSR handling in PMI").

The old patch not only avoids writing LBR_SELECT MSR in PMI, but also
avoids updating lbr_select variable. So in PMI, FREEZE_LBRS_ON_PMI bit
is always mistakenly set for IA32_DEBUGCTLMSR MSR, which causes
superfluous increase/decrease of LBR_TOS when collecting LBR callstack.

BUG= chromium:685331 
TEST=Build and run

Change-Id: I2e719075437318b7e1fef9ae481492f8d6008feb
Reported-by: Milian Wolff <mail@milianw.de>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1439815051-8616-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit deb27519bf1f)
Reviewed-on: https://chromium-review.googlesource.com/432937
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/3a12e288cc4d6d661c9e3126262d50be2e6c4567/arch/x86/kernel/cpu/perf_event_intel_lbr.c
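
The FREEZE_LBRS_ON_PMI regression described in this commit message can be modelled in a few lines. This is a simplified sketch, not the kernel function: the two function names are hypothetical, and the bit positions are written from memory of the kernel headers and should be treated as assumptions.

```python
# Bit positions as recalled from the kernel headers; verify them before reuse.
DEBUGCTLMSR_LBR = 1 << 0
DEBUGCTLMSR_FREEZE_LBRS_ON_PMI = 1 << 11
LBR_CALL_STACK = 1 << 9  # LBR_SELECT bit that enables call-stack mode


def debugctl_before_fix(debugctl, lbr_sel_config, pmi):
    """Model of the regression: lbr_select is only refreshed outside a PMI."""
    lbr_select = 0
    if not pmi:
        lbr_select = lbr_sel_config  # the real code also writes LBR_SELECT here
    debugctl |= DEBUGCTLMSR_LBR
    if not (lbr_select & LBR_CALL_STACK):
        debugctl |= DEBUGCTLMSR_FREEZE_LBRS_ON_PMI
    return debugctl


def debugctl_after_fix(debugctl, lbr_sel_config, pmi):
    """Model of the fix: always refresh lbr_select; only the MSR write depends on pmi."""
    lbr_select = lbr_sel_config
    debugctl |= DEBUGCTLMSR_LBR
    if not (lbr_select & LBR_CALL_STACK):
        debugctl |= DEBUGCTLMSR_FREEZE_LBRS_ON_PMI
    return debugctl


in_pmi_old = debugctl_before_fix(0, LBR_CALL_STACK, pmi=True)
in_pmi_new = debugctl_after_fix(0, LBR_CALL_STACK, pmi=True)
print(bool(in_pmi_old & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI),
      bool(in_pmi_new & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI))
```

In this model the old path sets the freeze bit inside a PMI even for a call-stack user, because `lbr_select` is still zero there. The fixed path leaves the bit clear.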

Project Member

Comment 6 by bugdroid1@chromium.org, Jan 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/092e2d51cd180bb4205fe35c615f4a9061271110

commit 092e2d51cd180bb4205fe35c615f4a9061271110
Author: Kan Liang <kan.liang@intel.com>
Date: Mon Sep 14 14:14:07 2015

UPSTREAM: perf/x86/intel: Fix static checker warning in lbr enable

Commit deb27519bf1f ("perf/x86/intel: Fix LBR callstack issue caused
by FREEZE_LBRS_ON_PMI") leads to the following Smatch complaint:

   warn: variable dereferenced before check 'cpuc->lbr_sel' (see line 154)

Fix the warning.

BUG= chromium:685331 
TEST=Build and run

Change-Id: I6b470e7d1cdaf306cc7a5dd7cc479ccadabbf208
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: deb27519bf1f ("perf/x86/intel: Fix LBR callstack issue caused by FREEZE_LBRS_ON_PMI")
Link: http://lkml.kernel.org/r/1442240047-48149-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 96f3eda67fcf)
Reviewed-on: https://chromium-review.googlesource.com/432938
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/092e2d51cd180bb4205fe35c615f4a9061271110/arch/x86/kernel/cpu/perf_event_intel_lbr.c

Project Member

Comment 7 by bugdroid1@chromium.org, Jan 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/4ee1f38a42642c12953489b092daff6c10f5bbf1

commit 4ee1f38a42642c12953489b092daff6c10f5bbf1
Author: Andi Kleen <ak@linux.intel.com>
Date: Tue Oct 20 18:46:33 2015

UPSTREAM: perf/x86: Fix LBR call stack save/restore

This fixes a bug I added in the following commit:

  90405aa02247 ("perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode")

The bug could lead to lost LBR call stacks. When restoring the LBR state
we need to use the TOS of the previous context, not the current context.
To do that we need to save/restore the TOS.

BUG= chromium:685331 
TEST=Build and run

Change-Id: Iaa7ede615fd6fcf5e14467371934d9b1c01eaa37
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1445366797-30894-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit b28ae9560b69)
Reviewed-on: https://chromium-review.googlesource.com/432939
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/4ee1f38a42642c12953489b092daff6c10f5bbf1/arch/x86/kernel/cpu/perf_event_intel_lbr.c
[modify] https://crrev.com/4ee1f38a42642c12953489b092daff6c10f5bbf1/arch/x86/kernel/cpu/perf_event.h
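
The effect of restoring with the wrong TOS can be shown with a toy model of the LBR ring. Everything here is an assumption made for illustration (the ring size, the class and function names, and the dictionary used as task context). It only mirrors the idea in the commit message that the saved entries must be written back relative to the TOS that was current when they were saved.

```python
LBR_NR = 8          # assumed ring size; must be a power of two for the mask
MASK = LBR_NR - 1


class LbrRing:
    """Stand-in for the per-CPU LBR registers plus the TOS register."""

    def __init__(self):
        self.entries = [0] * LBR_NR
        self.tos = 0


def lbr_save(hw):
    """Copy the live call-stack entries and remember where the stack top was."""
    ctx = {"tos": hw.tos, "entries": [0] * LBR_NR}
    for i in range(hw.tos):
        ctx["entries"][i] = hw.entries[(hw.tos - i) & MASK]
    return ctx


def lbr_restore(hw, ctx, use_saved_tos):
    """Write the saved entries back, indexed by either the saved or the current TOS."""
    tos = ctx["tos"] if use_saved_tos else hw.tos
    for i in range(tos):
        hw.entries[(tos - i) & MASK] = ctx["entries"][i]
    if use_saved_tos:
        hw.tos = tos


def demo(use_saved_tos):
    task = LbrRing()
    task.entries[1:6] = [11, 12, 13, 14, 15]
    task.tos = 5
    ctx = lbr_save(task)

    cpu = LbrRing()   # state left behind by another context
    cpu.tos = 2
    lbr_restore(cpu, ctx, use_saved_tos)
    return cpu.entries, cpu.tos


print("current TOS:", demo(False))
print("saved TOS:  ", demo(True))
```

Using the current TOS restores only two of the five saved entries and puts them in the wrong slots. Using the saved TOS reproduces the original ring and stack top.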

Project Member

Comment 8 by bugdroid1@chromium.org, Jan 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/fb4a36dec0dd562139f2a983bcccf04369cade07

commit fb4a36dec0dd562139f2a983bcccf04369cade07
Author: Andi Kleen <ak@linux.intel.com>
Date: Tue Oct 20 18:46:34 2015

UPSTREAM: perf/x86: Add option to disable reading branch flags/cycles

With LBRv5 reading the extra LBR flags like mispredict, TSX, cycles is
not free anymore, as it has moved to a separate MSR.

For callstack mode we don't need any of this information; so we can
avoid the unnecessary MSR read. Add flags to the perf interface where
perf record can request not collecting this information.

Add branch_sample_type flags for CYCLES and FLAGS. It's a bit unusual
for branch_sample_types to be negative (disable), not positive (enable),
but since the legacy ABI reported the flags we need some form of
explicit disabling to avoid breaking the ABI.

After we have the flags the x86 perf code can keep track if any users
need the flags. If no one needs it, the information is not collected.

This cuts down the cost of LBR callstack on Skylake significantly.
Profiling a kernel build with LBR call stack the average run time of
the PMI handler drops by 43%.

BUG= chromium:685331 
TEST=Build and run

Change-Id: Ibd482059493dd522063c907b170a5bab4454db37
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1445366797-30894-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit b16a5b52eb90)
Reviewed-on: https://chromium-review.googlesource.com/432940
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/fb4a36dec0dd562139f2a983bcccf04369cade07/arch/x86/kernel/cpu/perf_event_intel_lbr.c
[modify] https://crrev.com/fb4a36dec0dd562139f2a983bcccf04369cade07/include/uapi/linux/perf_event.h
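
The "negative" flags described in this commit message can be sketched as follows. The constant names match the perf UAPI header, but the bit positions are written from memory and should be checked against include/uapi/linux/perf_event.h. The two helper functions are hypothetical, and the rule that an event must set both flags to opt out is an assumption about how the x86 code combines them.

```python
# Bit positions as recalled from include/uapi/linux/perf_event.h; verify before reuse.
PERF_SAMPLE_BRANCH_USER = 1 << 0
PERF_SAMPLE_BRANCH_CALL_STACK = 1 << 11
PERF_SAMPLE_BRANCH_NO_FLAGS = 1 << 14
PERF_SAMPLE_BRANCH_NO_CYCLES = 1 << 15

NO_INFO = PERF_SAMPLE_BRANCH_NO_FLAGS | PERF_SAMPLE_BRANCH_NO_CYCLES


def event_needs_info(branch_sample_type):
    """An event opts out only if it disables both flags and cycles."""
    return (branch_sample_type & NO_INFO) != NO_INFO


def must_read_info(branch_sample_types):
    """The extra MSR read is needed as soon as one active user wants the information."""
    return any(event_needs_info(t) for t in branch_sample_types)


callstack = PERF_SAMPLE_BRANCH_USER | PERF_SAMPLE_BRANCH_CALL_STACK | NO_INFO
legacy = PERF_SAMPLE_BRANCH_USER  # old ABI: flags are reported unless disabled

print(must_read_info([callstack]), must_read_info([callstack, legacy]))
```

Because the opt-out is expressed as extra bits, an existing user that sets neither flag keeps getting the information, which is the ABI-compatibility point the commit message makes.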

Project Member

Comment 9 by bugdroid1@chromium.org, Jan 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/4dfe943811910cc89dddf45613ea87011b5fab67

commit 4dfe943811910cc89dddf45613ea87011b5fab67
Author: Stephane Eranian <eranian@google.com>
Date: Thu Dec 03 22:33:17 2015

UPSTREAM: perf/x86: Fix LBR related crashes on Intel Atom

This patch fixes the LBR kernel crashes on Intel Atom.

The kernel was assuming that if the CPU supports 64-bit format
LBR, then it has an LBR_SELECT MSR. Atom uses 64-bit LBR format
but does not have LBR_SELECT. That was causing NULL pointer
dereferences in a couple of places.

BUG= chromium:685331 
TEST=Build and run

Change-Id: I830effd68372fd1c74bd6a63bd68f44993952de9
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Fixes: 96f3eda67fcf ("perf/x86/intel: Fix static checker warning in lbr enable")
Link: http://lkml.kernel.org/r/1449182000-31524-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 6fc2e83077b0)
Reviewed-on: https://chromium-review.googlesource.com/432941
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/4dfe943811910cc89dddf45613ea87011b5fab67/arch/x86/kernel/cpu/perf_event_intel_lbr.c
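
The faulty assumption named in this commit message (64-bit LBR format implies an LBR_SELECT MSR) can be illustrated with a small model. The class and function names are hypothetical and this is not the kernel code. `None` stands in for the NULL `lbr_sel` pointer on a CPU without LBR_SELECT, and the Python `AttributeError` stands in for the kernel NULL pointer dereference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LbrSelect:
    config: int


@dataclass
class CpuState:
    lbr_format_64bit: bool
    lbr_sel: Optional[LbrSelect]  # None models a CPU without an LBR_SELECT MSR


def lbr_select_unguarded(cpu):
    """Assumes every CPU with the 64-bit LBR format also has LBR_SELECT."""
    if cpu.lbr_format_64bit:
        return cpu.lbr_sel.config  # fails when lbr_sel is None
    return 0


def lbr_select_guarded(cpu):
    """Checks the pointer itself instead of inferring it from the LBR format."""
    return cpu.lbr_sel.config if cpu.lbr_sel is not None else 0


atom_like = CpuState(lbr_format_64bit=True, lbr_sel=None)
core_like = CpuState(lbr_format_64bit=True, lbr_sel=LbrSelect(config=0x200))

print(lbr_select_guarded(atom_like), hex(lbr_select_guarded(core_like)))
try:
    lbr_select_unguarded(atom_like)
except AttributeError as exc:
    print("unguarded access failed:", exc)
```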

Cc: gmx@chromium.org sonnyrao@chromium.org
+gmx as FYI
Thanks to Guenter for noticing and fixing!

Comment 11 by gmx@google.com, Jan 26 2017

Yes, thanks for noticing and fixing.

I enabled LBR stack profiling for Silvermont and Airmont devices last week when I enabled it for Skylake, because the kernel seemed to have PMU support for these architectures, but I only tested on Skylake.

I wonder if this impacts older kernels as well? There are a number of BayTrail devices on kernel 3.10.
Re #11: we don't have LBR on 3.10, do we? If not, then it shouldn't be an issue there.

Labels: -M-56

#11: Guess that explains why I saw the problem only in recent builds. I assume we'll need the fix in R57, but not R56. Is that correct?

Also, I am not sure about 3.10 (or 3.14) - I have not seen the crash there. That doesn't mean much, of course, if the problem was just recently introduced.

Comment 15 by gmx@google.com, Jan 26 2017

Yes, this is only in R57.

Re: older kernels, it is quite possible that the bug fixed by "perf/x86: Fix LBR related crashes on Intel Atom" was exposed by one of the commits to 3.18 that enabled LBR callstack support.

Older kernels don't have LBR callstack support, but they support LBR profiling.

Project Member

Comment 16 by bugdroid1@chromium.org, Jan 26 2017

Labels: merge-merged-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/5ebc9190ef5bf6a4c06c293f66ec320592c78f5f

commit 5ebc9190ef5bf6a4c06c293f66ec320592c78f5f
Author: Andi Kleen <ak@linux.intel.com>
Date: Tue Oct 20 18:46:34 2015

UPSTREAM: perf/x86: Add option to disable reading branch flags/cycles

With LBRv5 reading the extra LBR flags like mispredict, TSX, cycles is
not free anymore, as it has moved to a separate MSR.

For callstack mode we don't need any of this information; so we can
avoid the unnecessary MSR read. Add flags to the perf interface where
perf record can request not collecting this information.

Add branch_sample_type flags for CYCLES and FLAGS. It's a bit unusual
for branch_sample_types to be negative (disable), not positive (enable),
but since the legacy ABI reported the flags we need some form of
explicit disabling to avoid breaking the ABI.

After we have the flags the x86 perf code can keep track if any users
need the flags. If no one needs it, the information is not collected.

This cuts down the cost of LBR callstack on Skylake significantly.
Profiling a kernel build with LBR call stack the average run time of
the PMI handler drops by 43%.

BUG= chromium:685331 
TEST=Build and run

Change-Id: Ibd482059493dd522063c907b170a5bab4454db37
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1445366797-30894-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit b16a5b52eb90)
Reviewed-on: https://chromium-review.googlesource.com/432940
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit fb4a36dec0dd562139f2a983bcccf04369cade07)
Reviewed-on: https://chromium-review.googlesource.com/433340

[modify] https://crrev.com/5ebc9190ef5bf6a4c06c293f66ec320592c78f5f/arch/x86/kernel/cpu/perf_event_intel_lbr.c
[modify] https://crrev.com/5ebc9190ef5bf6a4c06c293f66ec320592c78f5f/include/uapi/linux/perf_event.h

Owner: groeck@chromium.org
Project Member

Comment 18 by bugdroid1@chromium.org, Jan 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/8c2d19165d6da7be00567426a3b891f937499bf2

commit 8c2d19165d6da7be00567426a3b891f937499bf2
Author: Stephane Eranian <eranian@google.com>
Date: Thu Dec 03 22:33:17 2015

UPSTREAM: perf/x86: Fix LBR related crashes on Intel Atom

This patch fixes the LBR kernel crashes on Intel Atom.

The kernel was assuming that if the CPU supports 64-bit format
LBR, then it has an LBR_SELECT MSR. Atom uses 64-bit LBR format
but does not have LBR_SELECT. That was causing NULL pointer
dereferences in a couple of places.

BUG= chromium:685331 
TEST=Build and run

Change-Id: I830effd68372fd1c74bd6a63bd68f44993952de9
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Fixes: 96f3eda67fcf ("perf/x86/intel: Fix static checker warning in lbr enable")
Link: http://lkml.kernel.org/r/1449182000-31524-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 6fc2e83077b0)
Reviewed-on: https://chromium-review.googlesource.com/432941
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 4dfe943811910cc89dddf45613ea87011b5fab67)
Reviewed-on: https://chromium-review.googlesource.com/433356
Reviewed-by: Gabriel Marin <gmx@chromium.org>

[modify] https://crrev.com/8c2d19165d6da7be00567426a3b891f937499bf2/arch/x86/kernel/cpu/perf_event_intel_lbr.c

Labels: Merge-Request-57
Status: Fixed (was: Started)
Project Member

Comment 20 by sheriffbot@chromium.org, Jan 30 2017

Labels: -Merge-Request-57 Hotlist-Merge-Approved Merge-Approved-57
Your change meets the bar and is auto-approved for M57. Please go ahead and merge the CL to branch 2987 manually. Please contact milestone owner if you have questions.
Owners: amineer@(clank), cmasso@(bling), ketakid@(cros), govind@(desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Please merge your change to M57 branch 2987 ASAP. If the merge happens today before 5:00 PM PT, we can take it for tomorrow's last M57 Dev release. Thank you.
Project Member

Comment 22 by bugdroid1@chromium.org, Jan 30 2017

Labels: merge-merged-release-R57-9202.B-chromeos-3.18
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/1afd331ffbec264cb42ff6c140317bc8e2a81278

commit 1afd331ffbec264cb42ff6c140317bc8e2a81278
Author: Kan Liang <kan.liang@intel.com>
Date: Mon Aug 17 12:37:31 2015

UPSTREAM: perf/x86/intel: Fix LBR callstack issue caused by FREEZE_LBRS_ON_PMI

This patch fixes an issue which was introduced by commit
1a78d93750bb5f61abdc59a91fc3bd06a214542a ("perf/x86/intel: Streamline
LBR MSR handling in PMI").

The old patch not only avoids writing LBR_SELECT MSR in PMI, but also
avoids updating lbr_select variable. So in PMI, FREEZE_LBRS_ON_PMI bit
is always mistakenly set for IA32_DEBUGCTLMSR MSR, which causes
superfluous increase/decrease of LBR_TOS when collecting LBR callstack.

BUG= chromium:685331 
TEST=Build and run

Change-Id: I2e719075437318b7e1fef9ae481492f8d6008feb
Reported-by: Milian Wolff <mail@milianw.de>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1439815051-8616-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit deb27519bf1f)
Reviewed-on: https://chromium-review.googlesource.com/432937
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 3a12e288cc4d6d661c9e3126262d50be2e6c4567)
Reviewed-on: https://chromium-review.googlesource.com/434503

[modify] https://crrev.com/1afd331ffbec264cb42ff6c140317bc8e2a81278/arch/x86/kernel/cpu/perf_event_intel_lbr.c

Project Member

Comment 24 by bugdroid1@chromium.org, Jan 30 2017

Labels: merge-merged-release-R57-9202.B-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/219b389b67f175a854984653e23cbdeadde5b795

commit 219b389b67f175a854984653e23cbdeadde5b795
Author: Andi Kleen <ak@linux.intel.com>
Date: Tue Oct 20 18:46:34 2015

UPSTREAM: perf/x86: Add option to disable reading branch flags/cycles

With LBRv5 reading the extra LBR flags like mispredict, TSX, cycles is
not free anymore, as it has moved to a separate MSR.

For callstack mode we don't need any of this information; so we can
avoid the unnecessary MSR read. Add flags to the perf interface where
perf record can request not collecting this information.

Add branch_sample_type flags for CYCLES and FLAGS. It's a bit unusual
for branch_sample_types to be negative (disable), not positive (enable),
but since the legacy ABI reported the flags we need some form of
explicit disabling to avoid breaking the ABI.

After we have the flags the x86 perf code can keep track if any users
need the flags. If no one needs it, the information is not collected.

This cuts down the cost of LBR callstack on Skylake significantly.
Profiling a kernel build with LBR call stack the average run time of
the PMI handler drops by 43%.

BUG= chromium:685331 
TEST=Build and run

Change-Id: Ibd482059493dd522063c907b170a5bab4454db37
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1445366797-30894-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit b16a5b52eb90)
Reviewed-on: https://chromium-review.googlesource.com/432940
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit fb4a36dec0dd562139f2a983bcccf04369cade07)
Reviewed-on: https://chromium-review.googlesource.com/433340
(cherry picked from commit 5ebc9190ef5bf6a4c06c293f66ec320592c78f5f)
Reviewed-on: https://chromium-review.googlesource.com/434505

[modify] https://crrev.com/219b389b67f175a854984653e23cbdeadde5b795/arch/x86/kernel/cpu/perf_event_intel_lbr.c
[modify] https://crrev.com/219b389b67f175a854984653e23cbdeadde5b795/include/uapi/linux/perf_event.h

Project Member

Comment 25 by bugdroid1@chromium.org, Jan 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/8a9f3cab5fb6be45d34305dc55ea549eea1f8fb0

commit 8a9f3cab5fb6be45d34305dc55ea549eea1f8fb0
Author: Kan Liang <kan.liang@intel.com>
Date: Mon Sep 14 14:14:07 2015

UPSTREAM: perf/x86/intel: Fix static checker warning in lbr enable

Commit deb27519bf1f ("perf/x86/intel: Fix LBR callstack issue caused
by FREEZE_LBRS_ON_PMI") leads to the following Smatch complaint:

   warn: variable dereferenced before check 'cpuc->lbr_sel' (see line 154)

Fix the warning.

BUG= chromium:685331 
TEST=Build and run

Change-Id: I6b470e7d1cdaf306cc7a5dd7cc479ccadabbf208
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: deb27519bf1f ("perf/x86/intel: Fix LBR callstack issue caused by FREEZE_LBRS_ON_PMI")
Link: http://lkml.kernel.org/r/1442240047-48149-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 96f3eda67fcf)
Reviewed-on: https://chromium-review.googlesource.com/432938
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 092e2d51cd180bb4205fe35c615f4a9061271110)
Reviewed-on: https://chromium-review.googlesource.com/434506

[modify] https://crrev.com/8a9f3cab5fb6be45d34305dc55ea549eea1f8fb0/arch/x86/kernel/cpu/perf_event_intel_lbr.c

Project Member

Comment 26 by bugdroid1@chromium.org, Jan 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/5ec2545a71bbe2830cd48f58db25755ad34e5079

commit 5ec2545a71bbe2830cd48f58db25755ad34e5079
Author: Andi Kleen <ak@linux.intel.com>
Date: Tue Oct 20 18:46:33 2015

UPSTREAM: perf/x86: Fix LBR call stack save/restore

This fixes a bug I added in the following commit:

  90405aa02247 ("perf/x86/intel/lbr: Limit LBR accesses to TOS in callstack mode")

The bug could lead to lost LBR call stacks. When restoring the LBR state
we need to use the TOS of the previous context, not the current context.
To do that we need to save/restore the TOS.
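
The save/restore idea can be modelled in userspace roughly as follows. This is a simplified sketch, not the kernel code: `lbr_hw_model`, `lbr_task_ctx_model`, `LBR_NR` and the indexing scheme are assumptions chosen for illustration. The point it shows is that restore walks the stack using the TOS stored in the saved context, not whatever TOS the hardware currently holds.

```c
#include <assert.h>
#include <stdint.h>

#define LBR_NR 8u   /* hypothetical stack depth; must be a power of two */

/* Hypothetical model of the LBR registers on one CPU. */
struct lbr_hw_model {
    uint64_t from[LBR_NR];
    uint64_t tos;               /* top-of-stack index */
};

/* Hypothetical model of the per-task saved LBR state. */
struct lbr_task_ctx_model {
    uint64_t from[LBR_NR];
    uint64_t tos;               /* TOS at the time of the save */
};

/* Save the entries below the current TOS, and remember that TOS. */
void lbr_save(const struct lbr_hw_model *hw, struct lbr_task_ctx_model *ctx)
{
    const uint64_t mask = LBR_NR - 1;
    uint64_t tos = hw->tos;

    assert(tos < LBR_NR);
    for (uint64_t i = 0; i < tos; i++)
        ctx->from[i] = hw->from[(tos - i) & mask];
    ctx->tos = tos;
}

/* Restore using the saved TOS, then write that TOS back to the hardware. */
void lbr_restore(struct lbr_hw_model *hw, const struct lbr_task_ctx_model *ctx)
{
    const uint64_t mask = LBR_NR - 1;
    uint64_t tos = ctx->tos;    /* previous context's TOS, not hw->tos */

    assert(tos < LBR_NR);
    for (uint64_t i = 0; i < tos; i++)
        hw->from[(tos - i) & mask] = ctx->from[i];
    hw->tos = tos;
}
```

If restore used `hw->tos` instead, a context saved with three entries and restored onto a CPU whose TOS is 1 would only get one entry back, which matches the "lost LBR call stacks" symptom described above.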

BUG= chromium:685331 
TEST=Build and run

Change-Id: Iaa7ede615fd6fcf5e14467371934d9b1c01eaa37
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1445366797-30894-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit b28ae9560b69)
Reviewed-on: https://chromium-review.googlesource.com/432939
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 4ee1f38a42642c12953489b092daff6c10f5bbf1)
Reviewed-on: https://chromium-review.googlesource.com/434509

[modify] https://crrev.com/5ec2545a71bbe2830cd48f58db25755ad34e5079/arch/x86/kernel/cpu/perf_event_intel_lbr.c
[modify] https://crrev.com/5ec2545a71bbe2830cd48f58db25755ad34e5079/arch/x86/kernel/cpu/perf_event.h

Project Member

Comment 27 by bugdroid1@chromium.org, Jan 30 2017

Labels: merge-merged-release-R57-9202.B-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/fe32d6f55a7d9362972c58972ef30abd34cc9121

commit fe32d6f55a7d9362972c58972ef30abd34cc9121
Author: Stephane Eranian <eranian@google.com>
Date: Thu Dec 03 22:33:17 2015

UPSTREAM: perf/x86: Fix LBR related crashes on Intel Atom

This patch fixes the LBR kernel crashes on Intel Atom.

The kernel was assuming that if the CPU supports 64-bit format
LBR, then it has an LBR_SELECT MSR. Atom uses 64-bit LBR format
but does not have LBR_SELECT. That was causing NULL pointer
dereferences in a couple of places.
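
A rough userspace sketch of the wrong assumption and a safer check is below. All names (`pmu_caps_model`, `lbr_hw_filter`) are hypothetical and the layout is invented for illustration; only the idea is taken from the description above: the LBR record format says nothing about whether an LBR_SELECT map exists, so the map itself has to be tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of per-CPU-model PMU capabilities. */
struct pmu_caps_model {
    bool lbr_64bit_format;        /* LBR records use the 64-bit layout */
    const uint64_t *lbr_sel_map;  /* NULL when there is no LBR_SELECT MSR */
    size_t lbr_sel_map_len;
};

/*
 * Translate a branch-filter index into an LBR_SELECT value.
 * Returns false when the CPU has no LBR_SELECT (or the index is out of
 * range), so the caller can skip hardware filter setup instead of
 * dereferencing a NULL map.
 */
bool lbr_hw_filter(const struct pmu_caps_model *caps, size_t filter_idx,
                   uint64_t *sel_out)
{
    assert(caps != NULL && sel_out != NULL);

    /* Do not infer "has LBR_SELECT" from lbr_64bit_format. */
    if (caps->lbr_sel_map == NULL || filter_idx >= caps->lbr_sel_map_len)
        return false;

    *sel_out = caps->lbr_sel_map[filter_idx];
    return true;
}
```

In this model an Atom-like CPU is described as `lbr_64bit_format = true` with `lbr_sel_map = NULL`, and `lbr_hw_filter` returns false for it rather than crashing.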

BUG= chromium:685331 
TEST=Build and run

Change-Id: I830effd68372fd1c74bd6a63bd68f44993952de9
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Fixes: 96f3eda67fcf ("perf/x86/intel: Fix static checker warning in lbr enable")
Link: http://lkml.kernel.org/r/1449182000-31524-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 6fc2e83077b0)
Reviewed-on: https://chromium-review.googlesource.com/432941
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 4dfe943811910cc89dddf45613ea87011b5fab67)
Reviewed-on: https://chromium-review.googlesource.com/433356
Reviewed-by: Gabriel Marin <gmx@chromium.org>
(cherry picked from commit 8c2d19165d6da7be00567426a3b891f937499bf2)
Reviewed-on: https://chromium-review.googlesource.com/434508

[modify] https://crrev.com/fe32d6f55a7d9362972c58972ef30abd34cc9121/arch/x86/kernel/cpu/perf_event_intel_lbr.c

Project Member

Comment 28 by bugdroid1@chromium.org, Jan 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/daf9876220d0a56ca896627abc8d079dbb675eb5

commit daf9876220d0a56ca896627abc8d079dbb675eb5
Author: Andi Kleen <ak@linux.intel.com>
Date: Tue Oct 20 18:46:34 2015

UPSTREAM: perf/x86: Add option to disable reading branch flags/cycles

With LBRv5, reading the extra LBR flags (mispredict, TSX, cycles) is
no longer free, as they have moved to a separate MSR.

For callstack mode we don't need any of this information; so we can
avoid the unnecessary MSR read. Add flags to the perf interface where
perf record can request not collecting this information.

Add branch_sample_type flags for CYCLES and FLAGS. It's a bit unusual
for branch_sample_types to be negative (disable), not positive (enable),
but since the legacy ABI reported the flags we need some form of
explicit disabling to avoid breaking the ABI.

Once the flags exist, the x86 perf code can keep track of whether any
users need them. If no one needs them, the information is not collected.

This cuts down the cost of LBR callstack on Skylake significantly.
When profiling a kernel build with LBR call stack, the average run time
of the PMI handler drops by 43%.
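
From userspace, a request that opts out of flags and cycles might be set up as in the sketch below. It assumes kernel headers new enough to define `PERF_SAMPLE_BRANCH_NO_FLAGS` and `PERF_SAMPLE_BRANCH_NO_CYCLES`; the helper name `init_callstack_attr` is made up for illustration, and the sketch only fills in the attribute without calling `perf_event_open`.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Fill a perf_event_attr for cycle sampling with an LBR call stack,
 * asking the kernel not to collect branch flags or cycle counts.
 * Hypothetical helper; the event choice is an assumption.
 */
void init_callstack_attr(struct perf_event_attr *attr)
{
    assert(attr != NULL);

    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    attr->sample_type = PERF_SAMPLE_BRANCH_STACK;

    /* The NO_* bits are the "negative" flags described above. */
    attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
                               PERF_SAMPLE_BRANCH_CALL_STACK |
                               PERF_SAMPLE_BRANCH_NO_FLAGS |
                               PERF_SAMPLE_BRANCH_NO_CYCLES;
}
```

A caller that leaves both NO_* bits clear keeps the legacy behavior, which is why the flags are expressed as opt-outs rather than opt-ins.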

BUG= chromium:685331 
TEST=Build and run

Change-Id: Ibd482059493dd522063c907b170a5bab4454db37
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/1445366797-30894-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit b16a5b52eb90)
Reviewed-on: https://chromium-review.googlesource.com/432940
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit fb4a36dec0dd562139f2a983bcccf04369cade07)
Reviewed-on: https://chromium-review.googlesource.com/434510

[modify] https://crrev.com/daf9876220d0a56ca896627abc8d079dbb675eb5/arch/x86/kernel/cpu/perf_event_intel_lbr.c
[modify] https://crrev.com/daf9876220d0a56ca896627abc8d079dbb675eb5/include/uapi/linux/perf_event.h

Labels: -Merge-Approved-57
Merge to R57 complete.

Project Member

Comment 32 by bugdroid1@chromium.org, Jan 30 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/bf5e73b907a0eafb5b0ef625bb7de3d9d5151469

commit bf5e73b907a0eafb5b0ef625bb7de3d9d5151469
Author: Stephane Eranian <eranian@google.com>
Date: Thu Dec 03 22:33:17 2015

UPSTREAM: perf/x86: Fix LBR related crashes on Intel Atom

This patch fixes the LBR kernel crashes on Intel Atom.

The kernel was assuming that if the CPU supports 64-bit format
LBR, then it has an LBR_SELECT MSR. Atom uses 64-bit LBR format
but does not have LBR_SELECT. That was causing NULL pointer
dereferences in a couple of places.

BUG= chromium:685331 
TEST=Build and run

Change-Id: I830effd68372fd1c74bd6a63bd68f44993952de9
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Fixes: 96f3eda67fcf ("perf/x86/intel: Fix static checker warning in lbr enable")
Link: http://lkml.kernel.org/r/1449182000-31524-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 6fc2e83077b0)
Reviewed-on: https://chromium-review.googlesource.com/432941
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 4dfe943811910cc89dddf45613ea87011b5fab67)
Reviewed-on: https://chromium-review.googlesource.com/434512

[modify] https://crrev.com/bf5e73b907a0eafb5b0ef625bb7de3d9d5151469/arch/x86/kernel/cpu/perf_event_intel_lbr.c

Project Member

Comment 33 by bugdroid1@chromium.org, Mar 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/157b36c2ba2538e599f157c22db558b93b77c8e4

commit 157b36c2ba2538e599f157c22db558b93b77c8e4
Author: Andi Kleen <ak@linux.intel.com>
Date: Wed Mar 14 01:59:35 2018

UPSTREAM: perf/x86: Remove warning for zero PEBS status

The recent commit:

  75f80859b130 ("perf/x86/intel/pebs: Robustify PEBS buffer drain")

causes lots of warnings on different CPUs before Skylake
when running PEBS intensive workloads.

They can have a zero status field in the PEBS record when
PEBS is racing with clearing of GLOBAL_STATUS.

This also can cause hangs (it seems there are still
problems with printk in NMI).

Disable the warning, but still ignore the record.

BUG= chromium:685331 ,b:74020465
TEST=build kernel

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1449177740-5422-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 957ea1fdbcdb909e1540f06f06f1a9ce6e696efa)
Signed-off-by: Gabriel Marin <gmx@chromium.org>

Change-Id: Ic73b69ae090d062a677f9fa12a050ecd904bf72d
Reviewed-on: https://chromium-review.googlesource.com/957969
Commit-Ready: Gabriel Marin <gmx@chromium.org>
Tested-by: Gabriel Marin <gmx@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/157b36c2ba2538e599f157c22db558b93b77c8e4/arch/x86/kernel/cpu/perf_event_intel_ds.c

Project Member

Comment 34 by bugdroid1@chromium.org, Mar 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/95d2b4bf220bae1657b5bf1acea924d8448ae1f2

commit 95d2b4bf220bae1657b5bf1acea924d8448ae1f2
Author: Andi Kleen <ak@linux.intel.com>
Date: Wed Mar 14 01:59:37 2018

UPSTREAM: perf/x86: Allow zero PEBS status with only single active event

Normally we drop PEBS events with a zero status field. But when
there is only a single PEBS event active we can assume the
PEBS record is for that event. The PEBS buffer is always flushed
when PEBS events are disabled, so there is no risk of mishandling
stale PEBS records this way.
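
The decision described above can be sketched as a small pure function. This is a simplified userspace model, not the kernel's drain loop: the name `pebs_effective_status` and the way the enable mask is trimmed to the counter bits are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decide which counter bits a PEBS record should be attributed to.
 *   record_status   - status field read from the PEBS record
 *   pebs_enabled    - bitmask of counters with PEBS enabled
 *   max_pebs_events - number of PEBS-capable counters (hypothetical input)
 * Returns 0 when the record should be dropped.
 */
uint64_t pebs_effective_status(uint64_t record_status, uint64_t pebs_enabled,
                               unsigned int max_pebs_events)
{
    assert(max_pebs_events > 0 && max_pebs_events < 64);

    uint64_t counters = (1ULL << max_pebs_events) - 1;
    uint64_t enabled = pebs_enabled & counters;
    uint64_t status = record_status & enabled;

    /*
     * x & (x - 1) clears the lowest set bit, so a zero result for a
     * non-zero x means exactly one bit is set: a single active event.
     */
    if (status == 0 && enabled != 0 && (enabled & (enabled - 1)) == 0)
        status = enabled;

    return status;
}
```

With two or more events enabled, a zero status stays zero and the record is still dropped, since there is no way to tell which event produced it.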

BUG= chromium:685331 ,b:74020465
TEST=Build kernel

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1449177740-5422-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 01330d7288e0050c5aaabc558059ff91589e67cd)
Signed-off-by: Gabriel Marin <gmx@chromium.org>

Change-Id: I695b1c519bb5b8e86bccecb6383dfdc75e4f9472
Reviewed-on: https://chromium-review.googlesource.com/957970
Commit-Ready: Gabriel Marin <gmx@chromium.org>
Tested-by: Gabriel Marin <gmx@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/95d2b4bf220bae1657b5bf1acea924d8448ae1f2/arch/x86/kernel/cpu/perf_event_intel_ds.c

Project Member

Comment 35 by bugdroid1@chromium.org, Mar 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/9660578eac9287fc89bd0277d381cf5f027bc6a0

commit 9660578eac9287fc89bd0277d381cf5f027bc6a0
Author: Andi Kleen <ak@linux.intel.com>
Date: Wed Mar 14 01:59:39 2018

UPSTREAM: perf/x86: Use INST_RETIRED.TOTAL_CYCLES_PS for cycles:pp for Skylake

I added UOPS_RETIRED.ALL by mistake to the Skylake PEBS event list for
cycles:pp. But the event is not documented for Skylake, and has some
issues.

The recommended replacement for cycles:pp is to use
INST_RETIRED.ANY+pebs as a base, similar to what CPUs before Sandy
Bridge did. This new event is called INST_RETIRED.TOTAL_CYCLES_PS. The
event is not really new, but has been already used by perf before
Sandy Bridge for the original cycles:p.

Note the SDM doesn't document that event either, but it's being
documented in the latest version of the event list on:

  https://download.01.org/perfmon/SKL

This patch does:

 - Remove UOPS_RETIRED.ALL from the Skylake PEBS event list

 - Add INST_RETIRED.ANY to the Skylake PEBS event list, and a table entry to
   allow cmask=16,inv=1 for cycles:pp

 - We don't need an extra entry for the base INST_RETIRED event,
   because it is already covered by the catch-all PEBS table entry.

 - Switch Skylake to use the Core2 PEBS alias (which is
   INST_RETIRED.TOTAL_CYCLES_PS)
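
To make the cmask=16,inv=1 trick concrete: with a counter mask of 16 the counter would only tick in cycles that retire at least 16 instructions, which does not happen, and inverting that condition makes it tick in every cycle. The sketch below packs such an event into a raw config value using the standard x86 event-select bit positions (event in bits 0-7, umask in bits 8-15, inv in bit 23, cmask in bits 24-31). The helper name `x86_raw_config` is made up, and using event code 0xc0 for INST_RETIRED is stated here as an assumption rather than taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Pack event-select fields into a raw x86 PMU config value. */
uint64_t x86_raw_config(uint8_t event, uint8_t umask, int inv, uint8_t cmask)
{
    assert(inv == 0 || inv == 1);

    return (uint64_t)event |
           ((uint64_t)umask << 8) |
           ((uint64_t)inv << 23) |
           ((uint64_t)cmask << 24);
}
```

For example, `x86_raw_config(0xc0, 0x00, 1, 16)` yields `0x108000c0`, which is the kind of value that could be passed to perf as a raw event.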

BUG= chromium:685331 ,b:74020465
TEST=Build kernel

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1448929689-13771-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 442f5c74cbeaf54939980397ece59360c0a824e9)
Signed-off-by: Gabriel Marin <gmx@chromium.org>

Change-Id: I7b1a8abe97a00945a93a125359718451f8bc7337
Reviewed-on: https://chromium-review.googlesource.com/957971
Commit-Ready: Gabriel Marin <gmx@chromium.org>
Tested-by: Gabriel Marin <gmx@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/9660578eac9287fc89bd0277d381cf5f027bc6a0/arch/x86/kernel/cpu/perf_event_intel.c
[modify] https://crrev.com/9660578eac9287fc89bd0277d381cf5f027bc6a0/arch/x86/kernel/cpu/perf_event_intel_ds.c

Project Member

Comment 36 by bugdroid1@chromium.org, Mar 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/61ca497cf30579f84f945380ce3ff30b705bc372

commit 61ca497cf30579f84f945380ce3ff30b705bc372
Author: Andi Kleen <ak@linux.intel.com>
Date: Wed Mar 14 01:59:42 2018

UPSTREAM: perf/x86: Use INST_RETIRED.PREC_DIST for cycles:ppp

Add a new 'three-p' precise level, that uses INST_RETIRED.PREC_DIST as
base. The basic mechanism of abusing the inverse cmask to get all
cycles works the same as before.

PREC_DIST is available on Sandy Bridge or later. It had some problems
on Sandy Bridge, so we only use it on IvyBridge and later. I tested it
on Broadwell and Skylake.

PREC_DIST has special support for avoiding shadow effects, which can
give better results compared to UOPS_RETIRED. The drawback is that
PREC_DIST can only schedule on counter 1, but that is ok for cycle
sampling, as there is normally no need to do multiple cycle sampling
runs in parallel. It is still possible to run perf top in parallel, as
that doesn't use precise mode. Also of course the multiplexing can
still allow parallel operation.

:pp stays with the previous event.

Example:

Sample a loop with 10 sqrt with old cycles:pp

	  0.14 10:   sqrtps %xmm1,%xmm0     <--------------
	  9.13       sqrtps %xmm1,%xmm0
	 11.58       sqrtps %xmm1,%xmm0
	 11.51       sqrtps %xmm1,%xmm0
	  6.27       sqrtps %xmm1,%xmm0
	 10.38       sqrtps %xmm1,%xmm0
	 12.20       sqrtps %xmm1,%xmm0
	 12.74       sqrtps %xmm1,%xmm0
	  5.40       sqrtps %xmm1,%xmm0
	 10.14       sqrtps %xmm1,%xmm0
	 10.51      jmp    10

We expect all 10 sqrt to get roughly the same number of samples.

But you can see that the instruction directly after the JMP is
systematically underestimated in the result, due to sampling shadow
effects.

With the new PREC_DIST based sampling this problem is gone and all
instructions show up roughly evenly:

	  9.51 10:   sqrtps %xmm1,%xmm0
	 11.74       sqrtps %xmm1,%xmm0
	 11.84       sqrtps %xmm1,%xmm0
	  6.05       sqrtps %xmm1,%xmm0
	 10.46       sqrtps %xmm1,%xmm0
	 12.25       sqrtps %xmm1,%xmm0
	 12.18       sqrtps %xmm1,%xmm0
	  5.26       sqrtps %xmm1,%xmm0
	 10.13       sqrtps %xmm1,%xmm0
	 10.43       sqrtps %xmm1,%xmm0
	  0.16      jmp    10

Even with PREC_DIST there is still sampling skid and the result is not
completely even, but systematic shadow effects are significantly
reduced.

The improvements are mainly expected to make a difference in high IPC
code. With low IPC it should be similar.
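
For callers that use the syscall interface rather than the perf tool, the p/pp/ppp modifiers correspond to `precise_ip` values 1 to 3 in `perf_event_attr`. A minimal sketch of requesting the new level follows; the helper name `init_precise_cycles_attr` and the sample period are arbitrary choices for illustration, and the sketch only fills in the attribute without opening the event.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Fill a perf_event_attr for cycle sampling at a given precise level.
 * precise_level 2 corresponds to cycles:pp, 3 to cycles:ppp.
 * Hypothetical helper; the sample period is an arbitrary example value.
 */
void init_precise_cycles_attr(struct perf_event_attr *attr,
                              unsigned int precise_level)
{
    assert(attr != NULL);
    assert(precise_level <= 3);   /* precise_ip is a 2-bit field */

    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    attr->sample_period = 100003;
    attr->sample_type = PERF_SAMPLE_IP;
    attr->precise_ip = precise_level;
}
```

Whether level 3 is accepted depends on the CPU and kernel; on hardware without the needed support the open call is expected to fail and a caller would normally fall back to a lower level.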

BUG= chromium:685331 ,b:74020465
TEST=Build kernel

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1448929689-13771-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 724697648eec540b2a7561089b1c87cb33e6a0eb)
Signed-off-by: Gabriel Marin <gmx@chromium.org>

Change-Id: I552b911e2eab77d980d1c87f3a0fa792bed853b2
Reviewed-on: https://chromium-review.googlesource.com/957972
Commit-Ready: Gabriel Marin <gmx@chromium.org>
Tested-by: Gabriel Marin <gmx@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>

[modify] https://crrev.com/61ca497cf30579f84f945380ce3ff30b705bc372/arch/x86/kernel/cpu/perf_event_intel_ds.c
[modify] https://crrev.com/61ca497cf30579f84f945380ce3ff30b705bc372/arch/x86/kernel/cpu/perf_event_intel.c
[modify] https://crrev.com/61ca497cf30579f84f945380ce3ff30b705bc372/arch/x86/kernel/cpu/perf_event.c
[modify] https://crrev.com/61ca497cf30579f84f945380ce3ff30b705bc372/arch/x86/kernel/cpu/perf_event.h

Project Member

Comment 37 by bugdroid1@chromium.org, Mar 14 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/16dad81c709341c9d4cb515931ece299fde318fd

commit 16dad81c709341c9d4cb515931ece299fde318fd
Author: Andi Kleen <ak@linux.intel.com>
Date: Wed Mar 14 01:59:45 2018

BACKPORT: perf/x86: Add model numbers for Kabylake CPUs

Everything the same as Skylake, just new model numbers.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461977748-17616-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit cba1b3798e2c4c094f2079a0d4c1ba4ec2c5a9ac)
Signed-off-by: Gabriel Marin <gmx@chromium.org>

Conflicts:
	arch/x86/events/intel/core.c
	Added changes to arch/x86/kernel/cpu/perf_event_intel.c, which is the
	old location of Intel perf events before code reorganization upstream.

BUG= chromium:685331 ,b:74020465
TEST=Build and run. Verified that collection of LBR traces, and iTLB and dTLB
     misses on Kabylake works.

Change-Id: I5b5751c945ddef1f7da43012b41c4d279234066f
Reviewed-on: https://chromium-review.googlesource.com/957973
Commit-Ready: Gabriel Marin <gmx@chromium.org>
Tested-by: Gabriel Marin <gmx@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/16dad81c709341c9d4cb515931ece299fde318fd/arch/x86/kernel/cpu/perf_event_intel.c

Comment 38 by gmx@chromium.org, Mar 14 2018

Wow, sorry for the noise. I used the wrong crbug number for all these commits, as I think that I was looking at this bug at the time.

I intended it to be https://bugs.chromium.org/p/chromium/issues/detail?id=820661, so I could make a request to cherry pick to R66.
