Issue 887805

Issue metadata

Status: Fixed
Owner:
Closed: Sep 26
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



chromeos-4.14: Crash in arch_jump_label_transform

Project Member Reported by groeck@chromium.org, Sep 21

Issue description

0day reports a crash in early boot.

[    0.000000] tsc: Initial usec timer 807975
[    0.000000] tsc: Detected 2693.484 MHz processor
PANIC: early exception 0x2000e3 IP 10:ffffffff84e3ca29 error 0 cr2 0xffffea00003e0f80
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.68-07298-gcab4d3a #3
[    0.000000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[    0.000000] task: ffffffff88681ac0 task.stack: ffffffff88600000
[    0.000000] RIP: 0010:text_poke+0xb9/0x420
[    0.000000] RSP: 0000:ffffffff88607bd8 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
[    0.000000] RAX: 0000000000000000 RBX: ffffea00003e0f80 RCX: ffffffff84e3ca29
[    0.000000] RDX: dffffc0000000000 RSI: ffffffff88607c58 RDI: ffffea00003e0f80
[    0.000000] RBP: ffffffff88607c28 R08: fffffbfff0e83aa6 R09: 000000000000384d
[    0.000000] R10: fffffbfff0e83aa5 R11: ffffffff8741d528 R12: ffffffff84e3e9f3
[    0.000000] R13: ffffffff88607d00 R14: 0000000000000005 R15: ffffffff84e3f9f3
[    0.000000] FS:  0000000000000000(0000) GS:ffffffff8a3a3000(0000) knlGS:0000000000000000
[    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.000000] CR2: ffffea00003e0f80 CR3: 000000001508e000 CR4: 00000000000406a0
[    0.000000] Call Trace:
[    0.000000]  ? native_sched_clock+0x63/0x130
[    0.000000]  ? native_sched_clock+0x64/0x130
[    0.000000]  text_poke_bp+0x9c/0x170
[    0.000000]  ? poke_int3_handler+0x90/0x90
[    0.000000]  ? __ww_mutex_wakeup_for_backoff+0x1b0/0x1b0
[    0.000000]  ? native_sched_clock+0x63/0x130
[    0.000000]  __jump_label_transform+0x29a/0x2e0
[    0.000000]  ? bug_at+0x50/0x50
[    0.000000]  ? debug_show_all_locks+0x2b0/0x2b0
[    0.000000]  ? vprintk_default+0x22/0x30
[    0.000000]  ? clocks_calc_mult_shift+0xc7/0xe0
[    0.000000]  arch_jump_label_transform+0x3f/0x60
[    0.000000]  jump_label_update+0x110/0x140
[    0.000000]  ? 0xffffffff84e00000
[    0.000000]  static_key_enable_cpuslocked+0xd1/0x110
[    0.000000]  static_key_enable+0x25/0x40
[    0.000000]  tsc_early_init+0xa0/0xaa
[    0.000000]  setup_arch+0x68a/0x1054
[    0.000000]  ? cgroup_init_early+0x170/0x23a
[    0.000000]  start_kernel+0xdc/0x8e3
[    0.000000]  ? thread_stack_cache_init+0xd/0xd
[    0.000000]  x86_64_start_reservations+0x46/0x4f
[    0.000000]  x86_64_start_kernel+0xd0/0xda
[    0.000000]  secondary_startup_64+0xa5/0xb0
[    0.000000] Code: 3d 99 05 01 48 8b 05 07 76 83 03 48 01 c3 48 b8 00 00 00 00 00 ea ff ff 48 c1 eb 0c 48 c1 e3 06 48 01 c3 48 89 df e8 97 dc 36 00 <48> 8b 03 f6 c4 08 75 12 48 83 05 af 3d 99 05 01 0f 0b 48 83 05 
BUG: kernel hang in boot stage
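
A note on the faulting address: read as x86-64, the Code: bytes before the faulting instruction appear to load the constant 0xffffea0000000000, shift a value right by 12, shift it left by 6, and add the two. That matches the usual address-to-struct-page arithmetic, so CR2 looks like the address of a struct page rather than of kernel text. The sketch below redoes that arithmetic in userspace. The base address, the 4 KiB page size, and the 64-byte struct page size are assumptions taken from those instruction bytes, and all MODEL_/model_ names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the "48 b8 00 00 00 00 00 ea ff ff" bytes in the Code: line. */
#define MODEL_VMEMMAP_BASE      0xffffea0000000000ULL
/* Assumed from the "shr $12" and "shl $6" pair: 4 KiB pages, 64-byte struct page. */
#define MODEL_PAGE_SHIFT        12
#define MODEL_STRUCT_PAGE_SHIFT 6

/* Address of the (modeled) struct page that describes page frame pfn. */
static uint64_t model_pfn_to_page_addr(uint64_t pfn)
{
    return MODEL_VMEMMAP_BASE + (pfn << MODEL_STRUCT_PAGE_SHIFT);
}

/* Inverse: which page frame does a struct page address describe? */
static uint64_t model_page_addr_to_pfn(uint64_t page_addr)
{
    assert(page_addr >= MODEL_VMEMMAP_BASE);
    return (page_addr - MODEL_VMEMMAP_BASE) >> MODEL_STRUCT_PAGE_SHIFT;
}

/* Physical base address of a page frame. */
static uint64_t model_pfn_to_phys(uint64_t pfn)
{
    return pfn << MODEL_PAGE_SHIFT;
}
```

Feeding the CR2 value from the log into model_page_addr_to_pfn() gives the page frame whose struct page was being read when the exception hit. That is consistent with text_poke() touching struct page data this early in boot.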


                                                          # HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 72b2b3951cedd938caf97bf4d0dfc4b7f0c5e096 v4.14 --
git bisect good 07f863f4a402f46e5347f5049ea314f2e45dc920  # 11:13  G     11     0    5   5  UPSTREAM: net: qualcomm: rmnet: Process packets over ethernet
git bisect good b5b35e035ff8e9609fac938db2c0b49fd2bbfa86  # 12:06  G     10     0    2   2  UPSTREAM: iommu: Clean up the comments for iommu_group_alloc
git bisect good 32f99835ef6b14616b8b19ad026aecdf08274c89  # 12:22  G     11     0    6   9  UPSTREAM: drm/amd/display: Move MAX_TMDS_CLOCK define to header
git bisect good 9763c7e7bae8e625107cffe7d05ecf942064b774  # 12:35  G     11     0    3   4  CHROMIUM: virtio/wl: Fix a missing mutex unlock in error path
git bisect good ac21102ae8c42d0485fe5d27048154df8dc6e79d  # 13:25  G     11     0    1   1  UPSTREAM: drm/msm/disp/dpu: fix early dereference of physical encoder
git bisect good cfa82bd37299c7b15743c3aba66e812fe9af7ed4  # 14:19  G     11     0    3   3  BACKPORT: treewide: Use struct_size() for devm_kmalloc() and friends
git bisect good 0aaac53dc61f3bb4b35d3246e96127fab93481da  # 14:36  G     10     0    3   9  Revert "CHROMIUM: config: disable VCE block in amdgpu"
git bisect good 8324e66619b0bbf2b14bb633b9f9d244c8823a2c  # 14:47  G     11     0   11  15  UPSTREAM: x86/tsc: Split native_calibrate_cpu() into early and late parts
git bisect  bad 09d46f32262c3972f0c1477851690e3471f8aa28  # 15:04  B      1    10    1   8  CHROMIUM: drm/i915: GLK will support minimum cdclk as 158.4 to enable Audio.
git bisect  bad 536baf621bc82d256743beb515a019d0df979b8e  # 15:20  B      1     9    1   6  UPSTREAM: x86/tsc: Consolidate init code
git bisect good d07afa8554be2a200670978813e1c93f0e763ccf  # 15:34  G     11     0   11  29  UPSTREAM: x86/tsc: Make CONFIG_X86_TSC=n build work again
git bisect  bad cab4d3a04f824ba391a02b3fc9b9d15c97e710e2  # 15:49  B      0    11   27   2  BACKPORT: x86/jump_label: Initialize static branching early
# first bad commit: [cab4d3a04f824ba391a02b3fc9b9d15c97e710e2] BACKPORT: x86/jump_label: Initialize static branching early
git bisect good d07afa8554be2a200670978813e1c93f0e763ccf  # 15:54  G     33     0   33  62  UPSTREAM: x86/tsc: Make CONFIG_X86_TSC=n build work again
# extra tests with debug options
git bisect  bad cab4d3a04f824ba391a02b3fc9b9d15c97e710e2  # 16:12  B      0    11   25   0  BACKPORT: x86/jump_label: Initialize static branching early
# extra tests on HEAD of internal-chrome-os/chromeos-4.14
git bisect  bad 6fcadeebf611353ebd052890290ded43f0329098  # 16:13  B      0    13   30   0  CHROMIUM: x86: x86_64_arcvm_defconfig: Enable esdfs.
# extra tests on tree/branch chrome-os/chromeos-4.14
git bisect  bad 78ff58b31e485a48f276289cb2df433ef951c96c  # 22:32  B      0     2   17   1  CHROMIUM: stack chromiumos LSM before other LSMs
# extra tests with first bad commit reverted
git bisect good 9b7d235d6cd8dce32e708a269e36f745f78293df  # 09:40  G     11     0   11  11  Revert "BACKPORT: x86/jump_label: Initialize static branching early"

Analysis suggests that a number of jump label related patches are missing in chromeos-4.14. Commit 6fffacb30349e09 ("x86/alternatives, jumplabel: Use text_poke_early() before mm_init()") is probably the most important patch that needs to be applied.

The problem is only seen with CONFIG_JUMP_LABEL=y.

 
Project Member

Comment 1 by bugdroid1@chromium.org, Sep 22

Labels: merge-merged-chromeos-4.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/c2999b34c951697169fe5697ecf4d12410834448

commit c2999b34c951697169fe5697ecf4d12410834448
Author: Borislav Petkov <bp@suse.de>
Date: Sat Sep 22 15:29:53 2018

UPSTREAM: locking/static_keys: Improve uninitialized key warning

Right now it says:

  static_key_disable_cpuslocked used before call to jump_label_init
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:161 static_key_disable_cpuslocked+0x68/0x70
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.0-rc5+ #1
  Hardware name: SGI.COM C2112-4GP3/X10DRT-P-Series, BIOS 2.0a 05/09/2016
  task: ffffffff81c0e480 task.stack: ffffffff81c00000
  RIP: 0010:static_key_disable_cpuslocked+0x68/0x70
  RSP: 0000:ffffffff81c03ef0 EFLAGS: 00010096 ORIG_RAX: 0000000000000000
  RAX: 0000000000000041 RBX: ffffffff81c32680 RCX: ffffffff81c5cbf8
  RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000002
  RBP: ffff88807fffd240 R08: 726f666562206465 R09: 0000000000000136
  R10: 0000000000000000 R11: 696e695f6c656261 R12: ffffffff82158900
  R13: ffffffff8215f760 R14: 0000000000000001 R15: 0000000000000008
  FS:  0000000000000000(0000) GS:ffff883f7f400000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff88807ffff000 CR3: 0000000001c09000 CR4: 00000000000606b0
  Call Trace:
   static_key_disable+0x16/0x20
   start_kernel+0x15a/0x45d
   ? load_ucode_intel_bsp+0x11/0x2d
   secondary_startup_64+0xa5/0xb0
  Code: 48 c7 c7 a0 15 cf 81 e9 47 53 4b 00 48 89 df e8 5f fc ff ff eb e8 48 c7 c6 \
	c0 97 83 81 48 c7 c7 d0 ff a2 81 31 c0 e8 c5 9d f5 ff <0f> ff eb a7 0f ff eb \
	b0 e8 eb a2 4b 00 53 48 89 fb e8 42 0e f0

but it doesn't tell me which key it is. So dump the key's name too:

  static_key_disable_cpuslocked(): static key 'virt_spin_lock_key' used before call to jump_label_init()

And that makes pinpointing which key is causing that a lot easier.
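
For readers unfamiliar with the idea, here is a minimal userspace sketch of a "used before init" check that reports the key's name. It is not the kernel code: every model_/MODEL_ name is hypothetical, and it gets the name through preprocessor stringification, which is simply the easiest userspace equivalent. How the kernel resolves the name is not shown in the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for "jump_label_init() has already run". */
static bool model_keys_initialized = false;

/* Hypothetical key type; the kernel's struct static_key looks different. */
struct model_key { int enabled; };

/*
 * Write the warning text into buf if a key is used too early.
 * Returns the message length, or 0 when no warning is needed.
 */
static int model_check_use(const char *func, const char *key_name,
                           char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    if (model_keys_initialized)
        return 0;
    return snprintf(buf, len,
                    "%s(): static key '%s' used before call to jump_label_init()",
                    func, key_name);
}

/* #key turns the macro argument into a string, so the message names the key. */
#define MODEL_CHECK_USE(key, buf, len) \
    model_check_use(__func__, #key, (buf), (len))

static struct model_key virt_spin_lock_key = { 1 };

/* Disable the example key, reporting whether a warning was produced. */
static int model_key_disable(char *buf, size_t len)
{
    int warned = MODEL_CHECK_USE(virt_spin_lock_key, buf, len);
    virt_spin_lock_key.enabled = 0;
    return warned;
}

/* True if the warning text mentions the given key name. */
static bool model_message_names_key(const char *msg, const char *key_name)
{
    return strstr(msg, key_name) != NULL;
}
```

Calling model_key_disable() while model_keys_initialized is false fills the buffer with a message that names both the calling function and the key. Once the flag is set, the buffer stays empty.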

 include/linux/jump_label.h           |   14 +++++++-------
 include/linux/jump_label_ratelimit.h |    6 +++---
 kernel/jump_label.c                  |   14 +++++++-------
 3 files changed, 17 insertions(+), 17 deletions(-)

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171018152428.ffjgak4o25f7ept6@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 5cdda5117e125e0dbb020425cc55a4c143c6febc)

BUG= chromium:887805 
TEST=Run image with JUMPLABEL enabled

Change-Id: I09710b9c899488638ae921e02f016406b8b54676
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1238713
Reviewed-by: Yu Zhao <yuzhao@chromium.org>

[modify] https://crrev.com/c2999b34c951697169fe5697ecf4d12410834448/include/linux/jump_label.h
[modify] https://crrev.com/c2999b34c951697169fe5697ecf4d12410834448/kernel/jump_label.c
[modify] https://crrev.com/c2999b34c951697169fe5697ecf4d12410834448/include/linux/jump_label_ratelimit.h

Project Member

Comment 2 by bugdroid1@chromium.org, Sep 22

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/164f25d82f0fcd5ae2f600411d2a21d929157df3

commit 164f25d82f0fcd5ae2f600411d2a21d929157df3
Author: Peter Zijlstra <peterz@infradead.org>
Date: Sat Sep 22 15:29:55 2018

UPSTREAM: sched/core: Fix cpu.max vs. cpuhotplug deadlock

Tejun reported the following cpu-hotplug lock (percpu-rwsem) read recursion:

  tg_set_cfs_bandwidth()
    get_online_cpus()
      cpus_read_lock()

    cfs_bandwidth_usage_inc()
      static_key_slow_inc()
        cpus_read_lock()
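
The commit text quoted here stops at the call chain and does not describe the fix. Read recursion like this is a problem when a writer queues up between the two acquisitions: the inner read then waits for the writer, and the writer waits for the outer read. The sketch below models one common way to avoid it, a variant of the inner function that expects the caller to already hold the lock. Whether this commit does exactly that is an assumption. The "_cpuslocked" suffix only echoes names visible in the crash trace above, and all model_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a non-recursive read lock: depth > 1 means recursion. */
static int model_cpus_read_depth;
static bool model_recursion_seen;

static void model_cpus_read_lock(void)
{
    if (++model_cpus_read_depth > 1)
        model_recursion_seen = true;
}

static void model_cpus_read_unlock(void)
{
    assert(model_cpus_read_depth > 0);
    model_cpus_read_depth--;
}

static int model_key_count;

/* Takes the lock itself, so it must not be called with the lock held. */
static void model_static_key_slow_inc(void)
{
    model_cpus_read_lock();
    model_key_count++;
    model_cpus_read_unlock();
}

/* Expects the caller to hold the lock and checks that it does. */
static void model_static_key_slow_inc_cpuslocked(void)
{
    assert(model_cpus_read_depth > 0);
    model_key_count++;
}

/* Same shape as the reported chain: lock, then a callee that locks again. */
static void model_set_bandwidth_recursive(void)
{
    model_cpus_read_lock();
    model_static_key_slow_inc();
    model_cpus_read_unlock();
}

/* Lock once, then use the variant that relies on the caller's lock. */
static void model_set_bandwidth_cpuslocked(void)
{
    model_cpus_read_lock();
    model_static_key_slow_inc_cpuslocked();
    model_cpus_read_unlock();
}
```

In this model only model_set_bandwidth_recursive() sets model_recursion_seen. The cpuslocked path increments the key with a single lock acquisition.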

Reported-by: Tejun Heo <tj@kernel.org>
Tested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180122215328.GP3397@worktop
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit ce48c146495a1a50e48cdbfbfaba3e708be7c07c)

BUG= chromium:887805 
TEST=Run image with JUMPLABEL enabled

Change-Id: Ifba7fad0480d97e03ec28f8946df9b791861e8a2
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1238714
Reviewed-by: Yu Zhao <yuzhao@chromium.org>

[modify] https://crrev.com/164f25d82f0fcd5ae2f600411d2a21d929157df3/include/linux/jump_label.h
[modify] https://crrev.com/164f25d82f0fcd5ae2f600411d2a21d929157df3/kernel/sched/fair.c
[modify] https://crrev.com/164f25d82f0fcd5ae2f600411d2a21d929157df3/kernel/jump_label.c

Project Member

Comment 3 by bugdroid1@chromium.org, Sep 22

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/b6a8c30c732177cc73a35b10fff2be08ff76e7e6

commit b6a8c30c732177cc73a35b10fff2be08ff76e7e6
Author: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Sat Sep 22 15:29:56 2018

UPSTREAM: jump_label: Explicitly disable jump labels in __init code

After initmem has been freed, any jump labels in __init code are
prevented from being written to by the kernel_text_address() check in
__jump_label_update().  However, this check is quite broad.  If
kernel_text_address() were to return false for any other reason, the
jump label write would fail silently with no warning.

For jump labels in module init code, entry->code is set to zero to
indicate that the entry is disabled.  Do the same thing for core kernel
init code.  This makes the behavior more consistent, and will also make
it more straightforward to detect non-init jump label write failures in
the next patch.
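
The "code set to zero means disabled" convention described above can be sketched as follows. This is a simplified userspace model, not the kernel data structure: struct model_jump_entry, struct model_init_range, and both functions are hypothetical, and the addresses are plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified jump table entry. */
struct model_jump_entry {
    uintptr_t code;    /* address of the patch site, 0 means disabled */
    int patched;       /* how often the model "wrote" this site */
};

/* Hypothetical bounds of the init text region: [start, end). */
struct model_init_range {
    uintptr_t start;
    uintptr_t end;
};

/* Explicitly disable every entry whose patch site lies in init text. */
static void model_invalidate_init(struct model_jump_entry *entries, size_t n,
                                  struct model_init_range init)
{
    assert(init.start <= init.end);
    for (size_t i = 0; i < n; i++) {
        if (entries[i].code >= init.start && entries[i].code < init.end)
            entries[i].code = 0;
    }
}

/* "Patch" all enabled entries and return how many were written. */
static size_t model_update(struct model_jump_entry *entries, size_t n)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (entries[i].code == 0)
            continue;   /* disabled on purpose, not a silent failure */
        entries[i].patched++;
        written++;
    }
    return written;
}
```

With the disabled state recorded in the entry itself, the update loop no longer needs a broad address check to skip init code. A write that fails for any other reason can then be treated as an error.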

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c52825c73f3a174e8398b6898284ec20d4deb126.1519051220.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 33352244706369ea6736781ae41fe41692eb69bb)

BUG= chromium:887805 
TEST=Run image with JUMPLABEL enabled

Change-Id: I9db4d48a7ff6efc492a9738872cc76a608337844
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1238715
Reviewed-by: Yu Zhao <yuzhao@chromium.org>

[modify] https://crrev.com/b6a8c30c732177cc73a35b10fff2be08ff76e7e6/init/main.c
[modify] https://crrev.com/b6a8c30c732177cc73a35b10fff2be08ff76e7e6/include/linux/jump_label.h
[modify] https://crrev.com/b6a8c30c732177cc73a35b10fff2be08ff76e7e6/kernel/jump_label.c

Project Member

Comment 4 by bugdroid1@chromium.org, Sep 22

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/15a57cc54327f00dbaf3db694f602f3d5424f544

commit 15a57cc54327f00dbaf3db694f602f3d5424f544
Author: Pavel Tatashin <pasha.tatashin@oracle.com>
Date: Sat Sep 22 15:29:58 2018

UPSTREAM: x86/alternatives, jumplabel: Use text_poke_early() before mm_init()

It is supposed to be safe to modify static branches after jump_label_init().
But because the static key modifying code eventually calls text_poke(), it
can end up accessing a struct page which has not been initialized yet.

Here is how to quickly reproduce the problem. Insert code like this
into init/main.c:

| +static DEFINE_STATIC_KEY_FALSE(__test);
| asmlinkage __visible void __init start_kernel(void)
| {
|        char *command_line;
|@@ -587,6 +609,10 @@ asmlinkage __visible void __init start_kernel(void)
|        vfs_caches_init_early();
|        sort_main_extable();
|        trap_init();
|+       {
|+       static_branch_enable(&__test);
|+       WARN_ON(!static_branch_likely(&__test));
|+       }
|        mm_init();

The following warnings show up:
WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:701 text_poke+0x20d/0x230
RIP: 0010:text_poke+0x20d/0x230
Call Trace:
 ? text_poke_bp+0x50/0xda
 ? arch_jump_label_transform+0x89/0xe0
 ? __jump_label_update+0x78/0xb0
 ? static_key_enable_cpuslocked+0x4d/0x80
 ? static_key_enable+0x11/0x20
 ? start_kernel+0x23e/0x4c8
 ? secondary_startup_64+0xa5/0xb0

---[ end trace abdc99c031b8a90a ]---

If the code above is moved after mm_init(), no warning is shown, as struct
pages are initialized during handover from memblock.

Use text_poke_early() in static branching until early boot IRQs are enabled
and from there switch to text_poke(). Also, ensure text_poke() is never
invoked when uninitialized memory access may happen by adding a
!after_bootmem assertion.
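
The selection logic described in the paragraph above can be modeled in userspace roughly as follows. This is a sketch, not the patch itself. The two flags mirror the names used in the text, the "poke" functions just copy bytes into a buffer, and every model_/MODEL_ identifier is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical boot-state flags, named after the conditions in the text. */
static bool model_early_boot_irqs_disabled = true;
static bool model_after_bootmem = false;

enum model_poker { MODEL_POKE_NONE, MODEL_POKE_EARLY, MODEL_POKE_NORMAL };
static enum model_poker model_last_poker = MODEL_POKE_NONE;

/* Early variant: writes the bytes directly, needs no struct page. */
static void model_text_poke_early(unsigned char *addr,
                                  const unsigned char *insn, size_t len)
{
    memcpy(addr, insn, len);
    model_last_poker = MODEL_POKE_EARLY;
}

/* Normal variant: only valid once struct pages have been initialized. */
static void model_text_poke(unsigned char *addr,
                            const unsigned char *insn, size_t len)
{
    assert(model_after_bootmem);   /* models the added assertion */
    memcpy(addr, insn, len);
    model_last_poker = MODEL_POKE_NORMAL;
}

/* Pick the poke routine according to how far boot has progressed. */
static void model_jump_label_transform(unsigned char *addr,
                                       const unsigned char *insn, size_t len)
{
    if (model_early_boot_irqs_disabled)
        model_text_poke_early(addr, insn, len);
    else
        model_text_poke(addr, insn, len);
}
```

Patching a 5-byte site while model_early_boot_irqs_disabled is set goes through the early routine. Once both flags are flipped, the same call goes through the normal one. Calling model_text_poke() directly before model_after_bootmem is set trips the assertion, which is the kind of misuse the added check is meant to catch.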

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-9-pasha.tatashin@oracle.com
(cherry picked from commit 6fffacb30349e0903602d664f7ab6fc87e85162e)

BUG= chromium:887805 
TEST=Run image with JUMPLABEL enabled

Change-Id: I04163a4b348d79a7c7c6811201a37b55d60f7078
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1238716
Reviewed-by: Yu Zhao <yuzhao@chromium.org>

[modify] https://crrev.com/15a57cc54327f00dbaf3db694f602f3d5424f544/arch/x86/include/asm/text-patching.h
[modify] https://crrev.com/15a57cc54327f00dbaf3db694f602f3d5424f544/arch/x86/kernel/jump_label.c
[modify] https://crrev.com/15a57cc54327f00dbaf3db694f602f3d5424f544/arch/x86/kernel/alternative.c

Status: Started (was: Assigned)
Status: Fixed (was: Started)
