chromeos-4.14: Crash in arch_jump_label_transform
Issue description
0day reports a crash in early boot.
[ 0.000000] tsc: Initial usec timer 807975
[ 0.000000] tsc: Detected 2693.484 MHz processor
PANIC: early exception 0x2000e3 IP 10:ffffffff84e3ca29 error 0 cr2 0xffffea00003e0f80
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.68-07298-gcab4d3a #3
[ 0.000000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.000000] task: ffffffff88681ac0 task.stack: ffffffff88600000
[ 0.000000] RIP: 0010:text_poke+0xb9/0x420
[ 0.000000] RSP: 0000:ffffffff88607bd8 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
[ 0.000000] RAX: 0000000000000000 RBX: ffffea00003e0f80 RCX: ffffffff84e3ca29
[ 0.000000] RDX: dffffc0000000000 RSI: ffffffff88607c58 RDI: ffffea00003e0f80
[ 0.000000] RBP: ffffffff88607c28 R08: fffffbfff0e83aa6 R09: 000000000000384d
[ 0.000000] R10: fffffbfff0e83aa5 R11: ffffffff8741d528 R12: ffffffff84e3e9f3
[ 0.000000] R13: ffffffff88607d00 R14: 0000000000000005 R15: ffffffff84e3f9f3
[ 0.000000] FS: 0000000000000000(0000) GS:ffffffff8a3a3000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.000000] CR2: ffffea00003e0f80 CR3: 000000001508e000 CR4: 00000000000406a0
[ 0.000000] Call Trace:
[ 0.000000] ? native_sched_clock+0x63/0x130
[ 0.000000] ? native_sched_clock+0x64/0x130
[ 0.000000] text_poke_bp+0x9c/0x170
[ 0.000000] ? poke_int3_handler+0x90/0x90
[ 0.000000] ? __ww_mutex_wakeup_for_backoff+0x1b0/0x1b0
[ 0.000000] ? native_sched_clock+0x63/0x130
[ 0.000000] __jump_label_transform+0x29a/0x2e0
[ 0.000000] ? bug_at+0x50/0x50
[ 0.000000] ? debug_show_all_locks+0x2b0/0x2b0
[ 0.000000] ? vprintk_default+0x22/0x30
[ 0.000000] ? clocks_calc_mult_shift+0xc7/0xe0
[ 0.000000] arch_jump_label_transform+0x3f/0x60
[ 0.000000] jump_label_update+0x110/0x140
[ 0.000000] ? 0xffffffff84e00000
[ 0.000000] static_key_enable_cpuslocked+0xd1/0x110
[ 0.000000] static_key_enable+0x25/0x40
[ 0.000000] tsc_early_init+0xa0/0xaa
[ 0.000000] setup_arch+0x68a/0x1054
[ 0.000000] ? cgroup_init_early+0x170/0x23a
[ 0.000000] start_kernel+0xdc/0x8e3
[ 0.000000] ? thread_stack_cache_init+0xd/0xd
[ 0.000000] x86_64_start_reservations+0x46/0x4f
[ 0.000000] x86_64_start_kernel+0xd0/0xda
[ 0.000000] secondary_startup_64+0xa5/0xb0
[ 0.000000] Code: 3d 99 05 01 48 8b 05 07 76 83 03 48 01 c3 48 b8 00 00 00 00 00 ea ff ff 48 c1 eb 0c 48 c1 e3 06 48 01 c3 48 89 df e8 97 dc 36 00 <48> 8b 03 f6 c4 08 75 12 48 83 05 af 3d 99 05 01 0f 0b 48 83 05
BUG: kernel hang in boot stage
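The faulting address in CR2 can be decoded by hand. The sketch below assumes the `Code:` bytes before the faulting instruction are a virt_to_page()-style computation (`movabs $0xffffea0000000000`, `shr $12`, `shl $6`, `add`), so that CR2 is the address of a 64-byte struct page entry. That reading is an interpretation of the dump, not something the report states, and the helper name `fault_to_pfn` is made up for illustration.

```python
# Constants read off the "Code:" bytes of the oops (interpretation, see lead-in).
VMEMMAP_BASE = 0xFFFFEA0000000000  # immediate of the movabs instruction
STRUCT_PAGE_SHIFT = 6              # "shl $6": assumes a 64-byte struct page
PAGE_SHIFT = 12                    # "shr $12": 4 KiB pages


def fault_to_pfn(cr2):
    """Map an address inside the assumed struct page array back to its PFN."""
    offset = cr2 - VMEMMAP_BASE
    if offset < 0 or offset % (1 << STRUCT_PAGE_SHIFT) != 0:
        raise ValueError("address is not a struct page slot under these assumptions")
    return offset >> STRUCT_PAGE_SHIFT


cr2 = 0xFFFFEA00003E0F80           # CR2 (and RBX/RDI) from the register dump
pfn = fault_to_pfn(cr2)
print(hex(pfn), hex(pfn << PAGE_SHIFT))
```

Under these assumptions text_poke() faulted while reading the struct page for one physical page frame, which fits a struct page being touched before it is usable.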
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 72b2b3951cedd938caf97bf4d0dfc4b7f0c5e096 v4.14 --
git bisect good 07f863f4a402f46e5347f5049ea314f2e45dc920 # 11:13 G 11 0 5 5 UPSTREAM: net: qualcomm: rmnet: Process packets over ethernet
git bisect good b5b35e035ff8e9609fac938db2c0b49fd2bbfa86 # 12:06 G 10 0 2 2 UPSTREAM: iommu: Clean up the comments for iommu_group_alloc
git bisect good 32f99835ef6b14616b8b19ad026aecdf08274c89 # 12:22 G 11 0 6 9 UPSTREAM: drm/amd/display: Move MAX_TMDS_CLOCK define to header
git bisect good 9763c7e7bae8e625107cffe7d05ecf942064b774 # 12:35 G 11 0 3 4 CHROMIUM: virtio/wl: Fix a missing mutex unlock in error path
git bisect good ac21102ae8c42d0485fe5d27048154df8dc6e79d # 13:25 G 11 0 1 1 UPSTREAM: drm/msm/disp/dpu: fix early dereference of physical encoder
git bisect good cfa82bd37299c7b15743c3aba66e812fe9af7ed4 # 14:19 G 11 0 3 3 BACKPORT: treewide: Use struct_size() for devm_kmalloc() and friends
git bisect good 0aaac53dc61f3bb4b35d3246e96127fab93481da # 14:36 G 10 0 3 9 Revert "CHROMIUM: config: disable VCE block in amdgpu"
git bisect good 8324e66619b0bbf2b14bb633b9f9d244c8823a2c # 14:47 G 11 0 11 15 UPSTREAM: x86/tsc: Split native_calibrate_cpu() into early and late parts
git bisect bad 09d46f32262c3972f0c1477851690e3471f8aa28 # 15:04 B 1 10 1 8 CHROMIUM: drm/i915: GLK will support minimum cdclk as 158.4 to enable Audio.
git bisect bad 536baf621bc82d256743beb515a019d0df979b8e # 15:20 B 1 9 1 6 UPSTREAM: x86/tsc: Consolidate init code
git bisect good d07afa8554be2a200670978813e1c93f0e763ccf # 15:34 G 11 0 11 29 UPSTREAM: x86/tsc: Make CONFIG_X86_TSC=n build work again
git bisect bad cab4d3a04f824ba391a02b3fc9b9d15c97e710e2 # 15:49 B 0 11 27 2 BACKPORT: x86/jump_label: Initialize static branching early
# first bad commit: [cab4d3a04f824ba391a02b3fc9b9d15c97e710e2] BACKPORT: x86/jump_label: Initialize static branching early
git bisect good d07afa8554be2a200670978813e1c93f0e763ccf # 15:54 G 33 0 33 62 UPSTREAM: x86/tsc: Make CONFIG_X86_TSC=n build work again
# extra tests with debug options
git bisect bad cab4d3a04f824ba391a02b3fc9b9d15c97e710e2 # 16:12 B 0 11 25 0 BACKPORT: x86/jump_label: Initialize static branching early
# extra tests on HEAD of internal-chrome-os/chromeos-4.14
git bisect bad 6fcadeebf611353ebd052890290ded43f0329098 # 16:13 B 0 13 30 0 CHROMIUM: x86: x86_64_arcvm_defconfig: Enable esdfs.
# extra tests on tree/branch chrome-os/chromeos-4.14
git bisect bad 78ff58b31e485a48f276289cb2df433ef951c96c # 22:32 B 0 2 17 1 CHROMIUM: stack chromiumos LSM before other LSMs
# extra tests with first bad commit reverted
git bisect good 9b7d235d6cd8dce32e708a269e36f745f78293df # 09:40 G 11 0 11 11 Revert "BACKPORT: x86/jump_label: Initialize static branching early"
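The bisect rows are easier to compare once split into columns. The following is a minimal sketch that parses them according to the header line of the log (HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD); what exactly the 0day robot counts in each column is assumed from the column names, and `parse_bisect_line` is a hypothetical helper, not part of any tool.

```python
import re

# One row: git bisect <verdict> <sha> # HH:MM <G|B> <4 counters> <subject>
BISECT_LINE = re.compile(
    r"^git bisect (good|bad) ([0-9a-f]{40})\s+#\s+(\d\d:\d\d)\s+([GB])"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*)$"
)


def parse_bisect_line(line):
    """Return a dict for a 'git bisect good|bad' row, or None for other lines."""
    match = BISECT_LINE.match(line.strip())
    if match is None:
        return None
    verdict, sha, when, _result, good, bad, good_dirty, dirty_not_bad, subject = match.groups()
    return {
        "verdict": verdict,
        "sha": sha,
        "time": when,
        "good": int(good),
        "bad": int(bad),
        "good_but_dirty": int(good_dirty),
        "dirty_not_bad": int(dirty_not_bad),
        "subject": subject,
    }


# Two rows copied from the log above, plus a comment line that must be skipped.
SAMPLE = [
    "git bisect good d07afa8554be2a200670978813e1c93f0e763ccf # 15:34 G 11 0 11 29 UPSTREAM: x86/tsc: Make CONFIG_X86_TSC=n build work again",
    "git bisect bad cab4d3a04f824ba391a02b3fc9b9d15c97e710e2 # 15:49 B 0 11 27 2 BACKPORT: x86/jump_label: Initialize static branching early",
    "# extra tests with debug options",
]

rows = [row for row in map(parse_bisect_line, SAMPLE) if row is not None]
for row in rows:
    print(row["verdict"], row["sha"][:12], row["good"], row["bad"], row["subject"])
```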
Analysis suggests that several jump-label-related patches are missing from chromeos-4.14. Commit 6fffacb30349e09 ("x86/alternatives, jumplabel: Use text_poke_early() before mm_init()") is probably the most important of the patches that need to be applied.
The problem is only seen with CONFIG_JUMP_LABEL=y.
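Since the crash depends on CONFIG_JUMP_LABEL=y, it can help to check a kernel config before trying to reproduce. A minimal sketch, assuming the usual `.config` text format; the helper name and the sample fragments are made up for illustration.

```python
def jump_label_enabled(config_text):
    """Return True if the config text contains CONFIG_JUMP_LABEL=y."""
    for raw in config_text.splitlines():
        if raw.strip() == "CONFIG_JUMP_LABEL=y":
            return True
    return False


# Hypothetical config fragments, not taken from the report.
affected = "CONFIG_X86_64=y\nCONFIG_JUMP_LABEL=y\n"
unaffected = "CONFIG_X86_64=y\n# CONFIG_JUMP_LABEL is not set\n"

print(jump_label_enabled(affected), jump_label_enabled(unaffected))
```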
,
Sep 22
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/164f25d82f0fcd5ae2f600411d2a21d929157df3

commit 164f25d82f0fcd5ae2f600411d2a21d929157df3
Author: Peter Zijlstra <peterz@infradead.org>
Date: Sat Sep 22 15:29:55 2018

UPSTREAM: sched/core: Fix cpu.max vs. cpuhotplug deadlock

Tejun reported the following cpu-hotplug lock (percpu-rwsem) read recursion:

  tg_set_cfs_bandwidth()
    get_online_cpus()
      cpus_read_lock()

    cfs_bandwidth_usage_inc()
      static_key_slow_inc()
        cpus_read_lock()

Reported-by: Tejun Heo <tj@kernel.org>
Tested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180122215328.GP3397@worktop
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit ce48c146495a1a50e48cdbfbfaba3e708be7c07c)

BUG= chromium:887805
TEST=Run image with JUMPLABEL enabled

Change-Id: Ifba7fad0480d97e03ec28f8946df9b791861e8a2
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1238714
Reviewed-by: Yu Zhao <yuzhao@chromium.org>

[modify] https://crrev.com/164f25d82f0fcd5ae2f600411d2a21d929157df3/include/linux/jump_label.h
[modify] https://crrev.com/164f25d82f0fcd5ae2f600411d2a21d929157df3/kernel/sched/fair.c
[modify] https://crrev.com/164f25d82f0fcd5ae2f600411d2a21d929157df3/kernel/jump_label.c
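Why a read recursion on this lock is a problem can be shown with a toy model. The sketch below assumes a reader/writer lock in which new readers must wait once a writer is pending; the class and function names are hypothetical, and the `inner_takes_lock=False` case only stands in for a variant that relies on the caller already holding the lock. The commit message quoted above does not describe the fix itself.

```python
class ToyPercpuRwsem:
    """Toy lock: once a writer is pending, new readers have to wait."""

    def __init__(self):
        self.readers = 0
        self.writer_pending = False

    def read_lock(self):
        """Return True if acquired, False if the caller would block."""
        if self.writer_pending:
            return False
        self.readers += 1
        return True

    def write_lock_begin(self):
        """Writer announces itself; returns True only if no readers remain."""
        self.writer_pending = True
        return self.readers == 0


def nested_read_deadlocks(inner_takes_lock):
    """Replay the reported call chain with a writer arriving in between."""
    sem = ToyPercpuRwsem()
    assert sem.read_lock()                   # outer cpus_read_lock()
    writer_can_run = sem.write_lock_begin()  # hotplug writer waits for the reader
    # Inner path: either take the read lock again or rely on the outer one.
    inner_ok = sem.read_lock() if inner_takes_lock else True
    # Deadlock: the writer waits for the reader, the reader waits for the writer.
    return (not writer_can_run) and (not inner_ok)


print(nested_read_deadlocks(True), nested_read_deadlocks(False))
```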
,
Sep 22
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/b6a8c30c732177cc73a35b10fff2be08ff76e7e6

commit b6a8c30c732177cc73a35b10fff2be08ff76e7e6
Author: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Sat Sep 22 15:29:56 2018

UPSTREAM: jump_label: Explicitly disable jump labels in __init code

After initmem has been freed, any jump labels in __init code are prevented from being written to by the kernel_text_address() check in __jump_label_update(). However, this check is quite broad. If kernel_text_address() were to return false for any other reason, the jump label write would fail silently with no warning.

For jump labels in module init code, entry->code is set to zero to indicate that the entry is disabled. Do the same thing for core kernel init code. This makes the behavior more consistent, and will also make it more straightforward to detect non-init jump label write failures in the next patch.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c52825c73f3a174e8398b6898284ec20d4deb126.1519051220.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 33352244706369ea6736781ae41fe41692eb69bb)

BUG= chromium:887805
TEST=Run image with JUMPLABEL enabled

Change-Id: I9db4d48a7ff6efc492a9738872cc76a608337844
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1238715
Reviewed-by: Yu Zhao <yuzhao@chromium.org>

[modify] https://crrev.com/b6a8c30c732177cc73a35b10fff2be08ff76e7e6/init/main.c
[modify] https://crrev.com/b6a8c30c732177cc73a35b10fff2be08ff76e7e6/include/linux/jump_label.h
[modify] https://crrev.com/b6a8c30c732177cc73a35b10fff2be08ff76e7e6/kernel/jump_label.c
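The "code set to zero means disabled" convention from this commit message can be modelled in a few lines. This is a toy sketch only: `ToyJumpEntry`, `disable_init_entries`, `update_key` and all addresses are hypothetical and do not mirror the kernel's data structures.

```python
from dataclasses import dataclass


@dataclass
class ToyJumpEntry:
    code: int  # address of the patch site; 0 marks the entry as disabled
    key: str   # name of the static key the entry belongs to


def disable_init_entries(entries, init_begin, init_end):
    """Zero the code address of every entry located in the init region."""
    for entry in entries:
        if init_begin <= entry.code < init_end:
            entry.code = 0


def update_key(entries, key):
    """Return the patch sites that would be rewritten for the given key."""
    patched = []
    for entry in entries:
        if entry.key != key:
            continue
        if entry.code == 0:  # explicitly disabled, skip without a silent failure
            continue
        patched.append(entry.code)
    return patched


# Made-up layout: init code occupies [0x9000, 0xA000).
entries = [
    ToyJumpEntry(code=0x1000, key="k"),
    ToyJumpEntry(code=0x9100, key="k"),  # lives in init code
    ToyJumpEntry(code=0x2000, key="other"),
]
disable_init_entries(entries, 0x9000, 0xA000)
print([hex(addr) for addr in update_key(entries, "k")])
```

With an explicit marker, an update that fails for an entry whose code address is non-zero can be treated as a real error instead of being assumed to be freed init code.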
,
Sep 22
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/15a57cc54327f00dbaf3db694f602f3d5424f544

commit 15a57cc54327f00dbaf3db694f602f3d5424f544
Author: Pavel Tatashin <pasha.tatashin@oracle.com>
Date: Sat Sep 22 15:29:58 2018

UPSTREAM: x86/alternatives, jumplabel: Use text_poke_early() before mm_init()

It supposed to be safe to modify static branches after jump_label_init(). But, because static key modifying code eventually calls text_poke() it can end up accessing a struct page which has not been initialized yet.

Here is how to quickly reproduce the problem. Insert code like this into init/main.c:

| +static DEFINE_STATIC_KEY_FALSE(__test);
|  asmlinkage __visible void __init start_kernel(void)
|  {
|         char *command_line;
|@@ -587,6 +609,10 @@ asmlinkage __visible void __init start_kernel(void)
|         vfs_caches_init_early();
|         sort_main_extable();
|         trap_init();
|+       {
|+       static_branch_enable(&__test);
|+       WARN_ON(!static_branch_likely(&__test));
|+       }
|         mm_init();

The following warnings show-up:

WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:701 text_poke+0x20d/0x230
RIP: 0010:text_poke+0x20d/0x230
Call Trace:
 ? text_poke_bp+0x50/0xda
 ? arch_jump_label_transform+0x89/0xe0
 ? __jump_label_update+0x78/0xb0
 ? static_key_enable_cpuslocked+0x4d/0x80
 ? static_key_enable+0x11/0x20
 ? start_kernel+0x23e/0x4c8
 ? secondary_startup_64+0xa5/0xb0
---[ end trace abdc99c031b8a90a ]---

If the code above is moved after mm_init(), no warning is shown, as struct pages are initialized during handover from memblock.

Use text_poke_early() in static branching until early boot IRQs are enabled and from there switch to text_poke. Also, ensure text_poke() is never invoked when unitialized memory access may happen by using adding a !after_bootmem assertion.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: steven.sistare@oracle.com
Cc: daniel.m.jordan@oracle.com
Cc: linux@armlinux.org.uk
Cc: schwidefsky@de.ibm.com
Cc: heiko.carstens@de.ibm.com
Cc: john.stultz@linaro.org
Cc: sboyd@codeaurora.org
Cc: hpa@zytor.com
Cc: douly.fnst@cn.fujitsu.com
Cc: peterz@infradead.org
Cc: prarit@redhat.com
Cc: feng.tang@intel.com
Cc: pmladek@suse.com
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: linux-s390@vger.kernel.org
Cc: boris.ostrovsky@oracle.com
Cc: jgross@suse.com
Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-9-pasha.tatashin@oracle.com
(cherry picked from commit 6fffacb30349e0903602d664f7ab6fc87e85162e)

BUG= chromium:887805
TEST=Run image with JUMPLABEL enabled

Change-Id: I04163a4b348d79a7c7c6811201a37b55d60f7078
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1238716
Reviewed-by: Yu Zhao <yuzhao@chromium.org>

[modify] https://crrev.com/15a57cc54327f00dbaf3db694f602f3d5424f544/arch/x86/include/asm/text-patching.h
[modify] https://crrev.com/15a57cc54327f00dbaf3db694f602f3d5424f544/arch/x86/kernel/jump_label.c
[modify] https://crrev.com/15a57cc54327f00dbaf3db694f602f3d5424f544/arch/x86/kernel/alternative.c
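The selection rule described in this commit message (text_poke_early() while early boot IRQs are disabled, text_poke() afterwards, plus an assertion on after_bootmem) can be sketched as a small state model. Everything below is a toy under stated assumptions: the Python functions only borrow the kernel names for readability, and the three boot stages and their flag values are assumptions about boot ordering, not taken from the source.

```python
class ToyBootState:
    """Two flags named after the ones mentioned in the commit message."""

    def __init__(self, early_boot_irqs_disabled, after_bootmem):
        self.early_boot_irqs_disabled = early_boot_irqs_disabled
        self.after_bootmem = after_bootmem


def text_poke_early(state):
    # Models the early path, which does not look at struct pages.
    return "text_poke_early"


def text_poke(state):
    # Models the added assertion: refuse to run while struct pages
    # may still be uninitialized.
    if not state.after_bootmem:
        raise RuntimeError("text_poke() called before after_bootmem")
    return "text_poke"


def jump_label_transform(state):
    """Pick the poke routine the way the commit message describes."""
    poker = text_poke_early if state.early_boot_irqs_disabled else text_poke
    return poker(state)


# Assumed boot stages, for illustration only.
stages = {
    "before mm_init": ToyBootState(True, False),
    "after mm_init, IRQs still off": ToyBootState(True, True),
    "IRQs enabled": ToyBootState(False, True),
}
for name, state in stages.items():
    print(name, "->", jump_label_transform(state))
```

In this model the crash from the issue description corresponds to calling text_poke() in the first stage, which the selection rule avoids and the assertion would catch.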
Comment 1 by bugdroid1@chromium.org
, Sep 22