/sys/kernel/debug/tracing is broken for 3.18 (at least) |
|||||
Issue descriptionBoth on cyan and caroline (both 3.18) I get this panic: localhost ~ # echo function > /sys/kernel/debug/tracing/current_tracer [ 22.592228] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) [ 22.592254] BUG: unable to handle kernel paging request at ffff88017a278000 [ 22.592303] IP: [<ffffffff9af7725a>] _raw_spin_unlock_irqrestore+0x0/0x2a [ 22.592343] PGD 1b73f067 PUD 1b742067 PMD 800000017a2001e3 [ 22.592406] Oops: 0011 [#1] PREEMPT SMP [ 22.594578] gsmi: Log Shutdown Reason 0x03 [ 22.594601] Modules linked in: cmac rfcomm uinput iwlmvm iwlwifi snd_soc_dmic iwl7000_mac80211 btusb snd_soc_skl_nau88l25_ssm4567 snd_soc_hdac_hdmi snd_soc_skl snd_soc_skl_ipc snd_soc_sst_acpi snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core memconsole_x86_legacy memconsole btrtl snd_hda_core btbcm btintel uvcvideo zram bluetooth snd_soc_nau8825 bridge fuse stp videobuf2_vmalloc videobuf2_memops videobuf2_core llc snd_soc_ssm4567 ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_mark snd_seq_dummy cfg80211 iio_trig_sysfs cros_ec_sensors_ring snd_seq_midi snd_seq_midi_event snd_rawmidi cros_ec_sensors snd_seq cros_ec_sensors_core industrialio_triggered_buffer kfifo_buf industrialio ip6table_filter snd_seq_device r8152 mii joydev [ 22.594746] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 3.18.0 #1 [ 22.594753] Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.349.0 11/17/2017 [ 22.594762] task: ffff88017a226460 ti: ffff88017a278000 task.ti: ffff88017a278000 [ 22.594770] RIP: 3bc1:[<ffffffff9af771e0>] [<ffffffff9af771e0>] _raw_spin_lock_irqsave+0x0/0x22 [ 22.594782] RSP: ffffffff9af771e0:ffff88017ec8af58 EFLAGS: 00010046 [ 22.594789] RAX: 0000000000000005 RBX: ffff88017ec8aec0 RCX: 0000000000000005 [ 22.594796] RDX: ffff88017ec8aef8 RSI: ffffffff9a9fb048 RDI: ffff88017ec93a40 [ 22.594803] RBP: ffffffff9a9fad1a R08: ffffffff9af771e0 R09: ffffffff9af771e0 [ 22.594810] R10: ffff88017a0045a0 R11: ffff88017a226460 R12: 0000000000000ee4 [ 22.594818] R13: ffff88017a278000 R14: ffffffff9a8b6feb R15: ffff88017ec8aea8 [ 22.594826] FS: 0000000000000000(0000) GS:ffff88017ec80000(0000) knlGS:0000000000000000 [ 22.594834] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 22.594840] CR2: ffff88017a278000 CR3: 000000001b418000 CR4: 0000000000360770 [ 22.594847] [ 22.594850] RIP [<ffffffff9af771e0>] _raw_spin_lock_irqsave+0x0/0x22 [ 22.594859] RSP <ffff88017ec8af58> [ 22.594863] CR2: ffff88017a278000 [ 22.594868] ---[ end trace aab15d72dbefc14b ]--- [ 22.600809] Kernel panic - not syncing: Fatal exception [ 22.600822] Kernel Offset: 0x19800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 22.600992] gsmi: Log Shutdown Reason 0x02 [ 22.607078] ACPI MEMORY or I/O RESET_REG. The same panic occurs after applying a set of patches (about 180 of them), at least one of which purports to fix a problem with /sys/kernel/debug/tracing (probably a different problem). git fetch https://chromium.googlesource.com/chromiumos/third_party/kernel refs/changes/06/1068506/1 && git checkout FETCH_HEAD My previous image on cyan had a January 11 timestamp, and that worked fine. Sonny is taking a look.
,
May 22 2018
Thanks. I'll do a bisect.
,
May 22 2018
I'll bet it's KPTI missing a mapping somewhere...
,
May 22 2018
semenzato@luigi:~/.../third_party/kernel/v3.18$ git bisect log # bad: [a2b13ac7448907a7cc272f1cb16e8e6e520092bd] UPSTREAM: KVM: X86: Fix load RFLAGS w/o the fixed bit # good: [cf4d5fa114d4825eb59f82e48ccd683eb9a4b442] UPSTREAM: crypto: salsa20 - fix blkcipher_walk API usage git bisect start 'a2b13ac7448907a7cc272f1cb16e8e6e520092bd' 'cf4d5fa114d4825eb' # good: [fc3884617930ff550c3192353e22ff5a1d97111c] CHROMIUM: iwl7000: chromeOS: move genetlink.h and crypto.h below rhashtable.h git bisect good fc3884617930ff550c3192353e22ff5a1d97111c # skip: [953ddc2e38dd1219dc9e656af711428bb2c83de4] ANDROID: FROMLIST: binder: add spinlock to protect binder_node git bisect skip 953ddc2e38dd1219dc9e656af711428bb2c83de4 # good: [e8eae3e19046d9cda7e11d96ba8e42e862c9afcb] ANDROID: FROMLIST: android: binder: Add allocator selftest git bisect good e8eae3e19046d9cda7e11d96ba8e42e862c9afcb # bad: [479af323cfd21deb24890cf6c7b3c881aa1e7687] UPSTREAM: x86, tls, ldt: Stop checking lm in LDT_empty git bisect bad 479af323cfd21deb24890cf6c7b3c881aa1e7687 # bad: [70a45c0200e3436587ed1a698927a387bbff1a7c] UPSTREAM: ALSA: seq: Don't allow resizing pool in use git bisect bad 70a45c0200e3436587ed1a698927a387bbff1a7c # good: [1b2110766b1e20761e32454f31f02381a8c0c2d3] BACKPORT: tpm_tis: Move ilb_base_addr to tpm_tis_data git bisect good 1b2110766b1e20761e32454f31f02381a8c0c2d3 # good: [7b395110babfce6eff2db92067e4923659f8866b] CHROMIUM: iwl7000: mvm: allow unknown crypto RX outside aggregation git bisect good 7b395110babfce6eff2db92067e4923659f8866b # skip: [8df5862c8b40b48f54102968f8210749aa211fd4] CHROMIUM: iwl7000: Merge core35 driver updates git bisect skip 8df5862c8b40b48f54102968f8210749aa211fd4 # bad: [b3f52b891347ee623ec4e5ce6ee637d7005aab7b] CHROMIUM: Revert "net: socket ioctl to reset connections matching local address" git bisect bad b3f52b891347ee623ec4e5ce6ee637d7005aab7b # bad: [330699d6c2a0d8ef5c32066c750ab8a4d107c524] BACKPORT: x86/entry/64: Don't use IST entry for #BP stack git bisect bad 330699d6c2a0d8ef5c32066c750ab8a4d107c524 # good: [841e3be21933ab800961ea9cf7847f0582fe72ed] BACKPORT: tpm: Keep CLKRUN enabled throughout the duration of transmit_cmd() git bisect good 841e3be21933ab800961ea9cf7847f0582fe72ed So this seems to be the bad patch. I know Andy. Should Guido and I have a chat with him? commit 330699d6c2a0d8ef5c32066c750ab8a4d107c524 (HEAD, refs/bisect/bad) Author: Andy Lutomirski <luto@kernel.org> AuthorDate: Thu Jul 23 15:37:48 2015 -0700 Commit: chrome-bot <chrome-bot@chromium.org> CommitDate: Wed Apr 11 14:55:44 2018 -0700 BACKPORT: x86/entry/64: Don't use IST entry for #BP stack There's nothing IST-worthy about #BP/int3. We don't allow kprobes in the small handful of places in the kernel that run at CPL0 with an invalid stack, and 32-bit kernels have used normal interrupt gates for #BP forever. Furthermore, we don't allow kprobes in places that have usergs while in kernel mode, so "paranoid" is also unnecessary. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org BUG=chromium:831269 TEST=Build and boot Change-Id: I2bb5aece3c789aa3d845d03c9b57f10daf30fd60 [backport: Context and file name changes. Changes in trap_init do not apply.] Signed-off-by: Guenter Roeck <groeck@chromium.org> (cherry picked from commit d8ba61ba58c88d5207c1ba2f7d9a2280e7d03be9) Reviewed-on: https://chromium-review.googlesource.com/1005885 Reviewed-by: Dylan Reid <dgreid@chromium.org> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S index 9c24797e8039..a46834d89f9a 100644 --- a/arch/x86/kernel/entry_64.S +++ b/arch/x86/kernel/entry_64.S @@ -1345,7 +1345,7 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \ #endif /* CONFIG_HYPERV */ idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK -idtentry int3 do_int3 has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK +idtentry int3 do_int3 has_error_code=0 idtentry stack_segment do_stack_segment has_error_code=1 #ifdef CONFIG_XEN idtentry xen_debug do_debug has_error_code=0 diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index de801f22128a..2097c8efc437 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -334,7 +334,6 @@ exit: } NOKPROBE_SYMBOL(do_general_protection); -/* May run on IST stack. */ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code) { enum ctx_state prev_state; @@ -367,15 +366,9 @@ dotraplinkage void notrace do_int3(struct pt_regs *regs, long error_code) SIGTRAP) == NOTIFY_STOP) goto exit; - /* - * Let others (NMI) know that the debug stack is in use - * as we may switch to the interrupt stack. - */ - debug_stack_usage_inc(); preempt_conditional_sti(regs); do_trap(X86_TRAP_BP, SIGTRAP, "int3", regs, error_code, NULL); preempt_conditional_cli(regs); - debug_stack_usage_dec(); exit: exception_exit(prev_state); }
,
May 22 2018
Actually, I suspect that this patch is interfering with some previous security patch which we created or backported. At this point I invoke the help of the security folks.
,
May 22 2018
It looks like Guenter did the backport and may have done some of the other security backports -- any ideas?
,
May 22 2018
Also, did you try to revert just that one commit and see if it starts working?
,
May 22 2018
Unfortunate but not too surprising. I agree with Kees - probably something essential is missing from the backport. If reverting this patch makes it work, I would suggest to go with it. Otherwise, we'll have to dig into it.
,
May 22 2018
#7 yes, reverting this on ToT is possible and makes the bug go away. Thanks! Revert at https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1069421
,
Jun 26 2018
This was fixed with a revert. Please see issue 831269 (restricted). |
|||||
►
Sign in to add a comment |
|||||
Comment 1 by sonnyrao@chromium.org
, May 22 2018