
Issue 597351

Starred by 3 users

Issue metadata

Status: Verified
Owner:
Closed: May 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug
DRM
Gfx




[kernel_usb] peppy: Kernel Crash when suspending - externalUsbPeripherals test fails with "client rebooted, but sleep was expected"

Project Member Reported by ka...@chromium.org, Mar 23 2016

Issue description

Build: R51-8101.0.0

Dashboard screenshot: https://screenshot.googleplex.com/UUS1Pix2N7W
Dashboard failure URL: https://wmatrix.googleplex.com/failures/unfiltered?suites=kernel_usb&builds=R51-8101.0.0&releases=51&platforms=peppy

Crash log at https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/57634317-chromeos-test/chromeos1-row1-rack4-host2/crashinfo.chromeos1-row1-rack4-host2/

<1>[  199.555594] BUG: unable to handle kernel NULL pointer dereference at           (null)
<1>[  199.555638] IP: [<          (null)>]           (null)
<5>[  199.555658] PGD 0 
<5>[  199.555674] Oops: 0010 [#1] SMP 
<0>[  199.558074] gsmi: Log Shutdown Reason 0x03
<5>[  199.558099] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat i2c_dev uinput rfcomm memconsole snd_hda_codec_realtek snd_hda_codec_hdmi isl29018(C) industrialio zram(C) zsmalloc(C) snd_hda_intel snd_hda_codec fuse sr_mod cdrom option usb_wwan snd_usb_audio snd_usbmidi_lib snd_hwdep snd_pcm snd_page_alloc asix nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables smsc95xx usbnet ath9k_btcoex ath9k_common_btcoex ath3k ath9k_hw_btcoex btusb ath btrtl btbcm mac80211 btintel bluetooth cfg80211 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core joydev snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer ppp_async ppp_generic slhc tun
<5>[  199.558614] CPU 1 
<5>[  199.558627] Pid: 31602, comm: kworker/u:9 Tainted: G        WC   3.8.11 #1
<5>[  199.558649] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
<5>[  199.558680] RSP: 0000:ffff8800734bdb80  EFLAGS: 00010246
<5>[  199.558696] RAX: ffff880075690000 RBX: 0000000000000000 RCX: 0000000000000040
<5>[  199.558726] RDX: 0000000000000000 RSI: ffff88007569a5c0 RDI: ffff880075643300
<5>[  199.558744] RBP: ffff8800734bdc28 R08: 0000000000000000 R09: ffff8800734bdbd8
<5>[  199.558761] R10: ffff8800734bdbb0 R11: 0000000000000000 R12: 0000000000000000
<5>[  199.558779] R13: ffff880075690000 R14: ffff88010004f038 R15: ffff880075643300
<5>[  199.558810] FS:  0000000000000000(0000) GS:ffff880100300000(0000) knlGS:0000000000000000
<5>[  199.558830] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<5>[  199.558848] CR2: 0000000000000000 CR3: 0000000030c0c000 CR4: 00000000000407e0
<5>[  199.558865] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<5>[  199.558883] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<5>[  199.558903] Process kworker/u:9 (pid: 31602, threadinfo ffff8800734bc000, task ffff8800549eb480)
<5>[  199.558934] Stack:
<5>[  199.558944]  ffffffffb06e05c9 00000000b06bf938 0000000000000000 0000000000000000
<5>[  199.558993]  ffff88010004f000 0000000000000000 ffff8800549e1c00 0000000000000000
<5>[  199.559043]  ffff88005657b6c0 0000000000000000 0000000000000000 0000000000000000
<5>[  199.559093] Call Trace:
<5>[  199.559116]  [<ffffffffb06e05c9>] ? intel_update_plane+0x572/0x66b
<5>[  199.559139]  [<ffffffffb06e10dc>] intel_plane_restore+0x57/0x5d
<5>[  199.559160]  [<ffffffffb06c26d3>] intel_modeset_setup_hw_state+0x35a/0x493
<5>[  199.559187]  [<ffffffffb06971d1>] __i915_drm_thaw+0x13f/0x1b1
<5>[  199.559208]  [<ffffffffb06976d4>] i915_resume+0x8c/0xa5
<5>[  199.559227]  [<ffffffffb0697703>] i915_pm_resume+0x16/0x18
<5>[  199.559247]  [<ffffffffb0604bc3>] pci_pm_resume+0xc4/0xeb
<5>[  199.559266]  [<ffffffffb0604aff>] ? pci_pm_prepare+0x40/0x40
<5>[  199.559287]  [<ffffffffb06f5075>] dpm_run_callback.isra.3+0x2e/0x83
<5>[  199.559307]  [<ffffffffb06f51d5>] device_resume+0x10b/0x14d
<5>[  199.559326]  [<ffffffffb06f5234>] async_resume+0x1d/0x43
<5>[  199.559348]  [<ffffffffb0457a64>] async_run_entry_fn+0xc1/0x1a3
<5>[  199.559370]  [<ffffffffb044b034>] process_one_work+0x18a/0x2af
<5>[  199.559390]  [<ffffffffb044d283>] worker_thread+0x135/0x1fb
<5>[  199.559410]  [<ffffffffb044d14e>] ? flush_delayed_work+0x3e/0x3e
<5>[  199.559431]  [<ffffffffb0450e51>] kthread+0xc0/0xc8
<5>[  199.559449]  [<ffffffffb0450d91>] ? __kthread_parkme+0x6b/0x6b
<5>[  199.559476]  [<ffffffffb08c645c>] ret_from_fork+0x7c/0xb0
<5>[  199.559494]  [<ffffffffb0450d91>] ? __kthread_parkme+0x6b/0x6b
<5>[  199.559510] Code:  Bad RIP value.
<1>[  199.559549] RIP  [<          (null)>]           (null)
<5>[  199.559569]  RSP <ffff8800734bdb80>
<5>[  199.559588] CR2: 0000000000000000
<4>[  199.559625] ---[ end trace 245689efa4d903db ]---
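
The `Oops: 0010` line in the log above carries the x86 page-fault error code in hex. The following is a minimal sketch of decoding it, assuming the standard x86 error-code bit layout. The `PF_*` constants, `struct pf_info` and `decode_pf_error` are names chosen for this illustration, not code from the affected kernel tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural layout). */
#define PF_PROT  (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1) /* 0: read access, 1: write access */
#define PF_USER  (1u << 2) /* 0: kernel mode, 1: user mode */
#define PF_RSVD  (1u << 3) /* reserved bit set in a page-table entry */
#define PF_INSTR (1u << 4) /* fault happened on an instruction fetch */

/* Illustrative container for the decoded bits (hypothetical type). */
struct pf_info {
    bool protection;  /* false means the page was not present */
    bool write;
    bool user;
    bool reserved;
    bool instr_fetch;
};

/* Split a raw error code, e.g. 0x0010 from "Oops: 0010", into flags. */
static struct pf_info decode_pf_error(uint32_t code)
{
    struct pf_info info;

    info.protection  = (code & PF_PROT)  != 0;
    info.write       = (code & PF_WRITE) != 0;
    info.user        = (code & PF_USER)  != 0;
    info.reserved    = (code & PF_RSVD)  != 0;
    info.instr_fetch = (code & PF_INSTR) != 0;
    return info;
}
```

For `decode_pf_error(0x0010)` only the instruction-fetch flag is set: a kernel-mode instruction fetch from a non-present page. That is consistent with `RIP` being 0 and the `Bad RIP value` line, which is what a call through a NULL function pointer would typically produce.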
 

Comment 1 by ka...@chromium.org, Mar 23 2016

A similar suspend/resume scenario that reproduces a kernel crash on peppy is tracked in issue 597131, but that one was seen on M51 build 8097.0.0 and has a different crash signature.

Comment 2 by ka...@chromium.org, Mar 23 2016

falco started failing on R51-8077.0.0 and issue 596926 was raised yesterday. Same crash signature.

Comment 3 by ka...@chromium.org, Mar 23 2016

On a re-run, the test passed: https://ubercautotest.corp.google.com/afe/#tab_id=view_job&object_id=57641419

Manually tested a peppy board with 51-8101.0.0 and suspend/resume passed. suspend_stress_test is also still running without interruption. The results are the same with an external display connected.


Comment 4 by ka...@chromium.org, Mar 23 2016

Components: Test

Comment 6 by h...@chromium.org, May 4 2016

Cc: h...@chromium.org

Comment 7 by h...@chromium.org, May 4 2016

Cc: bhthompson@chromium.org mshe...@chromium.org marc...@chromium.org
Issue 599202 has been merged into this issue.

Comment 8 by h...@chromium.org, May 4 2016

Owner: h...@chromium.org
Status: Assigned (was: Untriaged)
Taking a look. Seems to be kernel 3.8 only.

Comment 9 by h...@chromium.org, May 4 2016

Disassembly around the crash point (intel_update_plane+0x572):

000000000000022a <intel_update_plane>:
     22a:       e8 00 00 00 00          callq  22f <intel_update_plane+0x5>
     22f:       55                      push   %rbp  
     230:       48 89 e5                mov    %rsp,%rbp
     233:       41 57                   push   %r15  
     235:       41 56                   push   %r14  
     ...
     ...
     74a:       85 db                   test   %ebx,%ebx
     74c:       75 08                   jne    756 <intel_update_plane+0x52c>
     74e:       4c 89 ef                mov    %r13,%rdi
     751:       e8 12 fa ff ff          callq  168 <intel_enable_primary>
     756:       45 84 e4                test   %r12b,%r12b
     759:       74 37                   je     792 <intel_update_plane+0x568>
     75b:       8b 45 30                mov    0x30(%rbp),%eax
     75e:       44 8b 4d 94             mov    -0x6c(%rbp),%r9d
     762:       4c 89 ff                mov    %r15,%rdi
     765:       44 8b 45 84             mov    -0x7c(%rbp),%r8d
     769:       8b 4d 90                mov    -0x70(%rbp),%ecx
     76c:       48 8b 55 88             mov    -0x78(%rbp),%rdx
     770:       48 8b 75 98             mov    -0x68(%rbp),%rsi
     774:       50                      push   %rax
     775:       8b 45 28                mov    0x28(%rbp),%eax
     778:       50                      push   %rax
     779:       8b 45 20                mov    0x20(%rbp),%eax
     77c:       50                      push   %rax
     77d:       8b 45 18                mov    0x18(%rbp),%eax
     780:       50                      push   %rax
     781:       8b 45 80                mov    -0x80(%rbp),%eax
     784:       50                      push   %rax
     785:       41 ff 97 30 31 00 00    callq  *0x3130(%r15)
     78c:       48 83 c4 28             add    $0x28,%rsp
     790:       eb 0a                   jmp    79c <intel_update_plane+0x572>
     792:       4c 89 ff                mov    %r15,%rdi
     795:       41 ff 97 38 31 00 00    callq  *0x3138(%r15)
     79c:       85 db                   test   %ebx,%ebx                       <============= return address here
     79e:       74 5c                   je     7fc <intel_update_plane+0x5d2>
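
The mapping from the `intel_update_plane+0x572` entry in the call trace to the marked line can be checked with simple address arithmetic. This is a minimal sketch using only the offsets visible in the listing above. The helper names `symbol_plus_offset` and `return_address_of_call` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Object-file address of "symbol+offset", given where the symbol starts. */
static uint64_t symbol_plus_offset(uint64_t func_start, uint64_t offset)
{
    return func_start + offset;
}

/*
 * A call pushes the address of the instruction that follows it, so the
 * return address is the call's own address plus its encoded length.
 */
static uint64_t return_address_of_call(uint64_t call_addr, uint64_t insn_len)
{
    return call_addr + insn_len;
}

/* Values read from the listing: the function starts at 0x22a. */
static uint64_t trace_entry_address(void)
{
    return symbol_plus_offset(0x22a, 0x572);
}

/* The indirect call at 0x795 is 7 bytes long (41 ff 97 38 31 00 00). */
static uint64_t second_call_return_address(void)
{
    return return_address_of_call(0x795, 7);
}
```

Both helpers yield 0x79c, the line marked as the return address. The other indirect call, at 0x785, is also 7 bytes long and would return to 0x78c, so the return address on the stack points at the call through offset 0x3138 rather than the one through 0x3130.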

Comment 10 by h...@chromium.org, May 4 2016

Cross-referencing the source code in intel_sprite.c, the NULL dereference corresponds to the indirect call through intel_plane->disable_plane:

        /*
         * Be sure to re-enable the primary before the sprite is no longer
         * covering it fully.
         */
        if (!disable_primary)
                intel_enable_primary(crtc);

        if (visible)
                intel_plane->update_plane(plane, fb, obj,
                                          crtc_x, crtc_y, crtc_w, crtc_h,
                                          src_x, src_y, src_w, src_h);
        else
                intel_plane->disable_plane(plane);         <====== this call
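
To illustrate the failure mode, here is a self-contained sketch of a call through a function-pointer hook that may be unset, with a defensive check in front of it. Everything in it (`struct demo_plane`, `demo_disable`, `demo_apply`) is a hypothetical stand-in, not the i915 driver's code, and the check is shown only to make the hazard concrete; it is not presented as the fix for this issue.

```c
#include <stddef.h>

/* Hypothetical stand-in for a plane object with per-hardware hooks. */
struct demo_plane {
    void (*update_plane)(struct demo_plane *plane);
    void (*disable_plane)(struct demo_plane *plane);
    int disabled; /* set by demo_disable() so callers can observe the call */
};

/* Example hook implementation (hypothetical). */
static void demo_disable(struct demo_plane *plane)
{
    plane->disabled = 1;
}

/*
 * Pick the hook the same way the excerpt above does (update when visible,
 * disable otherwise), but refuse to jump through a NULL pointer.
 * Returns 0 if the hook ran, -1 if the required hook was never installed.
 */
static int demo_apply(struct demo_plane *plane, int visible)
{
    void (*hook)(struct demo_plane *) =
        visible ? plane->update_plane : plane->disable_plane;

    if (hook == NULL)
        return -1; /* calling it would fetch instructions from address 0 */

    hook(plane);
    return 0;
}
```

With a zero-initialized `struct demo_plane`, `demo_apply(&p, 0)` returns -1 instead of jumping to address 0. Once `p.disable_plane = demo_disable` is assigned, the same call returns 0 and sets `p.disabled`.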

Comment 11 by h...@chromium.org, May 5 2016

This is already fixed at TOT. See https://chromium-review.googlesource.com/#/c/337408/

Revert "BACKPORT: drm/i915: restore cursor and sprite state when forcing a config restore v2"

This reverts commit 21e91aa6954d8c0ace44556c6560b25170432e7c.

We're seeing crashes on resume in some cases due to this patch series.

BUG= 597131 
TEST=suspend/resume on peppy

Comment 12 by h...@chromium.org, May 5 2016

Status: Fixed (was: Assigned)
Status: Verified (was: Fixed)
Verified that the test no longer fails.
Cc: -mshe...@chromium.org
