Samus is crashing repeatedly
Reported by
adamrodr...@chromium.org,
May 14 2016
Issue description

Filing at marcheu's request.

Feedback report here: https://feedback.corp.google.com/#/Report/8693425907

See email thread:

From marcheu:
This is a side effect of enabling zero copy. We can remove the flag in 51 as a workaround. Either way, please file a bug about this.

Stéphane
>
> On Thu, May 12, 2016 at 9:55 AM, Sameer Nanda <snanda@google.com> wrote:
>>
>> +Bernie Thompson regarding whether we are seeing lots of crash reports for
>> Samus on M51.
>>
>> On Thu, May 12, 2016 at 9:53 AM Sameer Nanda <snanda@google.com> wrote:
>>>
>>> Adam, was this feedback filed on a fresh boot or while you were seeing
>>> the crashes?
>>>
>>>
>>> +Stéphane Marchesin : there are whole bunch of i915 warnings in the logs.
>>> Not sure if they are related to the issues that Adam is seeing.
>>>
>>>
>>> [ 27.692445] WARNING: CPU: 3 PID: 1871 at
>>> /mnt/host/source/src/third_party/kernel/v3.14/drivers/gpu/drm/i915/i915_gem.c:5125
>>> i915_gem_obj_to_ggtt+0x4b/0x4f()
>>> [ 27.692455] Modules linked in: ctr ccm rfcomm evdi uinput i2c_dev cmac
>>> x86_pkg_temp_thermal snd_soc_sst_bdw_rt5677_mach memc_x86 aesni_intel
>>> snd_hda_codec_hdmi iwlmvm aes_x86_64 glue_helper lrw gf128mul ablk_helper
>>> cryptd iwl7000_mac80211 zram iwlwifi snd_hda_intel snd_hda_controller
>>> snd_soc_rt5677 snd_hda_codec snd_soc_sst_haswell_pcm snd_hwdep acpi_als
>>> snd_soc_sst_dsp snd_soc_rl6231 snd_soc_rt5677_spi
>>> industrialio_triggered_buffer snd_soc_sst_acpi fuse cfg80211 btusb btbcm
>>> btintel bluetooth nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter
>>> ip6_tables iio_trig_sysfs uvcvideo videobuf2_vmalloc cros_ec_accel kfifo_buf
>>> industrialio joydev snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq
>>> snd_seq_device ppp_async ppp_generic slhc tun
>>> [ 27.692565] CPU: 3 PID: 1871 Comm: CompositorTileW Tainted: G W
>>> 3.14.0 #1
>>> [ 27.692572] Hardware name: GOOGLE Samus, BIOS Google_Samus.6300.174.0
>>> 04/02/2015
>>> [ 27.692578] 0000000000000000 0000000072f78f30 ffff88046557bde8
>>> ffffffffa21a4cf3
>>> [ 27.692587] 0000000000000000 ffff88046557be20 ffffffffa1c3df6b
>>> ffffffffa1eeb7e4
>>> [ 27.692596] ffff8803d95be900 0000000000000000 ffff88046bdd07c8
>>> 0000000040086200
>>> [ 27.692606] Call Trace:
>>> [ 27.692613] [<ffffffffa21a4cf3>] dump_stack+0x4d/0x6f
>>> [ 27.692621] [<ffffffffa1c3df6b>] warn_slowpath_common+0x7f/0x98
>>> [ 27.692627] [<ffffffffa1eeb7e4>] ? i915_gem_obj_to_ggtt+0x4b/0x4f
>>> [ 27.692634] [<ffffffffa1c3e07d>] warn_slowpath_null+0x1a/0x1c
>>> [ 27.692639] [<ffffffffa1eeb7e4>] i915_gem_obj_to_ggtt+0x4b/0x4f
>>> [ 27.692645] [<ffffffffa1eeb8c8>]
>>> i915_gem_object_set_to_gtt_domain+0xe0/0x113
>>> [ 27.692653] [<ffffffffa1f3b548>] i915_gem_end_cpu_access+0x2e/0x42
>>> [ 27.692662] [<ffffffffa1f6d5b3>] dma_buf_end_cpu_access+0x3f/0x44
>>> [ 27.692669] [<ffffffffa1f6d8ac>] dma_buf_ioctl+0x8d/0xc5
>>> [ 27.692675] [<ffffffffa1d1c708>] do_vfs_ioctl+0x355/0x416
>>> [ 27.692681] [<ffffffffa1d24ec7>] ? __fget+0x6f/0x79
>>> [ 27.692686] [<ffffffffa1d1c820>] SyS_ioctl+0x57/0x79
>>> [ 27.692693] [<ffffffffa21aa09c>] system_call_fastpath+0x20/0x25
>>> [ 27.692759] ---[ end trace 60f7c0d67adde196 ]---
>>> [ 27.890333] ------------[ cut here ]------------
>>>
>>> On Wed, May 11, 2016 at 9:33 PM Adam Rodriguez <adamrodriguez@google.com>
>>> wrote:
>>>>
>>>> are we seeing a lot of crash reports? my device is crashing every few
>>>> minutes...
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: <adamrodriguez@google.com>
>>>> Date: Thu, May 12, 2016 at 8:36 AM
>>>> Subject: samus is crashing repeatedly. every minute or so...help!
>>>> To: samus-feedback-reports@google.com
>>>>
>>>>
>>>> https://feedback.corp.google.com/#/Report/8693425907
>>>>
>>>> Description:
>>>> samus is crashing repeatedly. every minute or so...help!
,
May 16 2016
Hmm, I thought hshi@ had fixed this warning in 3.14: https://chromium-review.googlesource.com/#/c/340480/ How can we make sure M51 has his fix? Can you check, please, snanda@? Otherwise it will be safer to disable native pixmaps for now for the recent IA builds. Besides, we're already working on another issue in this native pixmap configuration (crbug.com/612078), which seems unrelated to the particular error here.
,
May 16 2016
Hmm, you're right, it looks like this isn't in M51.
,
May 16 2016
well I backported the fix for the warning to 51, but I don't think it'll fix the crashes.
,
May 16 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/ca80bd54462f88af574e994b8532eb176de5edc2

commit ca80bd54462f88af574e994b8532eb176de5edc2
Author: Haixia Shi <hshi@chromium.org>
Date: Sat Apr 23 01:02:50 2016

drm/i915: fix kernel WARN in i915_gem_obj_to_ggtt

The kernel WARN started at commit 6b1c085a9cff "BACKPORT: drm/i915: Broaden application of set-domain(GTT)"

BUG= chromium:605774 , chromium:611968
TEST=verify no more warnings

Signed-off-by: Haixia Shi <hshi@chromium.org>
Change-Id: Id49d2d22e0b9f6bacb518c134f73dae2e242c3e0
Reviewed-on: https://chromium-review.googlesource.com/340480
Commit-Ready: Haixia Shi <hshi@chromium.org>
Tested-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
(cherry picked from commit 3873117673b8f3872410d6f0bb3a3150d5901730)
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/344910
Reviewed-by: Haixia Shi <hshi@chromium.org>

[modify] https://crrev.com/ca80bd54462f88af574e994b8532eb176de5edc2/drivers/gpu/drm/i915/i915_gem.c
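For anyone wanting to script the "verify no more warnings" check, here is a minimal Python sketch. It assumes the kernel log is available as text with each WARNING header on a single line (as dmesg prints it, unlike the email-wrapped copy in the description). The helper name count_warnings and the regular expression are illustrative only and are not part of the commit.

```python
import re

# Matches a kernel WARNING header such as:
#   WARNING: CPU: 3 PID: 1871 at <path>/i915_gem.c:5125 i915_gem_obj_to_ggtt+0x4b/0x4f()
# and captures the name of the function that raised the warning.
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at \S+\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def count_warnings(log_text, function="i915_gem_obj_to_ggtt"):
    """Count WARNING headers in log_text that were raised from `function`."""
    return sum(
        1 for match in WARN_RE.finditer(log_text) if match.group(1) == function
    )
```

Feeding it the saved output of dmesg from a patched build should give 0 if the fix landed. A non-zero count only says the warning is still firing; it says nothing about whether the crashes are related.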
,
May 17 2016
The log is the same as the log in Issue 605774 , so #5 should fix the warning. And (hopefully) the crash is not related to GBM. snanda@, could you check whether the crash still happens? If it does, I'll comment out the synchronization logic: https://codereview.chromium.org/1906253003/ The reason is that zero-copy was enabled in M50, but crash bugs were reported only recently. The recent change is the synchronization, which is directly related to i915_gem_end_cpu_access.
,
May 17 2016
I was the one with all the crashes... is there any testing you need me to do? I've since upgraded to:

Version 51.0.2704.42 beta (64-bit)
Platform 8172.28.0 (Official Build) beta-channel samus
Firmware Google_Samus.6300.174.0

but I'm pretty sure I'm still getting crashes. Will file feedback next time it happens.
,
May 17 2016
aaaand it crashed. need the report?
,
May 17 2016
#8 - I'd like you to check ToT, which includes the #5 change, rather than beta. I have a Pixel 2015 but I could not reproduce it. How can I reproduce it?
,
May 17 2016
Issue 612078 has been merged into this issue.
,
May 18 2016
Re #8: Did you submit a report? If not, would you mind submitting one with system logs? (Hopefully the kernel log provides more details)
,
May 18 2016
this work? https://feedback.corp.google.com/#/Report/8851682535

Description: crash again snanda@
UI language: en-US
Product Specific Data (whitelisted):
CHROME VERSION: 51.0.2704.42 beta
CHROMEOS_AUSERVER: <URL: 4>
CHROMEOS_RELEASE_BOARD: samus-signed-mpkeys
CHROMEOS_RELEASE_DESCRIPTION: 8172.28.0 (Official Build) beta-channel samus
CHROMEOS_RELEASE_TRACK: beta-channel
CHROMEOS_RELEASE_VERSION: 8172.28.0
ENTERPRISE_ENROLLED: Managed
,
May 18 2016
We don't have access to this report, adamrodriguez@. Can you maybe paste it here?
,
May 18 2016
I'm still failing to reproduce it although I tried beta, dev and ToT using Pixel 2015. Could you share how to reproduce it?
,
May 18 2016
I don't have repro steps. the device just starts to get slow, the cursor jumps around and it dies. I believe marcheu knows the root cause
,
May 18 2016
I just tried Chrome ToT on Samus and couldn't reproduce either. Can you apply the following and see if the problem persists: https://codereview.chromium.org/1906253003/
,
May 19 2016
I've run YouTube videos, WebGL apps and many other tabs for 24 hours on a Pixel 2015 using M51 beta. I couldn't reproduce it.
,
May 19 2016
I did intensive tests with 55 tabs of YouTube video using M51 beta on a Pixel 2015. In the test, the GPU process creates about 1100 anon_inode:dmabuf entries. I did not encounter any mmap failure or crash due to dmabuf.

On the other hand, whether or not --enable-native-gpu-memory-buffers is used, 55 YouTube tabs can kill the device when ~400MB of virtual memory space remains. I got the 'vmstat' log right before rebooting. It kills the device itself, so the Chrome process doesn't print a meaningful log.

localhost ~ # vmstat -S m
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
44  2      0    381     65   1303    0    0   185   194 2628 7297 65  8 22  5  0

In the future, ChromeOS needs an "elegant tab killing" mechanism because the device cannot swap memory.

FYI, here is 'vmstat' right after booting. The Pixel 2015 has 8GB of memory.

localhost ~ # vmstat -S m
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0   7596     35    371    0    0   284     5  118  283  1  1 93  4  0
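The anon_inode:dmabuf count mentioned above can be reproduced with something like the following Python sketch. It assumes the usual Linux /proc/<pid>/fd layout, where each entry is a symlink, and that dma-buf descriptors resolve to exactly "anon_inode:dmabuf" as on this 3.14 kernel (other kernels may name them differently). The function name count_dmabuf_fds is made up for illustration.

```python
import os


def count_dmabuf_fds(fd_dir):
    """Count entries in fd_dir whose symlink target is 'anon_inode:dmabuf'."""
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd was closed in the meantime, or the entry is not a symlink.
            continue
        if target == "anon_inode:dmabuf":
            count += 1
    return count
```

It would be called as count_dmabuf_fds("/proc/<pid>/fd") with the GPU process's pid filled in, most likely as root. Sampling it periodically while opening tabs would show whether the count keeps growing before the device dies.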
,
May 23 2016
FYI, same kernel crash reported on Buddy (BDW) on M51. https://feedback.corp.google.com/#/Report/9030923815
,
May 24 2016
#19 yungleem@, thanks for pointing this out, but external contributors don't have access permission. Could you paste the log here?
,
Jun 29 2016
,
Mar 22 2017
This shouldn't happen any more.
,
May 30 2017
,
Aug 1 2017
,
Aug 1 2017
Not reproducible in Chrome OS 9765.13.0, 61.0.3163.20.
Comment 1 by marc...@chromium.org
, May 16 2016