
Issue 611968 link

Starred by 4 users

Issue metadata

Status: Verified
Owner:
Closed: Mar 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocked on:
issue 612078




Samus is crashing repeatedly

Reported by adamrodr...@chromium.org, May 14 2016

Issue description

filing at marcheu's request:

feedback report here:

https://feedback.corp.google.com/#/Report/8693425907

see email thread:

from marcheu:

This is a side effect of enabling zero copy. We can remove the flag in
51 as a workaround.

Either way, please file a bug about this.



Stéphane


>
> On Thu, May 12, 2016 at 9:55 AM, Sameer Nanda <snanda@google.com> wrote:
>>
>> +Bernie Thompson regarding whether we are seeing lots of crash reports for
>> Samus on M51.
>>
>> On Thu, May 12, 2016 at 9:53 AM Sameer Nanda <snanda@google.com> wrote:
>>>
>>> Adam, was this feedback filed on a fresh boot or while you were seeing
>>> the crashes?
>>>
>>>
>>> +Stéphane Marchesin: there are a whole bunch of i915 warnings in the logs.
>>> Not sure if they are related to the issues that Adam is seeing.
>>>
>>>
>>> [   27.692445] WARNING: CPU: 3 PID: 1871 at
>>> /mnt/host/source/src/third_party/kernel/v3.14/drivers/gpu/drm/i915/i915_gem.c:5125
>>> i915_gem_obj_to_ggtt+0x4b/0x4f()
>>> [   27.692455] Modules linked in: ctr ccm rfcomm evdi uinput i2c_dev cmac
>>> x86_pkg_temp_thermal snd_soc_sst_bdw_rt5677_mach memc_x86 aesni_intel
>>> snd_hda_codec_hdmi iwlmvm aes_x86_64 glue_helper lrw gf128mul ablk_helper
>>> cryptd iwl7000_mac80211 zram iwlwifi snd_hda_intel snd_hda_controller
>>> snd_soc_rt5677 snd_hda_codec snd_soc_sst_haswell_pcm snd_hwdep acpi_als
>>> snd_soc_sst_dsp snd_soc_rl6231 snd_soc_rt5677_spi
>>> industrialio_triggered_buffer snd_soc_sst_acpi fuse cfg80211 btusb btbcm
>>> btintel bluetooth nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter
>>> ip6_tables iio_trig_sysfs uvcvideo videobuf2_vmalloc cros_ec_accel kfifo_buf
>>> industrialio joydev snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq
>>> snd_seq_device ppp_async ppp_generic slhc tun
>>> [   27.692565] CPU: 3 PID: 1871 Comm: CompositorTileW Tainted: G        W
>>> 3.14.0 #1
>>> [   27.692572] Hardware name: GOOGLE Samus, BIOS Google_Samus.6300.174.0
>>> 04/02/2015
>>> [   27.692578]  0000000000000000 0000000072f78f30 ffff88046557bde8
>>> ffffffffa21a4cf3
>>> [   27.692587]  0000000000000000 ffff88046557be20 ffffffffa1c3df6b
>>> ffffffffa1eeb7e4
>>> [   27.692596]  ffff8803d95be900 0000000000000000 ffff88046bdd07c8
>>> 0000000040086200
>>> [   27.692606] Call Trace:
>>> [   27.692613]  [<ffffffffa21a4cf3>] dump_stack+0x4d/0x6f
>>> [   27.692621]  [<ffffffffa1c3df6b>] warn_slowpath_common+0x7f/0x98
>>> [   27.692627]  [<ffffffffa1eeb7e4>] ? i915_gem_obj_to_ggtt+0x4b/0x4f
>>> [   27.692634]  [<ffffffffa1c3e07d>] warn_slowpath_null+0x1a/0x1c
>>> [   27.692639]  [<ffffffffa1eeb7e4>] i915_gem_obj_to_ggtt+0x4b/0x4f
>>> [   27.692645]  [<ffffffffa1eeb8c8>]
>>> i915_gem_object_set_to_gtt_domain+0xe0/0x113
>>> [   27.692653]  [<ffffffffa1f3b548>] i915_gem_end_cpu_access+0x2e/0x42
>>> [   27.692662]  [<ffffffffa1f6d5b3>] dma_buf_end_cpu_access+0x3f/0x44
>>> [   27.692669]  [<ffffffffa1f6d8ac>] dma_buf_ioctl+0x8d/0xc5
>>> [   27.692675]  [<ffffffffa1d1c708>] do_vfs_ioctl+0x355/0x416
>>> [   27.692681]  [<ffffffffa1d24ec7>] ? __fget+0x6f/0x79
>>> [   27.692686]  [<ffffffffa1d1c820>] SyS_ioctl+0x57/0x79
>>> [   27.692693]  [<ffffffffa21aa09c>] system_call_fastpath+0x20/0x25
>>> [   27.692759] ---[ end trace 60f7c0d67adde196 ]---
>>> [   27.890333] ------------[ cut here ]------------
>>>
>>> On Wed, May 11, 2016 at 9:33 PM Adam Rodriguez <adamrodriguez@google.com>
>>> wrote:
>>>>
>>>> are we seeing a lot of crash reports? my device is crashing every few
>>>> minutes...
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: <adamrodriguez@google.com>
>>>> Date: Thu, May 12, 2016 at 8:36 AM
>>>> Subject: samus is crashing repeatedly. every minute or so...help!
>>>> To: samus-feedback-reports@google.com
>>>>
>>>>
>>>> https://feedback.corp.google.com/#/Report/8693425907
>>>>
>>>> Description:
>>>> samus is crashing repeatedly. every  minute or so...help!
 
Cc: tiago.vi...@intel.com dongseon...@intel.com
Dongseong & Tiago: do you know if we can quickly fix that, or if we should just remove the flag? It seems like we might be running out of GTT space...
Cc: h...@chromium.org
Hmm, I thought hshi@ had fixed this warning in 3.14:
https://chromium-review.googlesource.com/#/c/340480/

How can we make sure M51 has his fix? Can you please check, snanda@? Otherwise it would be safer to disable native pixmaps for now on the recent IA builds. Besides, we're already working on another issue in this native pixmap configuration (crbug.com/612078), which seems unrelated to the particular error here.
hmm you're right it looks like this isn't in M51.
well I backported the fix for the warning to 51, but I don't think it'll fix the crashes.

Comment 5 by bugdroid1@chromium.org, May 16 2016

Labels: merge-merged-release-R51-8172.B-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/ca80bd54462f88af574e994b8532eb176de5edc2

commit ca80bd54462f88af574e994b8532eb176de5edc2
Author: Haixia Shi <hshi@chromium.org>
Date: Sat Apr 23 01:02:50 2016

drm/i915: fix kernel WARN in i915_gem_obj_to_ggtt

The kernel WARN started at commit 6b1c085a9cff
"BACKPORT: drm/i915: Broaden application of set-domain(GTT)"

BUG= chromium:605774 , chromium:611968 
TEST=verify no more warnings
Signed-off-by: Haixia Shi <hshi@chromium.org>

Change-Id: Id49d2d22e0b9f6bacb518c134f73dae2e242c3e0
Reviewed-on: https://chromium-review.googlesource.com/340480
Commit-Ready: Haixia Shi <hshi@chromium.org>
Tested-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
(cherry picked from commit 3873117673b8f3872410d6f0bb3a3150d5901730)
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/344910
Reviewed-by: Haixia Shi <hshi@chromium.org>

[modify] https://crrev.com/ca80bd54462f88af574e994b8532eb176de5edc2/drivers/gpu/drm/i915/i915_gem.c

The log is the same as the log in Issue 605774, so #5 should fix the warning.
And (hopefully) the crash is not related to GBM.
snanda@, could you check whether the crash still happens?

If the crash still happens, I'll comment out the synchronization logic: https://codereview.chromium.org/1906253003/
The reasoning is that zero-copy was enabled in M50, but crash bugs were reported only recently. The recent change is the synchronization, which is directly related to i915_gem_end_cpu_access.
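For readers unfamiliar with the synchronization path in the trace (dma_buf_ioctl -> dma_buf_end_cpu_access -> i915_gem_end_cpu_access), here is a minimal sketch of how userspace issues the dma-buf sync ioctl. It assumes the upstream <linux/dma-buf.h> definitions (type 'b', command 0, a single 64-bit flags field) and the x86 ioctl number encoding; the ChromeOS 3.14 backport may differ. The helper names are hypothetical. The computed request number, 0x40086200, matches a value visible in the stack dump of the quoted warning, which is consistent with this reading but not proof of it.

```python
import fcntl
import struct

# ioctl number encoding as used on x86 (asm-generic/ioctl.h).
_IOC_WRITE = 1
_IOC_NRSHIFT = 0
_IOC_TYPESHIFT = 8
_IOC_SIZESHIFT = 16
_IOC_DIRSHIFT = 30


def _iow(type_char, nr, size):
    """Build an _IOW() request number from type, command number and arg size."""
    return ((_IOC_WRITE << _IOC_DIRSHIFT)
            | (size << _IOC_SIZESHIFT)
            | (ord(type_char) << _IOC_TYPESHIFT)
            | (nr << _IOC_NRSHIFT))


# Flags from upstream <linux/dma-buf.h> (assumed to match the backport).
DMA_BUF_SYNC_READ = 1 << 0
DMA_BUF_SYNC_WRITE = 2 << 0
DMA_BUF_SYNC_RW = DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE
DMA_BUF_SYNC_START = 0 << 2
DMA_BUF_SYNC_END = 1 << 2

# struct dma_buf_sync { __u64 flags; } is 8 bytes.
DMA_BUF_IOCTL_SYNC = _iow("b", 0, 8)


def sync_args(flags):
    """Pack the flags into the 8-byte struct dma_buf_sync payload."""
    return struct.pack("=Q", flags)


def end_cpu_access(dmabuf_fd, write=True):
    """Hypothetical helper: tell the kernel that CPU access to the buffer ended."""
    access = DMA_BUF_SYNC_RW if write else DMA_BUF_SYNC_READ
    fcntl.ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, sync_args(DMA_BUF_SYNC_END | access))


print(hex(DMA_BUF_IOCTL_SYNC))
```

Calling end_cpu_access() requires a real dma-buf file descriptor, so only the request number is printed here.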

I was the one with all the crashes...is there any testing you need me to do? I've since upgraded to:


Version 51.0.2704.42 beta (64-bit)
Platform 8172.28.0 (Official Build) beta-channel samus
Firmware Google_Samus.6300.174.0

but pretty sure I'm still getting crashes, will file feedback next time it happens.
aaaand it crashed. need the report?
#8 - Please check ToT, which includes the change from #5, rather than beta.

I have a Pixel 2015 but could not reproduce it. How can I reproduce it?
Issue 612078 has been merged into this issue.
Re #8: Did you submit a report? If not, would you mind submitting one with system logs? (Hopefully the kernel log provides more details)
this work? https://feedback.corp.google.com/#/Report/8851682535

Description:
crash again snanda@

UI language: en-US

Product Specific Data (whitelisted):
CHROME VERSION: 51.0.2704.42 beta
CHROMEOS_AUSERVER: <URL: 4>
CHROMEOS_RELEASE_BOARD: samus-signed-mpkeys
CHROMEOS_RELEASE_DESCRIPTION: 8172.28.0 (Official Build) beta-channel samus
CHROMEOS_RELEASE_TRACK: beta-channel
CHROMEOS_RELEASE_VERSION: 8172.28.0
ENTERPRISE_ENROLLED: Managed
we don't have access to this report adamrodriguez@. Can you maybe paste here?
I'm still failing to reproduce it although I tried beta, dev and ToT using Pixel 2015.
Could you share how to reproduce it?
I don't have repro steps. the device just starts to get slow, the cursor jumps around and it dies.  I believe marcheu knows the root cause
I just tried Chrome ToT on Samus and couldn't reproduce either. Can you apply the following and see if the problem persists:
https://codereview.chromium.org/1906253003/
I ran YouTube videos, WebGL apps and many other tabs for 24 hours on a Pixel 2015 using M51 beta and couldn't reproduce it.
I ran an intensive test with 55 YouTube video tabs on M51 beta on a Pixel 2015. In the test, the GPU process creates about 1100 anon_inode:dmabuf entries.
I did not encounter any mmap failure or crash due to dmabuf.
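One way to obtain a count like the one above is to look at the symlink targets under /proc/<pid>/fd, where dma-buf descriptors show up as anon_inode:dmabuf. The following is a minimal sketch under that assumption; the function names are hypothetical, and the comment does not say which tool was actually used for the measurement.

```python
import os


def count_dmabuf_links(targets):
    """Count fd link targets that refer to dma-buf anonymous inodes."""
    return sum(1 for target in targets if target == "anon_inode:dmabuf")


def fd_targets(pid):
    """Return the symlink targets of /proc/<pid>/fd, skipping fds that vanish."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The fd was closed between listdir() and readlink().
            continue
    return targets


def count_dmabuf_fds(pid):
    """Count dma-buf file descriptors currently held by the given process."""
    return count_dmabuf_links(fd_targets(pid))
```

Reading another process's fd directory usually needs root, e.g. count_dmabuf_fds(gpu_pid) from a root shell, where gpu_pid is the PID of the GPU process.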

On the other hand, regardless of whether --enable-native-gpu-memory-buffers is used, 55 YouTube tabs can kill the device when only ~400MB of virtual memory space remains. I captured the 'vmstat' log right before the reboot. The device itself goes down, so the chrome process doesn't print a meaningful log.
localhost ~ # vmstat -S m                                                        
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
44  2      0    381     65   1303    0    0   185   194 2628 7297 65  8 22  5  0

In the future, ChromeOS needs an "elegant tab killing" mechanism because the device cannot swap memory.

FYI, here is the 'vmstat' output right after booting. Pixel 2015 has 8GB of memory.
localhost ~ # vmstat -S m                                                       
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0   7596     35    371    0    0   284     5  118  283  1  1 93  4  0
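To compare the two vmstat samples above side by side, the output can be parsed into named columns. This is a small sketch, not part of the original investigation; parse_vmstat is a hypothetical helper, and the samples are copied from this thread. With -S m, vmstat reports memory in units of 1,000,000 bytes.

```python
UNDER_LOAD = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
44  2      0    381     65   1303    0    0   185   194 2628 7297 65  8 22  5  0
"""

AFTER_BOOT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0   7596     35    371    0    0   284     5  118  283  1  1 93  4  0
"""


def parse_vmstat(text):
    """Map vmstat column names to the integer values of the first sample row."""
    lines = [line for line in text.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        names = line.split()
        # The column-name row is the one that contains both 'swpd' and 'free'.
        if "swpd" in names and "free" in names:
            values = [int(value) for value in lines[index + 1].split()]
            return dict(zip(names, values))
    raise ValueError("no vmstat header row found")


loaded = parse_vmstat(UNDER_LOAD)
booted = parse_vmstat(AFTER_BOOT)
print("free under load:", loaded["free"], "free after boot:", booted["free"])
print("runnable processes under load:", loaded["r"], "swap used:", loaded["swpd"])
```

The swpd column is 0 in both samples, which matches the remark that the device does not swap.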
FYI, the same kernel crash was reported on Buddy (BDW) on M51.

https://feedback.corp.google.com/#/Report/9030923815
#19 yungleem@, thanks for pointing this out, but external contributors don't have access permission. Could you paste the log here?
Blockedon: 612078
Status: Fixed (was: Assigned)
This shouldn't happen any more.

Comment 23 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61
Status: Verified (was: Fixed)
Not reproducible in Chrome OS 9765.13.0, 61.0.3163.20. 
