
Issue 873822 link

Starred by 5 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 23
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




Bob/Kevin warning spam: drivers/gpu/drm/drm_atomic_helper.c:1343 drm_atomic_helper_prepare_planes+0x124/0x478

Project Member Reported by jwer...@chromium.org, Aug 13

Issue description

The Bobs in the lab are all doing this 10+ times a second:

2018-08-13T21:39:37.922544+00:00 EMERG kernel: [ 1720.179706] Call trace:
2018-08-13T21:39:37.922550+00:00 WARNING kernel: [ 1720.179719] [<ffffffc00081c290>] drm_atomic_helper_prepare_planes+0x124/0x478
2018-08-13T21:39:37.922556+00:00 WARNING kernel: [ 1720.179737] [<ffffffc0005fedac>] rockchip_drm_atomic_commit+0x38/0x18c
2018-08-13T21:39:37.922561+00:00 WARNING kernel: [ 1720.179753] [<ffffffc0005eeb6c>] drm_atomic_nonblocking_commit+0x58/0x64
2018-08-13T21:39:37.922567+00:00 WARNING kernel: [ 1720.179767] [<ffffffc0005efae4>] drm_mode_atomic_ioctl+0x90c/0xb24
2018-08-13T21:39:37.922573+00:00 WARNING kernel: [ 1720.179780] [<ffffffc00081d5c0>] drm_ioctl+0x1f0/0x418
2018-08-13T21:39:37.922579+00:00 WARNING kernel: [ 1720.179792] [<ffffffc0005f6504>] drm_compat_ioctl+0x3c/0x9c
2018-08-13T21:39:37.922584+00:00 WARNING kernel: [ 1720.179809] [<ffffffc0003d4d84>] compat_SyS_ioctl+0x464/0x2410
2018-08-13T21:39:37.922590+00:00 WARNING kernel: [ 1720.179823] [<ffffffc000203e60>] __sys_trace_return+0x0/0x4
2018-08-13T21:39:37.965664+00:00 WARNING kernel: [ 1720.221777] WARNING: CPU: 5 PID: 17970 at ../../../../../tmp/portage/sys-kernel/chromeos-kernel-4_4-4.4.147-r1634/work/chromeos-kernel-4_4-4.4.147/drivers/gpu/drm/drm_atomic_helper.c:1343 drm_atomic_helper_prepare_planes+0x124/0x478
2018-08-13T21:39:37.965724+00:00 WARNING kernel: [ 1720.221791] Modules linked in: veth esp6 ah6 xfrm6_mode_tunnel xfrm6_mode_transport xfrm4_mode_tunnel xfrm4_mode_transport ip6t_REJECT nf_reject_ipv6 ip6t_ipv6header nls_iso8859_1 nls_cp437 vfat fat rfcomm cmac btusb btrtl btbcm btintel bluetooth uinput uvcvideo videobuf2_vmalloc mwifiex_pcie mwifiex zram bridge stp llc ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_mark fuse snd_seq_dummy snd_seq snd_seq_device cfg80211 ip6table_filter cdc_ether usbnet r8152 mii joydev
2018-08-13T21:39:37.965733+00:00 WARNING kernel: [ 1720.221915] 
2018-08-13T21:39:37.965770+00:00 WARNING kernel: [ 1720.221923] CPU: 5 PID: 17970 Comm: DrmThread Tainted: G        W       4.4.147-14692-g7bf2f3d23780 #1
2018-08-13T21:39:37.965778+00:00 WARNING kernel: [ 1720.221927] Hardware name: Google Bob (DT)
2018-08-13T21:39:37.965784+00:00 WARNING kernel: [ 1720.221932] task: ffffffc09c4b3400 task.stack: ffffffc0e7b8c000
2018-08-13T21:39:37.965791+00:00 WARNING kernel: [ 1720.221936] PC is at drm_atomic_helper_prepare_planes+0x124/0x478
2018-08-13T21:39:37.965797+00:00 WARNING kernel: [ 1720.221941] LR is at drm_atomic_helper_prepare_planes+0xa4/0x478
2018-08-13T21:39:37.965803+00:00 WARNING kernel: [ 1720.221945] pc : [<ffffffc00081c290>] lr : [<ffffffc00081c210>] pstate: 80000145
2018-08-13T21:39:37.965810+00:00 WARNING kernel: [ 1720.221948] sp : ffffffc0e7b8fa60
2018-08-13T21:39:37.965816+00:00 WARNING kernel: [ 1720.221952] x29: ffffffc0e7b8fae0 x28: 0000000000000000 
2018-08-13T21:39:37.965821+00:00 WARNING kernel: [ 1720.221959] x27: 0000000000000018 x26: 0000000000000006 
2018-08-13T21:39:37.965827+00:00 WARNING kernel: [ 1720.221965] x25: ffffffc0ed7f1028 x24: ffffffc000bb1b2c 
2018-08-13T21:39:37.965833+00:00 WARNING kernel: [ 1720.221973] x23: ffffffc0ed6ad000 x22: 0000000000000002 
2018-08-13T21:39:37.965839+00:00 WARNING kernel: [ 1720.221980] x21: 0000000000000040 x20: ffffffc061ac7400 
2018-08-13T21:39:37.965845+00:00 WARNING kernel: [ 1720.221987] x19: ffffffc0cacb3980 x18: 0000000000000000 
2018-08-13T21:39:37.965850+00:00 WARNING kernel: [ 1720.221994] x17: 0000000000000000 x16: ffffffc0003d4920 
2018-08-13T21:39:37.965856+00:00 WARNING kernel: [ 1720.222000] x15: 0000000000000320 x14: 0000032000000500 
2018-08-13T21:39:37.965862+00:00 WARNING kernel: [ 1720.222007] x13: 00000000000003e8 x12: 0000000000000001 
2018-08-13T21:39:37.965868+00:00 WARNING kernel: [ 1720.222013] x11: ffffffc0cacb3680 x10: ffffffc096067980 
2018-08-13T21:39:37.965874+00:00 WARNING kernel: [ 1720.222020] x9 : ffffffc0eef6c6d8 x8 : ffffffc061ac7700 
2018-08-13T21:39:37.965879+00:00 WARNING kernel: [ 1720.222027] x7 : 0000000000000000 x6 : 000000000000003f 
2018-08-13T21:39:37.965885+00:00 WARNING kernel: [ 1720.222033] x5 : 0000000000000040 x4 : 0000000000000000 
2018-08-13T21:39:37.965891+00:00 WARNING kernel: [ 1720.222040] x3 : 0000000000000004 x2 : 0000000000000000 
2018-08-13T21:39:37.965897+00:00 WARNING kernel: [ 1720.222047] x1 : 0000000000000000 x0 : ffffffc061ac7400 

It's just a warning but it bloats the logs too much to be able to properly debug anything else. It seems to be somewhat recent. Example test that shows the full output (in "messages"): https://stainless.corp.google.com/browse/chromeos-autotest-results/226981927-chromeos-test/

I've also checked one Kevin that showed the same.
 
Components: OS>Kernel>Graphics
edit: This also seems to lead to disk space exhaustion (e.g. see the end of client.0.DEBUG here: https://stainless.corp.google.com/browse/chromeos-autotest-results/227041909-chromeos-test/chromeos6-row4-rack13-host7/ ), so I'd say this is pretty urgent. We have seen several weird failures on bob-paladins which may be caused by this. Bob paladins have temporarily been set to experimental due to this. I think we probably aren't seeing the same test failures on Kevin because Kevin has more disk space (right?).
Cc: dbehr@chromium.org marc...@chromium.org hoegsberg@chromium.org
Owner: marc...@chromium.org
Status: djkurtzchromium.org (was: Untriaged)
Looks like marcheu is on vacation, so we need another owner. This is breaking more random tests due to full disk (e.g. https://stainless.corp.google.com/browse/chromeos-autotest-results/228208606-chromeos-test/) and needs an owner urgently!

Dan, can you look at this or find someone who can?
Owner: djkurtz@chromium.org
Status: Assigned (was: djkurtzchromium.org)
...apparently I can set the Status of a bug to "djkurtz@chromium.org" and Monorail accepts it without second thought... o_O

Comment 7 Deleted

A bisect of the logs for generic_RebootTest seems to suggest these backtraces started occurring in /var/log/messages on bob between R70-10961.0.0 and R70-10962.0.0:

https://stainless.corp.google.com/search?view=list&first_date=2018-08-11&last_date=2018-08-17&suite=%5Ebvt%5C-cq%24&test=generic_RebootTest&board=%5Ebob%24

However, I don't see any relevant patches (eg kernel-4.4) between these two versions:
https://crosland.corp.google.com/log/10961.0.0..10962.0.0


As expected, the kernel for the two versions is the same, and it was built with the same clang:

R70-10962.0.0:
2018-08-12T18:49:39.634646+00:00 NOTICE kernel: [    0.000000] Linux version 4.4.147-14692-g7bf2f3d23780 (chrome-bot@swarm-cros-594) (Chromium OS 7.0_pre333878_p20180808-r1 clang version 7.0.0 (/var/cache/chromeos-cache/distfiles/host/egit-src/clang.git 38ad3c9160e5814ec8cad29a990cf76730c5f20e) (/var/cache/chromeos-cache/distfiles/host/egit-src/llvm.git 40c66c3d40377cf85640b3a35e6ec5c5b1cbc41f) (based on LLVM 7.0.0svn)) #1 SMP PREEMPT Sun Aug 12 04:06:09 PDT 2018

R70-10961.0.0:
2018-08-12T08:56:40.595940+00:00 NOTICE kernel: [    0.000000] Linux version 4.4.147-14692-g7bf2f3d23780 (chrome-bot@swarm-cros-363) (Chromium OS 7.0_pre333878_p20180808-r1 clang version 7.0.0 (/var/cache/chromeos-cache/distfiles/host/egit-src/clang.git 38ad3c9160e5814ec8cad29a990cf76730c5f20e) (/var/cache/chromeos-cache/distfiles/host/egit-src/llvm.git 40c66c3d40377cf85640b3a35e6ec5c5b1cbc41f) (based on LLVM 7.0.0svn)) #1 SMP PREEMPT Sat Aug 11 20:14:28 PDT 2018

The Android (N) container version is also exactly the same: 	4947328

However, there is a different Chrome: 
R70-10962.0.0: 70.0.3519.3
R70-10961.0.0: 70.0.3511.0

https://chromium.googlesource.com/chromium/src/+log/70.0.3511.0..70.0.3519.3

Cc: a...@chromium.org dcheng@chromium.org piman@chromium.org
Owner: dcasta...@chromium.org
The WARN_ON is coming from:  drm_atomic_add_implicit_fences()


	for_each_plane_in_state(state, plane, plane_state, i) {
		WARN_ON(plane_state->fence);
		/* If fb is not changing or new fb is NULL. */
		if (plane->state->fb == plane_state->fb || !plane_state->fb)
			continue;

		if (!plane_state->fb->funcs->get_reservations)
			continue;

		plane_state->fb->funcs->get_reservations(plane_state->fb, resvs, &num_resvs);
	}


which was added by CHROMIUM patch:

commit 65d84c5d47f74cda2ab9e56c1bde45ebae0f8cc4
Author: Dominik Behr <dbehr@chromium.org>
Date:   Mon Jul 11 12:35:16 2016 -0700
    CHROMIUM: drm: add implicit in-fences to atomic V2


These WARNINGs do not occur with frecon, the splash screen, or cursor movement, so they must be triggered by something in Chrome that uses the DRM atomic API.


My guess is this was started by:
https://chromium.googlesource.com/chromium/src/+/542cab3c0d82a89a71c4f3b22e7ebe4762e7f8cb
Labels: OS-Chrome
I wonder if this isn't as simple as
https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/1181221

The code was written before explicit fencing existed, so the assertion held at the time. Now that we have explicit fencing, skipping implicit fencing for planes that carry an explicit fence makes more sense than warning.
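
To make the idea concrete, here is a minimal standalone sketch of the per-plane decision, assuming the check is "explicit fence present means skip the implicit path". The types `mock_fb` and `mock_plane_state` and the helper `needs_implicit_fence()` are hypothetical stand-ins written for illustration; they are not the kernel's DRM structures and this is not the contents of the CL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the real DRM structs. */
struct mock_fb {
	int id;
	bool has_reservations;	/* models fb->funcs->get_reservations != NULL */
};

struct mock_plane_state {
	const struct mock_fb *fb;
	const void *fence;	/* non-NULL models an explicit fence from user space */
};

/*
 * Returns true when the implicit-fence pass would need to collect
 * reservations for this plane. A plane that already carries an explicit
 * fence is skipped quietly instead of triggering a warning.
 */
static bool needs_implicit_fence(const struct mock_plane_state *old_state,
				 const struct mock_plane_state *new_state)
{
	assert(old_state && new_state);

	/* User space synchronizes explicitly: nothing to attach. */
	if (new_state->fence)
		return false;

	/* fb is not changing, or the new fb is NULL. */
	if (old_state->fb == new_state->fb || !new_state->fb)
		return false;

	/* This framebuffer cannot report reservations. */
	if (!new_state->fb->has_reservations)
		return false;

	return true;
}
```

With this shape, a commit whose plane state carries an explicit fence goes through the loop silently, while a plain framebuffer flip without one still takes the implicit path.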
Cc: dcasta...@chromium.org
Owner: dbehr@chromium.org
Dominik, would you have some time to check if my CL makes any sense?
Cc: akhouderchah@chromium.org bleung@chromium.org
 Issue 875969  has been merged into this issue.
 Issue 875985  has been merged into this issue.
Project Member

Comment 17 by bugdroid1@chromium.org, Aug 21

Labels: merge-merged-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/50eefca8fb70d90d2d068e7e52da06c5d0522d68

commit 50eefca8fb70d90d2d068e7e52da06c5d0522d68
Author: Tomasz Figa <tfiga@chromium.org>
Date: Tue Aug 21 17:33:57 2018

CHROMIUM: drm/atomic_helper: Skip implicit sync if explicit fence is given

If the plane state being committed includes an explicit fence, the user
space controls the synchronization explicitly and there is no need to
attach implicit fences. Make drm_atomic_add_implicit_fences() skip such
planes.

BUG= chromium:873822 
TEST=Boot kevin to UI and see no warnings from
     drm_atomic_add_implicit_fences()

Change-Id: If7ea2d5a4a1af45de1ae35baa75e6843086f978c
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1181221
Reviewed-by: Dominik Behr <dbehr@chromium.org>

[modify] https://crrev.com/50eefca8fb70d90d2d068e7e52da06c5d0522d68/drivers/gpu/drm/drm_atomic_helper.c

Owner: tfiga@chromium.org
Status: Fixed (was: Assigned)
This should be gone now. Thanks dbehr@ for review.
Cc: derat@chromium.org mnissler@chromium.org ejcaruso@chromium.org tfiga@google.com hidehiko@chromium.org
 Issue 875710  has been merged into this issue.
