
Issue 760802

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android , Chrome
Pri: 2
Type: Bug




killing surfaceflinger results in BUG_ON because mali signals already signaled fence

Project Member Reported by dbehr@chromium.org, Aug 31 2017

Issue description

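The error code can be set only once, before the fence is signaled. Killing surfaceflinger makes the Mali driver try to set it on a fence that has already been signaled, which trips a BUG_ON.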
 
Can you post the actual BUG_ON for those of us that it's not obvious for?  :-)
Ah, I figured it out.  It must be this one in dma_fence_set_error():

	BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags));
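
To make the failure mode concrete, here is a minimal userspace sketch of that rule. It is a model only, not the kernel or Mali code: `struct model_fence`, `model_set_error`, `model_signal`, and `model_finish_unguarded` are hypothetical names, and the "set error, then signal" ordering of the unguarded path is an assumption based on the commit description. Where the kernel would hit the BUG_ON, the model just increments a counter so the sequence can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct dma_fence; not the kernel type. */
struct model_fence {
    bool signaled;  /* plays the role of DMA_FENCE_FLAG_SIGNALED_BIT */
    int error;      /* error recorded on the fence, 0 if none */
    int bug_hits;   /* counts the points where the kernel would BUG_ON */
};

/* Models the rule: an error may only be recorded before signaling. */
static void model_set_error(struct model_fence *f, int error)
{
    assert(f != NULL);
    if (f->signaled) {
        f->bug_hits++;  /* the kernel asserts here instead of returning */
        return;
    }
    f->error = error;
}

static void model_signal(struct model_fence *f)
{
    assert(f != NULL);
    f->signaled = true;
}

/* Assumed shape of an unguarded completion path: set error, then signal. */
void model_finish_unguarded(struct model_fence *f, int error)
{
    model_set_error(f, error);
    model_signal(f);
}
```

Calling `model_finish_unguarded()` twice on the same fence records the error and signals on the first call, then counts one violation on the second. That second call corresponds to a cancel path reaching a fence that has already completed.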

Cc: dbehr@chromium.org
Issue 762580 has been merged into this issue.
Labels: M-61 M-62
For the record, the crash was:

[  136.440970] ------------[ cut here ]------------
[  136.445627] kernel BUG at /mnt/host/source/src/third_party/kernel/v4.4/include/linux/dma-fence.h:421!
[  136.454875] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  136.460379] Modules linked in: ip6t_REJECT nf_reject_ipv6 xt_TCPMSS ip6table_mangle veth cmac rfcomm btusb btrtl btbcm btintel uinput ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat uvcvideo videobuf2_vmalloc mwifiex_pcie mwifiex zram bluetooth xt_mark fuse bridge stp llc cfg80211 ip6table_filter smsc95xx usbnet mii joydev
[  136.491430] CPU: 1 PID: 6896 Comm: Thread-16 Tainted: G        W       4.4.79 #175
[  136.499020] Hardware name: Google Kevin (DT)
[  136.503307] task: ffffffc02d141700 ti: ffffffc02d150000 task.ti: ffffffc02d150000
[  136.510821] PC is at kbase_sync_signal_fence+0x2c/0x58
[  136.515979] LR is at kbase_finish_soft_job+0x80/0xb4
[  136.520960] pc : [<ffffffc00064d1b8>] lr : [<ffffffc00064465c>] pstate: 60000145
[  136.528375] sp : ffffffc02d153950
[  136.531702] x29: ffffffc02d153950 x28: ffffff8006f5d628 
[  136.537057] x27: ffffff8006f5d6b0 x26: ffffff8006f5c630 
[  136.542410] x25: ffffff8006f6d000 x24: 0000000000000000 
[  136.547763] x23: 0000000000000000 x22: ffffff8006f41000 
[  136.553115] x21: ffffffc02d153a48 x20: ffffffc02d153a38 
[  136.558469] x19: ffffffc02c92a580 x18: 00000000000101d0 
[  136.563823] x17: 00000000383b64b2 x16: ffffffc05e459e00 
[  136.569177] x15: ffffffc05e33a5d0 x14: 000001d000000000 
[  136.574530] x13: ffffff8006f5c610 x12: ffffffc02d153a48 
[  136.579884] x11: ffffff8006f5c620 x10: 0000000000000001 
[  136.585237] x9 : 0000000000000004 x8 : ffffffc02c8bf488 
[  136.590589] x7 : bbbbbbbbbbbbbbbb x6 : 5a5a5a5a5a5a5a5a 
[  136.595942] x5 : 0000000000000000 x4 : 0000000000000202 
[  136.601295] x3 : ffffffc001086c48 x2 : 0000000000000000 
[  136.606647] x1 : 00000000fffffff2 x0 : 0000000000000003 
...
...
[  137.911502] Call trace:
[  137.913972] [<ffffffc00064d1b8>] kbase_sync_signal_fence+0x2c/0x58
[  137.920172] [<ffffffc00064465c>] kbase_finish_soft_job+0x80/0xb4
[  137.926199] [<ffffffc000639b20>] jd_done_nolock+0x444/0x510
[  137.931790] [<ffffffc0006447ac>] kbase_cancel_soft_job+0x74/0xa8
[  137.937817] [<ffffffc00063b240>] kbase_jd_zap_context+0x68/0xfc
[  137.943755] [<ffffffc000640790>] kbase_destroy_context+0x48/0x19c
[  137.949868] [<ffffffc00064aa5c>] kbase_release+0x13c/0x174
[  137.955372] [<ffffffc0003953e0>] __fput+0x104/0x1cc
[  137.960266] [<ffffffc00039551c>] ____fput+0x20/0x2c
[  137.965160] [<ffffffc0002445f0>] task_work_run+0xa4/0xd4
[  137.970491] [<ffffffc000224928>] do_exit+0x58c/0x9f4
[  137.975473] [<ffffffc0002262f4>] do_group_exit+0x50/0xb0
[  137.980803] [<ffffffc000233d44>] get_signal+0x848/0x884
[  137.986047] [<ffffffc000207f00>] do_signal+0xb0/0x518
[  137.991118] [<ffffffc00020855c>] do_notify_resume+0x28/0x6c
[  137.996708] [<ffffffc000203d28>] work_pending+0x1c/0x20
[  138.001950] Code: f94017a1 36f80121 f9402660 36000040 (d4210000) 
[  138.014386] ---[ end trace 6f62071c9dba0573 ]---
Labels: -M-61
<https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/644410> is marked Ready, so it should go in shortly.

Since we're not seeing real issues, I'd say that we target M-62, but if people think this is important enough to request merge to M-61 then shout.  Certainly it looks very safe / easy to test / low risk.
Project Member

Comment 6 by bugdroid1@chromium.org, Sep 6 2017

Labels: merge-merged-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/466855709f5ac0afdbe27a8079437974bd314ec8

commit 466855709f5ac0afdbe27a8079437974bd314ec8
Author: Dominik Behr <dbehr@chromium.org>
Date: Wed Sep 06 19:42:45 2017

CHROMIUM: MALI: fix signaling of already signaled fences

Error code can be only set once, before fence is signaled.
So before setting error code, test if fence is signaled and do
not try to signal fence that has been already signaled.

BUG= chromium:760802 
TEST=killall -9 surfaceflinger, no BUG_ON reboot

Change-Id: Ibc5f61381d51af3fa31903e3df43e715dc03e661
Signed-off-by: Dominik Behr <dbehr@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/644410
Commit-Ready: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>

[modify] https://crrev.com/466855709f5ac0afdbe27a8079437974bd314ec8/drivers/gpu/arm/midgard/mali_kbase_sync.c
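
The guard described in the commit message can be sketched in userspace as follows. This is an illustration under stated assumptions, not the contents of mali_kbase_sync.c: `struct model_fence`, `model_set_error`, and `model_finish_guarded` are hypothetical names, the model is single-threaded, and it ignores whatever locking the real driver needs around the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct dma_fence; not the kernel type. */
struct model_fence {
    bool signaled;  /* plays the role of DMA_FENCE_FLAG_SIGNALED_BIT */
    int error;      /* error recorded on the fence, 0 if none */
    int bug_hits;   /* counts the points where the kernel would BUG_ON */
};

/* Models the rule: an error may only be recorded before signaling. */
static void model_set_error(struct model_fence *f, int error)
{
    assert(f != NULL);
    if (f->signaled) {
        f->bug_hits++;  /* the kernel asserts here instead of returning */
        return;
    }
    f->error = error;
}

/*
 * Guarded completion: test the signaled state first, and neither set the
 * error nor signal again if the fence has already been signaled.
 * Returns true if this call signaled the fence, false if it was a no-op.
 */
bool model_finish_guarded(struct model_fence *f, int error)
{
    assert(f != NULL);
    if (f->signaled)
        return false;
    if (error < 0)
        model_set_error(f, error);
    f->signaled = true;
    return true;
}
```

With the guard in place, a second `model_finish_guarded()` call on the same fence returns false and leaves both the recorded error and the violation counter untouched. That is the behaviour the TEST line checks for when surfaceflinger is killed.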

Labels: -Pri-3 Merge-Request-62 Pri-2
As per above, voting for M-62 on this.  It might fix a few crashes on M-61 too and it's pretty safe, but I don't think people are banging down the doors for this fix.

Adding Merge-Request-62.
Project Member

Comment 8 by sheriffbot@chromium.org, Sep 7 2017

Labels: -Merge-Request-62 Hotlist-Merge-Approved Merge-Approved-62
Your change meets the bar and is auto-approved for M62. Please go ahead and merge the CL to branch 3202 manually. Please contact milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 9 by bugdroid1@chromium.org, Sep 7 2017

Labels: merge-merged-release-R62-9901.B-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/de32fb4df33564df567473b2c9989dcc97ad0648

commit de32fb4df33564df567473b2c9989dcc97ad0648
Author: Dominik Behr <dbehr@chromium.org>
Date: Thu Sep 07 20:11:51 2017

CHROMIUM: MALI: fix signaling of already signaled fences

Error code can be only set once, before fence is signaled.
So before setting error code, test if fence is signaled and do
not try to signal fence that has been already signaled.

BUG= chromium:760802 
TEST=killall -9 surfaceflinger, no BUG_ON reboot

Change-Id: Ibc5f61381d51af3fa31903e3df43e715dc03e661
Signed-off-by: Dominik Behr <dbehr@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/644410
Commit-Ready: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
(cherry picked from commit 466855709f5ac0afdbe27a8079437974bd314ec8)
Reviewed-on: https://chromium-review.googlesource.com/655997

[modify] https://crrev.com/de32fb4df33564df567473b2c9989dcc97ad0648/drivers/gpu/arm/midgard/mali_kbase_sync.c

Labels: -Hotlist-Merge-Approved -Merge-Approved-62 Merge-Merged
Status: Fixed (was: Started)
