snd_hda_parse_hdmi_codec() deadlock |
||
Issue descriptionI am seeing a lot of these in the hung_tasks bucket on crash.corp, all coming from Bay Trail Chromebooks (kip, gnawty, enguarde) running Linux 3.10: <6>[ 9367.837254] cras D ffff8801625a6250 0 1255 1195 0x00000000 <5>[ 9367.837345] ffff8801524cbbe0 0000000000000082 ffff8801524c5330 ffff8801524cbfd8 <5>[ 9367.837420] ffff8801524cbfd8 0000000000012240 ffff8801625a6250 ffff8801796ac320 <5>[ 9367.837449] 00000000ffffffff ffff8801625a6250 ffff8801796ac328 ffff8801796ac340 <5>[ 9367.837479] Call Trace: <5>[ 9367.837501] [<ffffffff87522228>] schedule+0x64/0x66 <5>[ 9367.837524] [<ffffffff8752260b>] schedule_preempt_disabled+0xe/0x10 <5>[ 9367.837549] [<ffffffff875213c5>] __mutex_lock_slowpath+0x14d/0x1c7 <5>[ 9367.837573] [<ffffffff8752073a>] mutex_lock+0x1f/0x2f <5>[ 9367.837594] [<ffffffff8752073a>] ? mutex_lock+0x1f/0x2f <5>[ 9367.837617] [<ffffffffc023f405>] 0xffffffffc023f404 <5>[ 9367.837639] [<ffffffff87413f34>] snd_ctl_elem_info+0x59/0xf0 <5>[ 9367.837661] [<ffffffff87414042>] snd_ctl_elem_info_user+0x77/0xd6 <5>[ 9367.837686] [<ffffffff874152e7>] snd_ctl_ioctl+0x2dd/0x4da <5>[ 9367.837709] [<ffffffff87105463>] do_vfs_ioctl+0x35b/0x41c <5>[ 9367.837731] [<ffffffff870f61d1>] ? fsnotify_access+0x5a/0x61 <5>[ 9367.837778] [<ffffffff87105591>] SyS_ioctl+0x6d/0xa5 <5>[ 9367.837838] [<ffffffff87523d02>] system_call_fastpath+0x16/0x1b <6>[ 9367.837909] kworker/u4:18 D ffff8801797aa200 0 12707 2 0x00000000 <6>[ 9367.837947] Workqueue: hd-audio0 snd_hda_parse_hdmi_codec [snd_hda_codec_hdmi] <5>[ 9367.838001] ffff880123f47ae0 0000000000000046 ffff88017810f170 ffff880123f47fd8 <5>[ 9367.838031] ffff880123f47fd8 0000000000012240 ffff8801797a9e40 ffff8801796ac320 <5>[ 9367.838075] 00000000ffffffff ffff8801797a9e40 ffff8801796ac328 ffff8801796ac340 <5>[ 9367.838136] Call Trace: <5>[ 9367.838161] [<ffffffff87522228>] schedule+0x64/0x66 <5>[ 9367.838199] [<ffffffff8752260b>] schedule_preempt_disabled+0xe/0x10 <5>[ 9367.838223] [<ffffffff875213c5>] __mutex_lock_slowpath+0x14d/0x1c7 <5>[ 9367.838253] [<ffffffff8752073a>] mutex_lock+0x1f/0x2f <5>[ 9367.838275] [<ffffffff8752073a>] ? mutex_lock+0x1f/0x2f <5>[ 9367.838308] [<ffffffffc0240dbb>] snd_hda_parse_hdmi_codec+0x5aa/0x13f4 [snd_hda_codec_hdmi] <5>[ 9367.838343] [<ffffffffc0240ffe>] snd_hda_parse_hdmi_codec+0x7ed/0x13f4 [snd_hda_codec_hdmi] <5>[ 9367.838383] [<ffffffffc01831c5>] snd_hda_jack_set_dirty_all+0x6a/0xff [snd_hda_codec] <5>[ 9367.838417] [<ffffffffc018379c>] snd_hda_jack_poll_all+0x6b/0x249 [snd_hda_codec] <5>[ 9367.838450] [<ffffffffc017d1aa>] snd_hda_unlock_devices+0x5e/0xcc [snd_hda_codec] <5>[ 9367.838484] [<ffffffffc0182761>] snd_hda_build_controls+0x1e5/0x208 [snd_hda_codec] <5>[ 9367.838523] [<ffffffffc017f869>] snd_hda_power_save+0x115/0x3b3 [snd_hda_codec] <5>[ 9367.838556] [<ffffffffc017f996>] snd_hda_power_save+0x242/0x3b3 [snd_hda_codec] <5>[ 9367.838581] [<ffffffff87521b74>] ? do_wait_for_common+0xc7/0x144 <5>[ 9367.838613] [<ffffffffc017fb45>] snd_hda_codec_read+0x3e/0x65 [snd_hda_codec] <5>[ 9367.838641] [<ffffffffc0241e29>] snd_hdmi_get_eld_size+0x1e/0x20 [snd_hda_codec_hdmi] <5>[ 9367.838669] [<ffffffffc0241e54>] snd_hdmi_get_eld+0x29/0x99 [snd_hda_codec_hdmi] <5>[ 9367.838698] [<ffffffffc0240df5>] snd_hda_parse_hdmi_codec+0x5e4/0x13f4 [snd_hda_codec_hdmi] <5>[ 9367.838724] [<ffffffff87522f17>] ? _raw_spin_unlock_irq+0xe/0x11 <5>[ 9367.838751] [<ffffffffc02410d6>] snd_hda_parse_hdmi_codec+0x8c5/0x13f4 [snd_hda_codec_hdmi] <5>[ 9367.838778] [<ffffffff8704b6b9>] process_one_work+0x182/0x2bd <5>[ 9367.838802] [<ffffffff8704c9a5>] worker_thread+0x143/0x202 <5>[ 9367.838825] [<ffffffff8704c862>] ? rescuer_thread+0x2c3/0x2c3 <5>[ 9367.838848] [<ffffffff8705186c>] kthread+0xc0/0xc8 <5>[ 9367.838871] [<ffffffff870517ac>] ? __kthread_parkme+0x6b/0x6b <5>[ 9367.838893] [<ffffffff87523c5c>] ret_from_fork+0x7c/0xb0 <5>[ 9367.838915] [<ffffffff870517ac>] ? __kthread_parkme+0x6b/0x6b This query shows a bunch of them: https://goto.google.com/crefx There are a few hda patches upstream which may or may not be related, e.g. commit 664c715573c2c116c2d8f5de7d59ce85a98a1751 Author: Takashi Iwai <tiwai@suse.de> Date: Wed Apr 8 11:43:14 2015 +0200 ALSA: hda - Work around races of power up/down with runtime PM I don't see any reports of this on non-Google hardware. We have some local CHROMIUM patches in that driver which may be causing its behavior to diverge from standard 3.10 behavior: commit 0cc1b72687c553f9443bd627148f83974164e294 Author: zhuo-hao <zhuo-hao.lee@intel.com> Date: Fri Sep 4 11:13:05 2015 +0800 CHROMIUM: Workaround for race between i915 and HDA resume commit c2ae349cfb79bb73665f88f49748d0b3f3739389 Author: U. Artie Eoff <ullysses.a.eoff@intel.com> Date: Mon Nov 17 11:51:03 2014 -0800 CHROMIUM: ALSA: hda - skip PM ops if async probe is not finished
,
Nov 24 2016
From the attached log, there was an error before system went to S3: <6>[ 3178.943260] call 0000:00:02.0+ returned 0 after 111063 usecs <4>[ 3182.880461] snd_hda_intel 0000:00:1b.0: azx_get_response timeout, switching to polling mode: last cmd=0x201f0500 <4>[ 3183.882273] snd_hda_intel 0000:00:1b.0: No response from codec, disabling MSI: last cmd=0x201f0500 <3>[ 3184.884088] hda_intel: azx_get_response timeout, switching to single_cmd mode: last cmd=0x201f0500 <6>[ 3184.884191] snd_hda_intel 0000:00:1b.0: get_response timeout: IRS=0x1 <6>[ 3184.884287] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x20270503 <6>[ 3184.884383] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x20370503 <6>[ 3184.884478] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x20470503 <6>[ 3184.884573] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x20570503 <6>[ 3184.884668] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x20670503 <6>[ 3184.884763] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x201f0500 <6>[ 3184.884859] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x20170503 <6>[ 3184.884954] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x20270503 <6>[ 3184.885049] snd_hda_intel 0000:00:1b.0: send_cmd timeout: IRS=0x1, val=0x20370503 <6>[ 3184.914115] call 0000:00:1b.0+ returned 0 after 5937332 usecs <6>[ 3184.914145] PM: suspend of devices complete after 6244.388 msecs It shouldn't related to my patch "commit 0cc1b72687c553f9443bd627148f83974164e294" because that patch fixed the race between i915 and HDA on resume path. However, i wonder if the similar fix should be done on suspend path as well. Do you know the reproduce steps to trigger this bug?
,
Nov 24 2016
> From the attached log, there was an error before system went to S3: Good catch, I checked a couple of other reports and see the same azx_get_response timeout in those logs. > Do you know the reproduce steps to trigger this bug? Unfortunately I only know about it from our crash reporting interface. We have a program that classifies incoming kernel crash reports as "hung_tasks" if they contain a message of the form "INFO: task <X> blocked for more than <Y> seconds." A significant percentage of the crashes in the hung_tasks bucket look like the crash in the OP. Maybe it would be possible to reproduce it by running continuous suspend/resume loops on one of the 3.10/BayTrail Chromebooks? I did not try.
,
May 3 2018
|
||
►
Sign in to add a comment |
||
Comment 1 by dgreid@chromium.org
, Nov 24 2016Status: Assigned (was: Untriaged)