CrOS: loopback_latency hangs |
||||||||
Issue description
Device: eve
Version: Local build synced on Oct 20.
Steps to reproduce:
- loopback_latency -n 4000
Observed:
loopback_latency never finished.
Other information:
loopback_latency calls snd_pcm_delay, which holds a snd_pcm lock. snd_pcm_delay calls into cras snd_pcm_sw_params_current, which tries to hold the same snd_pcm lock.
int snd_pcm_delay(snd_pcm_t *pcm, snd_pcm_sframes_t *delayp)
{
int err;
assert(pcm);
if (CHECK_SANITY(! pcm->setup)) {
SNDMSG("PCM not set up");
return -EIO;
}
snd_pcm_lock(pcm);
err = __snd_pcm_delay(pcm, delayp);
snd_pcm_unlock(pcm);
return err;
}
int snd_pcm_sw_params_current(snd_pcm_t *pcm, snd_pcm_sw_params_t *params)
{
assert(pcm && params);
if (CHECK_SANITY(! pcm->setup)) {
SNDMSG("PCM not set up");
return -EIO;
}
__snd_pcm_lock(pcm); /* forced lock due to pcm field changes */
params->proto = SNDRV_PCM_VERSION;
params->tstamp_mode = pcm->tstamp_mode;
params->tstamp_type = pcm->tstamp_type;
params->period_step = pcm->period_step;
params->sleep_min = 0;
params->avail_min = pcm->avail_min;
sw_set_period_event(params, pcm->period_event);
params->xfer_align = 1;
params->start_threshold = pcm->start_threshold;
params->stop_threshold = pcm->stop_threshold;
params->silence_threshold = pcm->silence_threshold;
params->silence_size = pcm->silence_size;
params->boundary = pcm->boundary;
__snd_pcm_unlock(pcm);
return 0;
}
,
Oct 31 2017
Here's gdb stack when loopback_latency hangs.
(gdb) thread apply all bt
Thread 5 (Thread 0x7ffff60fb700 (LWP 8843)):
#0 0x00007ffff75f71ad in poll () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007ffff7f21568 in poll(pollfd*, unsigned long pass_object_size1, int) (__timeout=-1, __fds=<optimized out>, __nfds=<optimized out>)
at ../../../../../../../../usr/include/bits/poll2.h:48
#2 read_with_wake_fd (len=12, wake_fd=<optimized out>, read_fd=<optimized out>, buf=<optimized out>) at libcras/cras_client.c:991
#3 audio_thread (arg=0x7ffff0001250) at libcras/cras_client.c:1246
#4 0x00007ffff78bc2b8 in start_thread (arg=0x7ffff60fb700) at pthread_create.c:333
#5 0x00007ffff7600fad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 4 (Thread 0x7ffff58fa700 (LWP 8844)):
#0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x000055555555659b in alsa_capture (arg=0x55555556d900) at /mnt/host/source/src/platform/audiotest/src/loopback_latency.c:476
#2 0x00007ffff78bc2b8 in start_thread (arg=0x7ffff58fa700) at pthread_create.c:333
#3 0x00007ffff7600fad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 3 (Thread 0x7ffff68fc700 (LWP 8842)):
#0 0x00007ffff75f71ad in poll () at ../sysdeps/unix/syscall-template.S:84
#1 0x00007ffff7f1db14 in poll(pollfd*, unsigned long pass_object_size1, int) (__timeout=-1, __fds=<optimized out>, __nfds=<optimized out>)
at ../../../../../../../../usr/include/bits/poll2.h:48
#2 client_thread (arg=0x55555555b830) at libcras/cras_client.c:1998
#3 0x00007ffff78bc2b8 in start_thread (arg=0x7ffff68fc700) at pthread_create.c:333
#4 0x00007ffff7600fad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 2 (Thread 0x7ffff70fd700 (LWP 8841)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007ffff78beac5 in __GI___pthread_mutex_lock (mutex=0x55555555c000) at ../nptl/pthread_mutex_lock.c:80
#2 0x00007ffff7f773b2 in __snd_pcm_lock (pcm=0x55555555be30) at ../../../alsa-lib-1.1.3/src/pcm/pcm_local.h:1109
#3 snd_pcm_sw_params_current (pcm=0x55555555be30, params=0x7ffff70fcd90) at ../../../alsa-lib-1.1.3/src/pcm/pcm.c:5948
#4 0x00007ffff7ff3f78 in get_boundary (pcm=0x55555555c000, boundary=0x7ffff70fce78) at alsa_plugin/pcm_cras.c:239
#5 0x00007ffff7ff3b41 in snd_pcm_cras_delay (io=0x55555555b370, delayp=0x55555555a0e8 <playback_delay_frames>) at alsa_plugin/pcm_cras.c:312
#6 0x00007ffff7f7285f in __snd_pcm_delay (pcm=0x55555555be30, delayp=0x55555555a0e8 <playback_delay_frames>) at ../../../alsa-lib-1.1.3/src/pcm/pcm_local.h:448
#7 snd_pcm_delay (pcm=0x55555555be30, delayp=0x55555555a0e8 <playback_delay_frames>) at ../../../alsa-lib-1.1.3/src/pcm/pcm.c:1068
#8 0x00005555555563db in alsa_play (arg=0x55555555be30) at /mnt/host/source/src/platform/audiotest/src/loopback_latency.c:429
#9 0x00007ffff78bc2b8 in start_thread (arg=0x7ffff70fd700) at pthread_create.c:333
#10 0x00007ffff7600fad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 1 (Thread 0x7ffff7f10740 (LWP 8837)):
#0 0x00007ffff78bd558 in pthread_join (threadid=140737313221072, thread_return=0x0) at pthread_join.c:90
#1 0x0000555555556042 in alsa_test_latency (play_dev=0x555555557af0 "default", cap_dev=0x555555557af0 "default") at /mnt/host/source/src/platform/audiotest/src/loopback_latency.c:633
#2 0x00005555555569b5 in main (argc=3, argv=0x7fffffffeb18) at /mnt/host/source/src/platform/audiotest/src/loopback_latency.c:680
,
Oct 31 2017
Here's the CL that moved snd_pcm_sw_params_current from .start to .delay. https://chromium.git.corp.google.com/chromiumos/third_party/adhd/+/1c805613fa6539ce8e284687f1e8421d56309c37%5E%21/#F0
,
Oct 31 2017
The change description of patch in #4 says "Delay can be called before start don't cache the value of boundary in start, get it from delay where it is needed." But alsa-lib comments say "The function #snd_pcm_delay() returns the delay in samples. For playback, it means count of samples in the ring buffer before the next sample will be sent to DAC. For capture, it means count of samples in the ring buffer before the next sample will be captured from ADC. It works only when the stream is in the running or draining (playback only) state."
,
Nov 3 2017
Here's an attempt of the fix. It's abandoned. https://chromium-review.googlesource.com/c/chromiumos/overlays/portage-stable/+/746383. I copied the comments from the patch. (1) First solution is to not call snd_pcm_sw_params_current in .delay. Move it back to .start because the delay value only makes sense after .start. This is the solution Jimmy and I prefer. (2) Second solution is to set need_lock=0 in snd_pcm_t. That will need to build cras pcm plugin with pcm_local.h. It's not very good to duplicate a local header file and introduce the dependency. (3) loopback_latency has two modes -- talking to cras or talking to alsa (using cras as a plugin). Indeed we can make loopback_latency only talk to cras. But Jimmy wants to keep the alsa mode so we can test different paths in command line. Also loopback_latency talking to cras is also broken at the moment and I'm debugging it. Finally when cras is served as Alsa plugin and snd_pcm_delay is called, it will always deadlock. This has to be fixed no matter what.
,
Nov 3 2017
I moved snd_pcm_sw_params_current from .delay to .start in pcm_cras.c. It got deadlock in another place. (gdb) thread apply all bt Thread 3 (Thread 0x7fffee8fc700 (LWP 20332)): #0 pthread_cond_wait () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00005555555565eb in alsa_capture (arg=0x55555556dae0) at /mnt/host/source/src/platform/audiotest/src/loopback_latency.c:480 #2 0x00007ffff78bc2b8 in start_thread (arg=0x7fffee8fc700) at pthread_create.c:333 #3 0x00007ffff7600fad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 Thread 2 (Thread 0x7ffff70fd700 (LWP 20331)): #0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007ffff78beac5 in __GI___pthread_mutex_lock (mutex=0x55555555bef0) at ../nptl/pthread_mutex_lock.c:80 #2 0x00007ffff7f773b2 in __snd_pcm_lock (pcm=0x55555555bd20) at ../../../alsa-lib-1.1.3/src/pcm/pcm_local.h:1109 #3 snd_pcm_sw_params_current (pcm=0x55555555bd20, params=0x7ffff70fccb0) at ../../../alsa-lib-1.1.3/src/pcm/pcm.c:5948 #4 0x00007ffff7ff3d38 in get_boundary (pcm=0x55555555bef0, boundary=0x55555555b448) at alsa_plugin/pcm_cras.c:241 #5 0x00007ffff7ff38f5 in snd_pcm_cras_start (io=0x55555555b370) at alsa_plugin/pcm_cras.c:265 #6 0x00007ffff7fb5e05 in snd_pcm_ioplug_start (pcm=0x55555555bd20) at ../../../alsa-lib-1.1.3/src/pcm/pcm_ioplug.c:458 #7 0x00007ffff7f78475 in __snd_pcm_start (pcm=0x55555555bd20) at ../../../alsa-lib-1.1.3/src/pcm/pcm_local.h:433 #8 snd1_pcm_write_areas (pcm=0x55555555bd20, areas=0x7ffff70fce40, offset=0, size=240, func=0x7ffff7f83360 <snd_pcm_mmap_write_areas>) at ../../../alsa-lib-1.1.3/src/pcm/pcm.c:7303 #9 0x00007ffff7f83333 in snd_pcm_mmap_writei (pcm=0x55555555bd20, buffer=<optimized out>, size=140737346556284) at ../../../alsa-lib-1.1.3/src/pcm/pcm_mmap.c:153 #10 0x00007ffff7fb62fb in snd_pcm_ioplug_writei (pcm=0x55555555bd20, buffer=0xfffffffffffffe00, size=240) at ../../../alsa-lib-1.1.3/src/pcm/pcm_ioplug.c:575 #11 0x00005555555563bb in alsa_play (arg=0x55555555bd20) at /mnt/host/source/src/platform/audiotest/src/loopback_latency.c:422 #12 0x00007ffff78bc2b8 in start_thread (arg=0x7ffff70fd700) at pthread_create.c:333 #13 0x00007ffff7600fad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 Thread 1 (Thread 0x7ffff7f10740 (LWP 20327)): #0 0x00007ffff78bd558 in pthread_join (threadid=140737195788752, thread_return=0x0) at pthread_join.c:90 #1 0x0000555555556064 in alsa_test_latency (play_dev=0x555555557b70 "default", cap_dev=0x555555557b70 "default") at /mnt/host/source/src/platform/audiotest/src/loopback_latency.c:638 #2 0x0000555555556a35 in main (argc=3, argv=0x7fffffffeb18) at /mnt/host/source/src/platform/audiotest/src/loopback_latency.c:685
,
Nov 3 2017
Now I'm trying to use cras mode loopback_latency. I found loopback_latency could run if I increased buffer_frames parameter. I'm trying to understand why 480 didn't work and 500 worked. Also the measure latency and reported latency had too much difference. But at least loopback_latency won't hang. # ./loopback_latency -n 4000 -c -b 500 Got noise Measured Latency: 12575 uS. Reported Latency: 791 uS.
,
Nov 3 2017
For comparison, here's the error when I didn't give -b. # ./loopback_latency -n 4000 -c Stream error -12 Stream error -12 in /var/log/messages 2017-11-03T15:03:41.104024+08:00 ERR cras_server[23588]: rstream: cb_threshold too low 2017-11-03T15:03:41.104107+08:00 ERR cras_server[23588]: rstream: cb_threshold too low
,
Nov 3 2017
I printed cb_threshold and rate.
2017-11-03T16:24:51.090492+08:00 ERR cras_server[3634]: rstream: cb_threshold too low. cb_threshold=48, rate=48000
2017-11-03T16:24:51.090586+08:00 ERR cras_server[3634]: rstream: cb_threshold too low. cb_threshold=48, rate=48000
if (!buffer_meets_size_limit(cb_threshold, format->frame_rate)) {
syslog(LOG_ERR, "rstream: cb_threshold too low. cb_threshold=%d, rate=%d\n", (int)cb_threshold, (int)format->frame_rate);
return -EINVAL;
}
int buffer_meets_size_limit(size_t buffer_size, size_t rate)
{
return buffer_size > (CRAS_MIN_BUFFER_TIME_IN_US * rate) / 1000000;
}
,
Nov 3 2017
In loopback_latency.c, buffer_frames is 480 by default and cb_threshold is 480/10=48. static snd_pcm_uframes_t buffer_frames = 480; size_t cb_threshold = buffer_frames / 10;
,
Nov 3 2017
I modified loopback_latency.c as Chinyue suggested. It worked.
@@ -364,8 +366,8 @@ static int cras_add_stream(struct cras_client *client,
struct cras_audio_format *aud_format;
cras_playback_cb_t aud_cb;
cras_error_cb_t error_cb;
- size_t cb_threshold = buffer_frames / 10;
- size_t min_cb_level = buffer_frames / 10;
+ size_t cb_threshold = buffer_frames;
+ size_t min_cb_level = buffer_frames;
# ./loopback_latency -n 4000 -c
Got noise
Measured Latency: 29850 uS.
Reported Latency: 9732 uS.
,
Nov 3 2017
I'll upload a patch.
,
Nov 3 2017
,
Nov 3 2017
,
Nov 3 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/audiotest/+/ee6aafc127d32252bd90435977f340f400cca12c commit ee6aafc127d32252bd90435977f340f400cca12c Author: Wu-Cheng Li <wuchengli@google.com> Date: Fri Nov 03 15:56:22 2017 loopback_latency: do not divide buffer_frames by 10. The default buffer_frames is 480. 480/10=48. Setting cb_threshold to 48 is too small and Cras will reject it. The division by 10 was added four years ago. It was an arbitrary value to get the buffer frames small. Actually using 480 and not dividing it by 10 is more reasonable. BUG= chromium:779936 TEST=Run loopback_latency -c on eve and it does not show error. Change-Id: I3d03fccc23e3695429748e2667b7f4683f234597 Reviewed-on: https://chromium-review.googlesource.com/753261 Commit-Ready: Wu-Cheng Li <wuchengli@chromium.org> Tested-by: Wu-Cheng Li <wuchengli@chromium.org> Reviewed-by: Chinyue Chen <chinyue@chromium.org> Reviewed-by: Wu-Cheng Li <wuchengli@chromium.org> [modify] https://crrev.com/ee6aafc127d32252bd90435977f340f400cca12c/src/loopback_latency.c
,
Nov 4 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/ee593a9805788939eff6c24bbb51185e57bfce78 commit ee593a9805788939eff6c24bbb51185e57bfce78 Author: Wu-Cheng Li <wuchengli@google.com> Date: Sat Nov 04 04:56:53 2017 audio_helper: pass -c to loopback_latency. loopback_latency has two modes - using Cras or Alsa. Alsa mode uses Cras plugin by default and it is currently broken. Alsa mode is still working if we do not use Cras plugin and pass the devices like hw:0,0 to the parameters. Here we do not care which mode to use as long as we can get the loopback latency. So use Cras mode in audio_helper and leave Alsa mode for command line usage. BUG= chromium:779936 TEST=Run audio_CrasLoopback on eve. CQ-DEPEND=CL:753261 Change-Id: Icf8497d6040af0a12e08c210e5066a0a5945e038 Reviewed-on: https://chromium-review.googlesource.com/753541 Commit-Ready: Wu-Cheng Li <wuchengli@chromium.org> Tested-by: Wu-Cheng Li <wuchengli@chromium.org> Reviewed-by: Cheng-Yi Chiang <cychiang@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/ee593a9805788939eff6c24bbb51185e57bfce78/client/cros/audio/audio_helper.py
,
Nov 6 2017
From the test dashboard, loopback_latency no longer hanged.
,
Nov 6 2017
I'll remove snd_pcm_cras_delay because it will cause deadback. Lower the priority. I'll come back to this when I have time.
,
Nov 8 2017
,
Nov 9 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/audiotest/+/2587466256247b091d032e56b7ccff72fe1b9b2e commit 2587466256247b091d032e56b7ccff72fe1b9b2e Author: Wu-Cheng Li <wuchengli@google.com> Date: Thu Nov 09 15:46:36 2017 loopback_latency: do not use cras plugin. Cras plugin does not support snd_pcm_delay anymore. The loopback latency will not be accurate. Remove cras plugin support in loopback_latency. BUG= chromium:779936 TEST=Run loopback_latency. Change-Id: Id4d2abfa872b1b6c5eb2ce2048600cd1ef3fdc2a Reviewed-on: https://chromium-review.googlesource.com/760124 Commit-Ready: Wu-Cheng Li <wuchengli@chromium.org> Tested-by: Wu-Cheng Li <wuchengli@chromium.org> Reviewed-by: Wu-Cheng Li <wuchengli@chromium.org> [modify] https://crrev.com/2587466256247b091d032e56b7ccff72fe1b9b2e/src/loopback_latency.c
,
Nov 9 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/adhd/+/6991c5aac3411ec5b4b12e3f48c05d1b06fd8cf1 commit 6991c5aac3411ec5b4b12e3f48c05d1b06fd8cf1 Author: Wu-Cheng Li <wuchengli@google.com> Date: Thu Nov 09 15:46:33 2017 CRAS: alsa-plugin - remove snd_pcm_cras_delay PCM functions became thread-safe in 1.1.2. Some functions like snd_pcm_delay and snd_pcm_sw_params_current will hold a lock. snd_pcm_cras_delay is called by snd_pcm_delay, but snd_pcm_cras_delay calls snd_pcm_sw_params_current. This causes a deadlock. There are several options that are not good: (1) Define --disable-thread-safety in alsa-lib. This will be inconsistent with upstream alsa-lib. (2) Set snd_pcm_t.need_lock=0. need_lock is in pcm_local.h and it's not easy to build with it. Also, need_lock means the plugin is thread safe and doesn't need locking in alsa-lib. Our plugin is not thread safe. (3) Not to call snd_pcm_sw_params_current in snd_pcm_cras_delay. Calling snd_pcm_sw_params_current in .start introduces a similar deadlock. Also it's more accurate to get the latest boundary in .delay. I don't see a good solution. Let's remove snd_pcm_cras_delay. BUG= chromium:779936 TEST=Run loopback_latency -n 4000 and it does not hang. Change-Id: I69b143aee2e95f8ef8f1a2c0f1604064255d0017 Reviewed-on: https://chromium-review.googlesource.com/758796 Commit-Ready: Wu-Cheng Li <wuchengli@chromium.org> Tested-by: Wu-Cheng Li <wuchengli@chromium.org> Reviewed-by: Cheng-Yi Chiang <cychiang@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/6991c5aac3411ec5b4b12e3f48c05d1b06fd8cf1/cras/src/alsa_plugin/pcm_cras.c
,
Nov 9 2017
,
Dec 14 2017
,
Jan 22 2018
,
Jan 23 2018
|
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by wuchengli@chromium.org
, Oct 31 2017