New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 751047 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

CRAS: fix wake time scheduling for hotword stream

Project Member Reported by hychao@chromium.org, Aug 1 2017

Issue description

For codec driver who supports continuous hotword stream. It is noticed that CRAS cannot handle the data capture correctly.

One observation is that even the stream has flag BULK_AUDIO_OK, CRAS still schedule data read periodically as normal input stream, and device driver already overruns data.

Below are two changes to test and try to solve the problem. Work is still in progress.

remote: New Changes:
remote:   https://chromium-review.googlesource.com/594761 CRAS: test_client - listen_for_hotword to capture continuous stream
remote:   https://chromium-review.googlesource.com/594762 TEST: adjust wake timing for burst data in hotword stream

 
https://chromium-review.googlesource.com/599791 CHROMIUM: ASoC: rt5514: reset dma_offset at pcm open

I found a bug in rt5514-spi not resetting dma_offset, so the last value causes a incorrect pointer position in next PCM open.

chinyue@ can you please apply this to kernel to see if the hotword stream has stable starting zero length?
Base on my test, the starting silence time is stable after applying the CL in #1.
Attached screen shots for consecutive key phrase tests.

If this also helps on Kevin. I'll upstream the patch.
stable-trigger-phrase.png
106 KB View Download
unstable-trigger-phrase-length.png
93.3 KB View Download
Re #1, I tested the patch on Kevin and it helps to stabilize the leading zeros.
On Kevin, if we use arecord to capture hotword stream:

1. amixer -c0 cset name='DSP Voice Wake Up' off; amixer -c0 cset name='DSP Voice Wake Up' on; arecord -f S16_LE -c 1 -r 16000 -D hw:0,4 /tmp/hot.wav
2. Say "ok google" + query
3. Hit Ctrl+C to stop recording

The captured stream is always like 3 seconds leading zeros + hotword + query. And it records the query completely


If we use cras_test_client to capture hotword stream:

1. cras_test_client --pin 10 --capture_file /tmp/hot.wav --rate 16000 --num_channels 1
2. Say "ok google" + query
3. Hit Ctrl+C to stop recording

The captured stream is leading zeros + hotword + cut query.
The leading zeros was in arbitrary length and hychao's patch in #1fixed it to be in consistent length.
However the query is cut, it didn't record till the point Ctrl+C is hit.


There's cras error in the log:
2017-08-03T20:28:30.132261+09:00 ERR cras_server[2144]: pcm_avail returned frames larger than buf_size: rk3399-gru-sound: :0,4: 71088 > 65536 for 1 times


And from dump_a output, found that cras was too slow to read all the captured data.
(please see attached for the full dump_a output)

The below is when hotword device started have data available.
The hw_level is high and CRAS reads too small/slow and the hw_level accumulates higher quickly


      7015.209319623  WAKE                           num_fds:1
      7015.209322540  IODEV_CB                       is_write:0
      7015.209369790  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.209365415
->    7015.209387581  READ_AUDIO                     dev:10 hw_level:8192 read:81
      7015.209424623  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.209432790  READ_AUDIO_DONE                read_remainder:0
      7015.209436873  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:0
      7015.209517373  DEV_SLEEP_TIME                 dev:10 wake:000007015.214486165
      7015.209520290  SLEEP                          sleep:000000000.004970542 longest_wake:000335125
      7015.214598498  WAKE                           num_fds:0
      7015.214622706  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.214621540
      7015.214632915  READ_AUDIO                     dev:10 hw_level:8112 read:81
      7015.214638456  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.214644290  READ_AUDIO_DONE                read_remainder:0
      7015.214646331  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:1
      7015.214695331  DEV_SLEEP_TIME                 dev:10 wake:000007015.219486165
      7015.214697081  SLEEP                          sleep:000000000.004792000 longest_wake:000335125
      7015.219589206  WAKE                           num_fds:0
      7015.219610206  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.219609331
      7015.219623331  READ_AUDIO                     dev:10 hw_level:8032 read:81
      7015.219628290  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.219635873  READ_AUDIO_DONE                read_remainder:0
      7015.219637915  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:0
      7015.219690123  DEV_SLEEP_TIME                 dev:10 wake:000007015.224486165
      7015.219691873  SLEEP                          sleep:000000000.004796917 longest_wake:000335125
      7015.224568831  WAKE                           num_fds:0
      7015.224584873  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.224584290
      7015.224590415  READ_AUDIO                     dev:10 hw_level:7952 read:81
      7015.224594498  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.224597706  READ_AUDIO_DONE                read_remainder:0
      7015.224599165  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:1
      7015.224629206  DEV_SLEEP_TIME                 dev:10 wake:000007015.229486165
      7015.224630665  SLEEP                          sleep:000000000.004857542 longest_wake:000335125
      7015.229563331  WAKE                           num_fds:0
      7015.229575873  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.229575581
->    7015.229586956  READ_AUDIO                     dev:10 hw_level:16064 read:81
      7015.229589873  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.229593081  READ_AUDIO_DONE                read_remainder:0
      7015.229594248  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:0
      7015.229622831  DEV_SLEEP_TIME                 dev:10 wake:000007015.234486165
      7015.229623706  SLEEP                          sleep:000000000.004863917 longest_wake:000335125
      7015.234563373  WAKE                           num_fds:0
      7015.234575331  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.234574748
      7015.234585248  READ_AUDIO                     dev:10 hw_level:15984 read:81
      7015.234589915  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.234593123  READ_AUDIO_DONE                read_remainder:0
      7015.234594581  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:1
      7015.234622290  DEV_SLEEP_TIME                 dev:10 wake:000007015.239486165
      7015.234623165  SLEEP                          sleep:000000000.004864459 longest_wake:000335125
      7015.239563706  WAKE                           num_fds:0
      7015.239577706  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.239577123
      7015.239582081  READ_AUDIO                     dev:10 hw_level:15904 read:81
      7015.239585290  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.239588498  READ_AUDIO_DONE                read_remainder:0
      7015.239589665  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:0
      7015.239619706  DEV_SLEEP_TIME                 dev:10 wake:000007015.244486165
      7015.239620873  SLEEP                          sleep:000000000.004866750 longest_wake:000335125
      7015.244561998  WAKE                           num_fds:0
      7015.244574248  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.244573665
      7015.244581248  READ_AUDIO                     dev:10 hw_level:15824 read:81
      7015.244585623  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.244588831  READ_AUDIO_DONE                read_remainder:0
      7015.244589998  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:1
      7015.244617706  DEV_SLEEP_TIME                 dev:10 wake:000007015.249486165
      7015.244618581  SLEEP                          sleep:000000000.004868750 longest_wake:000335125
      7015.249562040  WAKE                           num_fds:0
      7015.249576040  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.249575456
->    7015.249580706  READ_AUDIO                     dev:10 hw_level:23936 read:81
      7015.249583623  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.249589456  READ_AUDIO_DONE                read_remainder:0
      7015.249590623  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:0
      7015.249620373  DEV_SLEEP_TIME                 dev:10 wake:000007015.254486165
      7015.249621248  SLEEP                          sleep:000000000.004866375 longest_wake:000335125


And and one point the hw_level climbed to 65536 and stopped there. But I guess it was still going up so CRAS wrote the error: pcm_avail returned frames larger than buf_size: rk3399-gru-sound: :0,4: 71088 > 65536 for 1 times


      7015.364499706  WAKE                           num_fds:0
      7015.364509915  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.364509331
      7015.364513998  READ_AUDIO                     dev:10 hw_level:63056 read:81
      7015.364516623  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.364519540  READ_AUDIO_DONE                read_remainder:0
      7015.364520706  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:1
      7015.364548415  DEV_SLEEP_TIME                 dev:10 wake:000007015.369486165
      7015.364549290  SLEEP                          sleep:000000000.004938042 longest_wake:000335125
      7015.369506456  WAKE                           num_fds:0
      7015.369645290  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.369644415
->    7015.369654331  READ_AUDIO                     dev:10 hw_level:65536 read:81
      7015.369662498  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.369666873  READ_AUDIO_DONE                read_remainder:0
      7015.369668331  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:0
      7015.369705081  DEV_SLEEP_TIME                 dev:10 wake:000007015.374486165
      7015.369707415  SLEEP                          sleep:000000000.004781667 longest_wake:000335125
      7015.374584081  WAKE                           num_fds:0
      7015.374616456  READ_AUDIO_TSTAMP              dev:10 tstamp:000007015.374615581
      7015.374627540  READ_AUDIO                     dev:10 hw_level:65536 read:81
      7015.374639498  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7015.374644165  READ_AUDIO_DONE                read_remainder:0
      7015.374645623  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:1
      7015.374685290  DEV_SLEEP_TIME                 dev:10 wake:000007015.379486165
      7015.374687331  SLEEP                          sleep:000000000.004801459 longest_wake:000335125


And when Ctrl+C is hit, the hotword stream/device is close and remaining samples are discarded.


      7017.909644581  WAKE                           num_fds:0
      7017.909687748  READ_AUDIO_TSTAMP              dev:10 tstamp:000007017.909685998
      7017.909706706  READ_AUDIO                     dev:10 hw_level:65536 read:81
      7017.909716915  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7017.909728873  READ_AUDIO_DONE                read_remainder:0
      7017.909735581  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:0
      7017.909823081  DEV_SLEEP_TIME                 dev:10 wake:000007017.914486165
      7017.909826290  SLEEP                          sleep:000000000.004664834 longest_wake:000344750
      7017.914628581  WAKE                           num_fds:0
      7017.914673498  READ_AUDIO_TSTAMP              dev:10 tstamp:000007017.914671748
      7017.914688956  READ_AUDIO                     dev:10 hw_level:65536 read:81
      7017.914699456  CAPTURE_WRITE                  stream:130000 write:80 shm_fr:0
      7017.914710831  READ_AUDIO_DONE                read_remainder:0
      7017.914715206  CAPTURE_POST                   stream:130000 thresh:80 rd_buf:1
      7017.914803873  DEV_SLEEP_TIME                 dev:10 wake:000007017.919486165
      7017.914807665  SLEEP                          sleep:000000000.004684042 longest_wake:000344750
      7017.916903290  WAKE                           num_fds:1
      7017.916941206  PB_MSG                         msg_id:6
      7017.916945290  STREAM_REMOVED                 id:130000
      7017.917058748  READ_AUDIO_TSTAMP              dev:10 tstamp:000007017.917056998
->    7017.917070415  READ_AUDIO                     dev:10 hw_level:65536 read:65536
      7017.917095790  READ_AUDIO_DONE                read_remainder:0
      7017.917139248  DEV_SLEEP_TIME                 dev:10 wake:000007037.917098123
      7017.917143040  SLEEP                          sleep:000000019.999960917 longest_wake:000344750
      7017.917230831  WAKE                           num_fds:1


hotword_dump_a.txt
354 KB View Download
Specifying larger --block_size to capture doesn't help because CRAS then uses a longer wake period to receive the data. The hw_level still stacks up much quicker.

Status: Started (was: Assigned)
It seems that my patch is not taking effect.
The audio log in #4 shows regular 4ms sleep time, while in CL:594762 for BULK_AUDIO_OK flag the sleep time is set to 1ms when buffer is too high.

I tested the two patches on CRAS before Dylan's dev_io patches. Maybe it got broken somehow after rebase, I'll test it again.

I found the problem.

When you run command "cras_test_client --pin 10 --capture_file /tmp/hot.wav --rate 16000 --num_channels 1" it creates a pinned stream to WoV node, but it just doesn't set the HOTWORD_STREAM flag.
(maybe we can add a warning when stream attach to Hotword node without this flag set)

The second patch I uploaded is to schedule faster input read for stream only for 'HOTWORD_STREAM' flag.
With my patches, you can just test by "cras_test_client --listen /tmp/1" which creates a hotword stream with correct flag, look up the hotword node and pin to it, using default 2-channel 48000 rate.

Please give it a try and let me know if there's still issue. Thanks!
Wow this is a subtle detail. Should we also automatically mark pinned streams to the hotword device as HOTWORD_STREAM?

Thanks Hsin-yu, I tested again with your patch and SoundTriggerTestApp which is the real hotword client.

I found that although we try to push the captured data as fast as possible to client, but somehow it cannot digest them.
From the logs, the frames were lost because the SoundTriggerTestApp was unable to read them fast enough, and thus the recorded sound is not smooth.


08-04 15:44:52.446    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.449    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.450    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.452    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.453    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.454    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.457    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.458    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.459    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.460    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.462    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.465    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
08-04 15:44:52.466    28  2066 D audio_hw_primary: in_read_cb: lost 128 frames
...



I tried again without the CRAS patch. The recorded sound is better, but still there are lost frames, I'll look into why frames are lost in audio HAL.

I just made some modification to the patches, mainly CL:594762

And yes if client side cannot digest data quick enough, that's another problem.

Line:685 is the key line, where capture_callback_fuzz_ts(1ms) is used to schedule consecutive read when data level is high.
(I think Kevin is slow so it probably work okay on Intel platform)

Can you help adjust the time like to 2ms, 3ms, to see what's the balance between "read burst data from device" and "client digest data"?


Status: Fixed (was: Started)
This is fixed per  issue 754581 

Comment 12 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)

Comment 13 by dchan@chromium.org, Jan 23 2018

Status: Fixed (was: Archived)

Sign in to add a comment