New issue
Advanced search Search tips

Issue 742948 link

Starred by 1 user

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

CRAS: crash in destroying format converter

Project Member Reported by cychiang@chromium.org, Jul 14 2017

Issue description

Sample reports:

https://crash.corp.google.com/browse?q=product.name%3D%27ChromeOS%27%20AND%20product.version%3D%279592.3.0%27%20AND%20exec_name%3D%27cras%27%20AND%20stable_signature%3D%27raise-527ce550%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D

There were 126 reports filed by the same client id. So it must be easy to reproduce under certain repo steps.

There are some client id reports large number of crash reports.
Group by client id:

https://crash.corp.google.com/dremel_query_ui?q=%0ASELECT%20COUNT(*)%20AS%20num%2CClientID%2CProduct.Version%20FROM%20crash.prod.latest%20WHERE%20product.name%3D%27ChromeOS%27%20AND%20exec_name%3D%27cras%27%20AND%20stable_signature%3D%27raise-527ce550%27%20OMIT%20RECORD%20IF%20SUM(CrashedStackTrace.StackFrame.FunctionName%3D%27cras_fmt_conv_destroy%27)%20%3D%200%20GROUP%20BY%20ClientID%2C%20Product.Version%20ORDER%20BY%20Product.Version%20DESC

The crash is not specific to certain board.
Group by board:

https://crash.corp.google.com/dremel_query_ui?q=%0ASELECT%20COUNT(hwclass)%20AS%20num%2C%20hwclass%20FROM%20crash.prod.latest%20WHERE%20product.name%3D%27ChromeOS%27%20AND%20exec_name%3D%27cras%27%20AND%20stable_signature%3D%27raise-527ce550%27%20OMIT%20RECORD%20IF%20SUM(CrashedStackTrace.StackFrame.FunctionName%3D%27cras_fmt_conv_destroy%27)%20%3D%200%20GROUP%20BY%20hwclass%20ORDER%20BY%20num%20DESC


Only AMD64 boards have this issue:
https://crash.corp.google.com/dremel_query_ui?q=%0ASELECT%20COUNT(hwclass)AS%20num%2C%20cpu.architecture%20FROM%20crash.prod.latest%20WHERE%20product.name%3D%27ChromeOS%27%20AND%20exec_name%3D%27cras%27%20AND%20stable_signature%3D%27raise-527ce550%27%20OMIT%20RECORD%20IF%20SUM(CrashedStackTrace.StackFrame.FunctionName%3D%27cras_fmt_conv_destroy%27)%20%3D%200%20GROUP%20BY%20cpu.architecture%20ORDER%20BY%20num%20DESC



void cras_fmt_conv_destroy(struct cras_fmt_conv *conv)
{
        unsigned i;
        if (conv->ch_conv_mtx)
                cras_channel_conv_matrix_destroy(conv->ch_conv_mtx,
                                                 conv->out_fmt.num_channels);
        if (conv->speex_state)
                speex_resampler_destroy(conv->speex_state);
        if (conv->resampler)
                linear_resampler_destroy(conv->resampler);
        for (i = 0; i < MAX_NUM_CONVERTERS - 1; i++)
                free(conv->tmp_bufs[i]);                     -------------------------> Crash here
        free(conv);
}

The only place to free conv->tmp_bufs[i] is in cras_fmt_conv_destroy.


These are the caller of cras_fmt_conv_destroy:

7 in cras_fmt_conv_create (not possible to cause crash)
2 in audio_thread to destroy global remix converter. (not related)
1 in dev_stream_destroy

So the only case is that a dev_stream is destroyed twice.


void dev_stream_destroy(struct dev_stream *dev_stream)
{
        cras_rstream_dev_detach(dev_stream->stream, dev_stream->dev_id);
        if (dev_stream->conv) {
                cras_audio_area_destroy(dev_stream->conv_area);           -------------------> Not sure why it did not crash here
                cras_fmt_conv_destroy(dev_stream->conv);               ---------------------> Crash here
                byte_buffer_destroy(dev_stream->conv_buffer);
        }
        free(dev_stream);
}

Audio thread is the only thread that calls dev_stream_destroy.
There are three places:

https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/server/audio_thread.c?q=dev_stream_destroy+package:%5Echromeos_public$&dr=C&l=224

They are all happened with cras_iodev_rm_stream. That is, remove the stream from the list in dev, and destroy the stream.


The crash happened through this path:

static void delete_stream_from_dev(struct cras_iodev *dev,
                                   struct cras_rstream *stream)
{
        struct dev_stream *out;

        out = cras_iodev_rm_stream(dev, stream);
        if (out)
                dev_stream_destroy(out);
}

Here cras_iodev_rm_stream searched for the stream and it indeed found the stream in the list.
So I could not think of a case where a dev_stream can be destroyed.


A possible way to avoid crash is to set dev_stream->conv to NULL in the end of dev_stream_destroy.
But it will be masking the real problem.
 

Comment 1 by dgreid@chromium.org, Jul 14 2017

This call to free was probably valid.  Maybe the heap got corrupted before somehow?

Comment 2 by dgreid@chromium.org, Jul 15 2017

I spent a good amount of time on this today and didn't see anything obviously causing it.  I went through th echanges in 59:
$ git log --oneline cros/release-R58-9334.B..cros/release-R59-9460.B
cb47688d (cros/stabilize-9460.66.B, cros/release-R59-9460.B) daisy: Turn off speaker switch when headphone is enabled
85e70ffd (cros/stabilize-9460.60.B) CRAS: iodev - Skip update_channel_layout for 2-channel output.
908b07ea (cros/stabilize-9460.40.B, cros/stabilize-9460.23.B) CRAS: alsa_mixer - Fix cras_alsa_mixer_set_mute
59352451 (cros/stabilize-9460.4.B) audio_thread_log_viewer: Fix ODEV_NO_STREAMS keyword
f614b5c7 CRAS: remove non-critical logs in libcras and cras_util
0c7b27b7 CRAS: iodev_list_unittest - update active node for pinned stream
2a609e20 (cros/stabilize-9430.B, cros/stabilize-9428.B) CRAS: iodev_list - Enable/disable active node for pinned stream
b8ad07f9 chell: Disable DSP on both HDMI devices
53efd276 CRAS: alsa_jack - Remove callback of hctl element on destroy
f62f24a6 CRAS: audio_thread - Use wake up time from dev_stream
9903383a CRAS: dev_stream - Add method to get wake up time
797b23b6 CRAS: dev_stream - Avoid bursting captured data to stream
bd7dee79 CRAS: test_client - Add log for number of overruns of streams
a260b3aa CRAS: test_client - Display device ID in decimal

Maybe the ignore default channel update one?  It seems OK but I'm not sure how that plays with format converter allocation.

It all looks innocent enough.

Did any libraries update?
85e70ffd (cros/stabilize-9460.60.B) CRAS: iodev - Skip update_channel_layout for 2-channel output.
This looks innocent because it only sets the format on the device.
The format converter is created when a dev_stream is created.
( config_format_converter is called in dev_stream_create. )


This crash happened before R59 so we need to look into older branches.

I used the query which search for crash at line cras_fmt_conv_destroy at cras_fmt_conv.c line 609:

https://crash.corp.google.com/dremel_query_ui?q=%0ASELECT%20COUNT(*)%20AS%20num%2CProduct.Version%20FROM%20crash.prod.latest%20WHERE%20product.name%3D%27ChromeOS%27%20AND%20exec_name%3D%27cras%27%20%20OMIT%20RECORD%20IF%20SUM(CrashedStackTrace.StackFrame.FunctionName%3D%27cras_fmt_conv_destroy%27)%3D0%20OR%20%20SUM(CrashedStackTrace.StackFrame.SourceLine%3D609)%20%20%3D%200%20GROUP%20BY%20Product.Version%20ORDER%20BY%20Product.Version%20DESC

query with client id:

https://crash.corp.google.com/dremel_query_ui?q=%0ASELECT%20COUNT(*)%20AS%20num%2CClientID%2CProduct.Version%20FROM%20crash.prod.latest%20WHERE%20product.name%3D%27ChromeOS%27%20AND%20exec_name%3D%27cras%27%20%20OMIT%20RECORD%20IF%20SUM(CrashedStackTrace.StackFrame.FunctionName%3D%27cras_fmt_conv_destroy%27)%3D0%20OR%20%20SUM(CrashedStackTrace.StackFrame.SourceLine%3D609)%20%20%3D%200%20GROUP%20BY%20ClientID%2C%20Product.Version%20ORDER%20BY%20Product.Version%20DESC

This crash happened back in R55-8872.73.0 (7647.84 seems different), and had a peak at R56-9000.91 with 1183 reports.

The difference between R54 and R55:


$ git log --oneline cros/release-R54-8743.B..cros/release-R55-8872.B

ea116ee6 CRAS: loopbacl_iodev - Fix get_buffer callback
c9e53a0e CRAS: alsa_jack - Use different timeout for different type of jacks.
9c9ed3da CRAS: alsa_helpers: Remove pcm_avail underrun logs
763a36cb CRAS: Add --disable_profile option to control SDP profile
23f28abe CRAS: fix various warnings reported by valgrind
5307a775 CRAS: dsp_ini - Fix invalid memory access
8fc3b5e1 CRAS: iodev_list - Accept stream connect when seeing -ENOENT err
6b16c791 CRAS: iodev - Fix no stream playback state transition
31367f29 CRAS: hfp - Use dummy timestamp for last buffer change
c2ae60ca Revert "CRAS: alsa_jack: report jack status right away for hctl jack"
0213d7f6 CRAS: file_wait - Use access() to check file existence.
4c73b057 CRAS: libcras - Fix build errors on Android.
1f260965 CRAS: alsa_io - Check for set sample rate in UCM
ee88be2e CRAS: alsa_io - Move supported format check
a812f607 CRAS: alsa_io - Use update_supported_formats from create
fe288559 CRAS: ucm - Parse PlaybackRate and CaptureRate
4d988ad9 CRAS: Set a default hotword model on iodev init
6235e871 CRAS: test_client: Show current hotword model name
9d76654d CRAS: dbus_control: Allow get/set hotword model via dbus
0876b342 CRAS: bt_device - Reset connected profiles when device is disconnected
fdbff273 CRAS: bt_device: Start connection watcher after profile update
91e434c4 CRAS: dsp_ini: Add a swap_lr plugin before sink plugin
c9f56d93 CRAS: iodev - Add function to swap left and right channel using dsp
4f86aad3 CRAS: dsp - Add function to set a boolean variable
9e6d5836 CRAS: dps_mod_builtin - Add a swap_lr built-in module
8213253c ucm-config: cyan: Add speaker swap mode
dab3c96c CRAS: hfp_ag_profile - Start audio gateway in bt device
8041030f CRAS: Rename PeriodFrames to DmaPeriodMicrosecs.
6babffc0 CRAS: Add UCM flag to enable use of ALSA htimestamp.
fa4cd976 CRAS: Use the hardware timestamp with available/used frames.
654d8e6e CRAS: empty_iodev: Update last_buffer_access for flush_buffer.
635cce42 CRAS: alsa_io - Don't unmute a node when disabling an device
cb2443d8 CRAS: bt_device - Start A2DP and HFP when both are ready
3b611c7f CRAS: a2dp_endpoint - Rename and refactor start function
86e32237 CRAS: bt_device - Add connection watch timer
7aec9938 CRAS: alsa_ucm_unittest: Fix TEST label.
595b4155 CRAS: iodev - Add method to get number of underruns
39c60b6d CRAS: iodev - Add log for state transition and filling zeros
f6120d16 CRAS: hfp_ag - Delete a2dp delay timer
069a3d32 CRAS: bt_transport - Remove unused functions
2232117d CRAS: bt_device - Migrate a2dp suspend timer to bt_device
dc3ecc8a CRAS: iodev - Move no stream state transition into cras_iodev
50cf4aec CRAS: iodev - Add state variable in cras_iodev
94e6f4da cras: Adding systemd unit files.




Project Member

Comment 4 by bugdroid1@chromium.org, Jul 18 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/adhd/+/661ebed34c729654fc6c0347add07a164c9ef987

commit 661ebed34c729654fc6c0347add07a164c9ef987
Author: Dylan Reid <dgreid@chromium.org>
Date: Tue Jul 18 07:32:39 2017

CRAS: fmt_conf - Have destructor set pointer to NULL

Force the caller of cras_fmt_destroy to pass a pointer to the pointer
that holds the format converter to free.  This pointer will be set to
NULL, making a double-free or use-after-free more difficult.

BUG=742948
TEST=add/remove some streams

Change-Id: I8a57092f92ece03e043472bce55852b7b009e864
Signed-off-by: Dylan Reid <dgreid@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/572162
Reviewed-by: Cheng-Yi Chiang <cychiang@chromium.org>
Reviewed-by: Chinyue Chen <chinyue@chromium.org>

[modify] https://crrev.com/661ebed34c729654fc6c0347add07a164c9ef987/cras/src/server/cras_fmt_conv.h
[modify] https://crrev.com/661ebed34c729654fc6c0347add07a164c9ef987/cras/src/tests/fmt_conv_unittest.cc
[modify] https://crrev.com/661ebed34c729654fc6c0347add07a164c9ef987/cras/src/server/cras_fmt_conv.c
[modify] https://crrev.com/661ebed34c729654fc6c0347add07a164c9ef987/cras/src/server/dev_stream.c
[modify] https://crrev.com/661ebed34c729654fc6c0347add07a164c9ef987/cras/src/server/audio_thread.c
[modify] https://crrev.com/661ebed34c729654fc6c0347add07a164c9ef987/cras/src/tests/audio_thread_unittest.cc

Sign in to add a comment