On WMPI destruction, PostTask batches of demuxer teardown onto low-pri bg tasks to mitigate renderer hangs |
|||||||||||
Issue descriptionThis issue tracks solving issues like Renderer Hangs on Windows due to long demuxer teardowns on systems under heavy memory pressure (eg, plausible cause of things like bug 742696). Specifically, one idea is to, on WMPI destruction, detach the media thread to let demuxer teardown continue without blocking renderer main. This very likely could help more gracefully handle MSE WMPI+ChunkDemuxer teardowns, though perhaps SRC= allocations of buffered regions by WMPI+FFmpegDemuxer may also build up and benefit from this approach. I'm uncertain if this work will fit in Q4; marking P3 to begin with.
,
Oct 12 2017
This might be obvious, but note that this also affect the data source. Basically the data source can't be destructed until the demuxer is done using it.
,
Oct 12 2017
,
Oct 12 2017
,
Oct 12 2017
Whether or not this helps issue 774288 depends on if the aligned_free blocks all memory allocations in the process :)
,
Oct 12 2017
Could we add an UMA that records the wall-clock time consumed by ~WebMediaPlayerImpl? Or maybe aggregates the times spent in ~WebMediaPlayerImpl over the lifetime of a document (I remember Dale had been working on some way to have one-shot, reliably reported, metrics like that)? Another idea: can we create a synthetic torture-test type page that would contain a large number of media elements, each consuming gobs of memory in the buffers under ChunkDemuxerStream, and see if we can cause measurable hiccups on Windows?
,
Oct 12 2017
Just a SCOPED_UMA_HISTOGRAM_TIMER() would be sufficient in this case; a single sample metric in unnecessary here since we'd never get here in the event of a process kill.
,
Oct 12 2017
@#5 if demuxer teardown hard-faults the DecoderBuffers back in while free'ing them, then article (https://randomascii.wordpress.com/2014/12/10/hidden-costs-of-memory-allocation/) from issue 774288 indicates it could be 1 second / MB of freeing. Seems quite plausible that freeing could therefore block renderer_main for 10s of seconds in a system under heavy memory pressure if the media being freed was paged out to disk earlier. Therefore, I'm not sure what you mean by "if the aligned_free blocks all memory allocations in the process" -- it at least could plausibly block all progress on renderer main thread.
,
Oct 13 2017
Back up to P1 while vetting the assumption that the underlying issue is hard-faults of 100's of MB of StreamParserBuffers.
,
Oct 13 2017
Some anecdata: linux Z620, 100 elements each buffering about 84MB of media, all on one page: it took about 4 seconds to visually begin reload of page versus < 500ms for a much smaller number of elements and buffering. This is without any significant memory pressure on the machine. Confirms plausibility of renderer hang being due to mass-aligned-free.
,
Dec 5 2017
Keeping P1. We're considering a code workaround that could alleviate the slow destruction (PostTasking batches of destruction of the ChunkDemuxer objects to a low-pri background task in the pool). I expect to begin work on that soon.
,
Dec 5 2017
,
Dec 5 2017
,
Dec 6 2017
,
Dec 14 2017
wolenetz@: Just checking in as stability sheriff about comment 11. Do you think you'll have a chance to try that workaround soon, to see if it reduces the number of hang reports in issue 774288?
,
Dec 19 2017
Ping? Going through the list of items from the Stability-Sheriff list that had no progress this week.
,
Dec 20 2017
@#15-16, yep, I'm working on a fix right now.
,
Dec 21 2017
,
Jan 2 2018
Can we just dupe this into issue 774288 (or the other way around)?
,
Jan 4 2018
@#19 - I kept these separate because issue 774288 needs visibility restriction due to details contained within, while this issue doesn't and helps provide transparency longer-term into the fix.
,
Jan 4 2018
,
Jan 4 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/95af6365f9aedc89162d3f7efe4fb7453f242cb2 commit 95af6365f9aedc89162d3f7efe4fb7453f242cb2 Author: Matt Wolenetz <wolenetz@chromium.org> Date: Thu Jan 04 22:44:13 2018 MSE: Destroy WMPI's ChunkDemuxer object using a background task This change helps avoid renderer hang crashes on Windows by doing the potentially costly destruction of a ChunkDemuxer object on a background task. The duration of that destruction, not including any initial scheduling delay, is recorded in a new histogram: "Media.MSE.DemuxerDestructionTime". BUG= 774268 ,774288, 796704 Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel Change-Id: I77dcb3ac1f7d5c5777f5302a7fc87199bf693d74 Reviewed-on: https://chromium-review.googlesource.com/838701 Commit-Queue: Matthew Wolenetz <wolenetz@chromium.org> Reviewed-by: Robert Liao <robliao@chromium.org> Reviewed-by: Dan Sanders <sandersd@chromium.org> Reviewed-by: Steven Holte <holte@chromium.org> Cr-Commit-Position: refs/heads/master@{#527125} [modify] https://crrev.com/95af6365f9aedc89162d3f7efe4fb7453f242cb2/media/blink/webmediaplayer_impl.cc [modify] https://crrev.com/95af6365f9aedc89162d3f7efe4fb7453f242cb2/media/blink/webmediaplayer_impl.h [modify] https://crrev.com/95af6365f9aedc89162d3f7efe4fb7453f242cb2/media/filters/chunk_demuxer_unittest.cc [modify] https://crrev.com/95af6365f9aedc89162d3f7efe4fb7453f242cb2/tools/metrics/histograms/histograms.xml
,
Jan 8 2018
Data (https://bugs.chromium.org/p/chromium/issues/detail?id=774288#c29) from canaries since #22 indicates that #22 has worked-around the crash. Bug 797024 tracks follow-up work to optimize away the root cause (expensive MSE resource deallocation), which would also fix another source of MSE renderer hang on windows (ChunkDemuxer::RemoveId() is crashing at very low rate still probably due to similar problem of expensive MSE resource deallocation).
,
Feb 2 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/12d7ef018b9875e805960378e68048cb3e0b4dca commit 12d7ef018b9875e805960378e68048cb3e0b4dca Author: Matt Wolenetz <wolenetz@chromium.org> Date: Fri Feb 02 01:43:09 2018 To M64: MSE: Destroy WMPI's ChunkDemuxer object using a background task This change helps avoid renderer hang crashes on Windows by doing the potentially costly destruction of a ChunkDemuxer object on a background task. The duration of that destruction, not including any initial scheduling delay, is recorded in a new histogram: "Media.MSE.DemuxerDestructionTime". BUG= 774268 ,774288, 796704 TBR=wolenetz@chromium.org, robliao@chromium.org, holte@chromium.org, dalecurtis@chromium.org (cherry picked from commit 95af6365f9aedc89162d3f7efe4fb7453f242cb2) Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel Change-Id: I77dcb3ac1f7d5c5777f5302a7fc87199bf693d74 Reviewed-on: https://chromium-review.googlesource.com/838701 Commit-Queue: Matthew Wolenetz <wolenetz@chromium.org> Reviewed-by: Robert Liao <robliao@chromium.org> Reviewed-by: Dan Sanders <sandersd@chromium.org> Reviewed-by: Steven Holte <holte@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#527125} Reviewed-on: https://chromium-review.googlesource.com/898077 Reviewed-by: Dale Curtis <dalecurtis@chromium.org> Reviewed-by: Matthew Wolenetz <wolenetz@chromium.org> Cr-Commit-Position: refs/branch-heads/3282@{#635} Cr-Branched-From: 5fdc0fab22ce7efd32532ee989b223fa12f8171e-refs/heads/master@{#520840} [modify] https://crrev.com/12d7ef018b9875e805960378e68048cb3e0b4dca/media/blink/webmediaplayer_impl.cc [modify] https://crrev.com/12d7ef018b9875e805960378e68048cb3e0b4dca/media/blink/webmediaplayer_impl.h [modify] https://crrev.com/12d7ef018b9875e805960378e68048cb3e0b4dca/media/filters/chunk_demuxer_unittest.cc [modify] https://crrev.com/12d7ef018b9875e805960378e68048cb3e0b4dca/tools/metrics/histograms/histograms.xml |
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by dalecur...@chromium.org
, Oct 12 2017