New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 774268 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jan 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 774288



Sign in to add a comment

On WMPI destruction, PostTask batches of demuxer teardown onto low-pri bg tasks to mitigate renderer hangs

Project Member Reported by wolenetz@chromium.org, Oct 12 2017

Issue description

This issue tracks solving issues like Renderer Hangs on Windows due to long demuxer teardowns on systems under heavy memory pressure (eg, plausible cause of things like bug 742696).

Specifically, one idea is to, on WMPI destruction, detach the media thread to let demuxer teardown continue without blocking renderer main.

This very likely could help more gracefully handle MSE WMPI+ChunkDemuxer teardowns, though perhaps SRC= allocations of buffered regions by WMPI+FFmpegDemuxer may also build up and benefit from this approach.

I'm uncertain if this work will fit in Q4; marking P3 to begin with.
 
Specifically it'd be great to remove this:

https://cs.chromium.org/chromium/src/media/base/pipeline_impl.cc?l=1041

I think we can do this by neutering the demuxers and data source on the main thread somehow, then passing everything over to the media thread for final teardown. Shouldn't need to worry about callbacks attached to WMPI since they are bound by WeakPtr.

Comment 2 by hubbe@chromium.org, Oct 12 2017

This might be obvious, but note that this also affect the data source. Basically the data source can't be destructed until the demuxer is done using it.

Cc: sande...@chromium.org
Blocking: 774288
Whether or not this helps issue 774288 depends on if the aligned_free blocks all memory allocations in the process :)

Comment 6 by nick@chromium.org, Oct 12 2017

Could we add an UMA that records the wall-clock time consumed by ~WebMediaPlayerImpl? Or maybe aggregates the times spent in ~WebMediaPlayerImpl over the lifetime of a document (I remember Dale had been working on some way to have one-shot, reliably reported, metrics like that)?

Another idea: can we create a synthetic torture-test type page that would contain a large number of media elements, each consuming gobs of memory in the buffers under ChunkDemuxerStream, and see if we can cause measurable hiccups on Windows?
Just a SCOPED_UMA_HISTOGRAM_TIMER() would be sufficient in this case; a single sample metric in unnecessary here since we'd never get here in the event of a process kill.
@#5 if demuxer teardown hard-faults the DecoderBuffers back in while free'ing them, then article (https://randomascii.wordpress.com/2014/12/10/hidden-costs-of-memory-allocation/) from issue 774288 indicates it could be 1 second / MB of freeing. Seems quite plausible that freeing could therefore block renderer_main for 10s of seconds in a system under heavy memory pressure if the media being freed was paged out to disk earlier.

Therefore, I'm not sure what you mean by "if the aligned_free blocks all memory allocations in the process" -- it at least could plausibly block all progress on renderer main thread.
Labels: -Pri-3 Pri-1
Back up to P1 while vetting the assumption that the underlying issue is hard-faults of 100's of MB of StreamParserBuffers.
Some anecdata: linux Z620, 100 elements each buffering about 84MB of media, all on one page: it took about 4 seconds to visually begin reload of page versus < 500ms for a much smaller number of elements and buffering.  This is without any significant memory pressure on the machine. Confirms plausibility of renderer hang being due to mass-aligned-free.
Keeping P1. We're considering a code workaround that could alleviate the slow destruction (PostTasking batches of destruction of the ChunkDemuxer objects to a low-pri background task in the pool).  I expect to begin work on that soon.
Summary: On WMPI destruction, PostTask batches of demuxer teardown onto low-pri bg tasks to mitigate renderer hangs (was: On WMPI destruction, detach the media thread to let demuxer teardown continue without blocking renderer main)
Labels: Stability-Sheriff-Desktop
Labels: M-65

Comment 15 by creis@chromium.org, Dec 14 2017

wolenetz@: Just checking in as stability sheriff about comment 11.  Do you think you'll have a chance to try that workaround soon, to see if it reduces the number of hang reports in issue 774288?

Comment 16 by mad@chromium.org, Dec 19 2017

Ping?

Going through the list of items from the Stability-Sheriff list that had no progress this week.

@#15-16, yep, I'm working on a fix right now.

Comment 19 by lfg@chromium.org, Jan 2 2018

Labels: -Stability-Sheriff-Desktop
Can we just dupe this into issue 774288 (or the other way around)?
@#19 - I kept these separate because issue 774288 needs visibility restriction due to details contained within, while this issue doesn't and helps provide transparency longer-term into the fix.
Status: Started (was: Assigned)
Project Member

Comment 22 by bugdroid1@chromium.org, Jan 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/95af6365f9aedc89162d3f7efe4fb7453f242cb2

commit 95af6365f9aedc89162d3f7efe4fb7453f242cb2
Author: Matt Wolenetz <wolenetz@chromium.org>
Date: Thu Jan 04 22:44:13 2018

MSE: Destroy WMPI's ChunkDemuxer object using a background task

This change helps avoid renderer hang crashes on Windows by doing the
potentially costly destruction of a ChunkDemuxer object on a background
task.  The duration of that destruction, not including any initial
scheduling delay, is recorded in a new histogram:
"Media.MSE.DemuxerDestructionTime".

BUG= 774268 ,774288, 796704 

Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I77dcb3ac1f7d5c5777f5302a7fc87199bf693d74
Reviewed-on: https://chromium-review.googlesource.com/838701
Commit-Queue: Matthew Wolenetz <wolenetz@chromium.org>
Reviewed-by: Robert Liao <robliao@chromium.org>
Reviewed-by: Dan Sanders <sandersd@chromium.org>
Reviewed-by: Steven Holte <holte@chromium.org>
Cr-Commit-Position: refs/heads/master@{#527125}
[modify] https://crrev.com/95af6365f9aedc89162d3f7efe4fb7453f242cb2/media/blink/webmediaplayer_impl.cc
[modify] https://crrev.com/95af6365f9aedc89162d3f7efe4fb7453f242cb2/media/blink/webmediaplayer_impl.h
[modify] https://crrev.com/95af6365f9aedc89162d3f7efe4fb7453f242cb2/media/filters/chunk_demuxer_unittest.cc
[modify] https://crrev.com/95af6365f9aedc89162d3f7efe4fb7453f242cb2/tools/metrics/histograms/histograms.xml

Status: Fixed (was: Started)
Data (https://bugs.chromium.org/p/chromium/issues/detail?id=774288#c29) from canaries since #22 indicates that #22 has worked-around the crash.

Bug 797024 tracks follow-up work to optimize away the root cause (expensive MSE resource deallocation), which would also fix another source of MSE renderer hang on windows (ChunkDemuxer::RemoveId() is crashing at very low rate still probably due to similar problem of expensive MSE resource deallocation).
Project Member

Comment 24 by bugdroid1@chromium.org, Feb 2 2018

Labels: merge-merged-3282
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/12d7ef018b9875e805960378e68048cb3e0b4dca

commit 12d7ef018b9875e805960378e68048cb3e0b4dca
Author: Matt Wolenetz <wolenetz@chromium.org>
Date: Fri Feb 02 01:43:09 2018

To M64: MSE: Destroy WMPI's ChunkDemuxer object using a background task

This change helps avoid renderer hang crashes on Windows by doing the
potentially costly destruction of a ChunkDemuxer object on a background
task.  The duration of that destruction, not including any initial
scheduling delay, is recorded in a new histogram:
"Media.MSE.DemuxerDestructionTime".

BUG= 774268 ,774288, 796704 
TBR=wolenetz@chromium.org, robliao@chromium.org, holte@chromium.org, dalecurtis@chromium.org

(cherry picked from commit 95af6365f9aedc89162d3f7efe4fb7453f242cb2)

Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Change-Id: I77dcb3ac1f7d5c5777f5302a7fc87199bf693d74
Reviewed-on: https://chromium-review.googlesource.com/838701
Commit-Queue: Matthew Wolenetz <wolenetz@chromium.org>
Reviewed-by: Robert Liao <robliao@chromium.org>
Reviewed-by: Dan Sanders <sandersd@chromium.org>
Reviewed-by: Steven Holte <holte@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#527125}
Reviewed-on: https://chromium-review.googlesource.com/898077
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Matthew Wolenetz <wolenetz@chromium.org>
Cr-Commit-Position: refs/branch-heads/3282@{#635}
Cr-Branched-From: 5fdc0fab22ce7efd32532ee989b223fa12f8171e-refs/heads/master@{#520840}
[modify] https://crrev.com/12d7ef018b9875e805960378e68048cb3e0b4dca/media/blink/webmediaplayer_impl.cc
[modify] https://crrev.com/12d7ef018b9875e805960378e68048cb3e0b4dca/media/blink/webmediaplayer_impl.h
[modify] https://crrev.com/12d7ef018b9875e805960378e68048cb3e0b4dca/media/filters/chunk_demuxer_unittest.cc
[modify] https://crrev.com/12d7ef018b9875e805960378e68048cb3e0b4dca/tools/metrics/histograms/histograms.xml

Sign in to add a comment