Up minimum hardware decode resolution to 360p.
Issue description

Data for Android suggested that anything below this is not worthwhile; if I recall the data correctly, we could likely go as high as 720p on desktop. We see lots of sites spinning up hardware decoders for 16x16 or smaller clips, which feels excessive. At least on Android we have data suggesting it's power-inefficient to use the hardware decoder for smaller resolutions. Do we have similar analysis for other platforms?
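Roughly, the gate being proposed would look something like the sketch below; the helper name and the exact 360 threshold are placeholders, not existing code:

#include "ui/gfx/geometry/size.h"

namespace {
// Assumed threshold from the Android data; the exact value is TBD.
constexpr int kMinHardwareDecodeHeight = 360;
}  // namespace

// Hypothetical helper: prefer software decode below ~360p.
bool ShouldUseHardwareDecoder(const gfx::Size& coded_size) {
  return coded_size.height() >= kMinHardwareDecodeHeight;
}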
Comment 1 by sande...@chromium.org, Jan 24 2017
Good point about MSE-based streams adapting upwards. We do have a way for demuxers to signal that they expect config changes, so we could apply this only to src= playback. We'd need to either reorganize how we select decoders or include the config-change bit in the decoder config.
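A rough sketch of what "include the config-change bit in the decoder config" could mean; the struct and field below are hypothetical stand-ins, not the real media::VideoDecoderConfig:

#include "ui/gfx/geometry/size.h"

// Hypothetical stand-in for media::VideoDecoderConfig; the
// expect_config_changes bit does not exist there today.
struct VideoDecoderConfigSketch {
  gfx::Size coded_size;
  bool expect_config_changes = false;  // Set by MSE demuxers.
};

bool PreferSoftwareForLowResolution(const VideoDecoderConfigSketch& config) {
  // MSE streams may adapt upward at any time; keep the hardware decoder.
  if (config.expect_config_changes)
    return false;
  // For src= playback, assume ~360p as the cutoff (value to be tuned).
  return config.coded_size.height() < 360;
}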
Jan 25 2017
There is also the scenario where we have many (possibly small-resolution) decoders: multiple videos on one page, or multiple pages with videos. Also, for Hangouts and VC in general, there are small streams for all of the people connected to the meeting; even at 180p, if we have many of these, we'd probably still prefer HW decode. These also go up and down in resolution as they get focused. All of that goes through RTCVD, though. We could also set a larger min resolution in VDA::SupportedProfiles per platform/VDA.
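For illustration, setting a larger per-platform floor might look roughly like this; the 240x135 value is an arbitrary placeholder and the field names are as I recall them from VideoDecodeAccelerator::SupportedProfile:

#include "media/video/video_decode_accelerator.h"
#include "ui/gfx/geometry/size.h"

// Example only: advertise a larger minimum resolution for one profile.
media::VideoDecodeAccelerator::SupportedProfile MakeExampleProfile() {
  media::VideoDecodeAccelerator::SupportedProfile profile;
  profile.profile = media::VP9PROFILE_PROFILE0;
  profile.min_resolution = gfx::Size(240, 135);
  profile.max_resolution = gfx::Size(3840, 2160);
  return profile;
}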
Mar 31 2017
posciak@: are you sure you'd want to hardware decode those at 180p? Even on Android we've found that the lower bound for power/CPU efficiency is around 360p; anything below that used more power with the hardware decoder.
Apr 24 2017
Probably not. However, what would happen when/if the stream switched up to higher resolutions (and back) for both VC and HTML5 video playback scenarios? The overhead of switching back and forth between HW/SW decode could then be a factor. Would this be happening in any cases here?
Apr 24 2017
For HTML5, we'd just switch back and forth as necessary; so long as the switch takes less than about four frames' worth of time, or occurs while not playing, it won't be noticeable to the user. For RTC, I think it already handles switching from software to hardware: https://cs.chromium.org/chromium/src/content/renderer/media/gpu/rtc_video_decoder.cc?l=221 https://cs.chromium.org/chromium/src/third_party/webrtc/media/engine/videodecodersoftwarefallbackwrapper.cc
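For reference, the wrapper linked above follows a try-hardware-then-stick-with-software pattern; a simplified, hypothetical sketch of that pattern (not the actual webrtc::VideoDecoder interface):

#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical minimal decoder interface, just to illustrate the pattern.
class Decoder {
 public:
  virtual ~Decoder() = default;
  virtual bool Decode(const std::vector<uint8_t>& encoded_frame) = 0;
};

// Try the hardware decoder first; once it fails, stick with software.
class SoftwareFallbackWrapper : public Decoder {
 public:
  SoftwareFallbackWrapper(std::unique_ptr<Decoder> hw,
                          std::unique_ptr<Decoder> sw)
      : hw_(std::move(hw)), sw_(std::move(sw)) {}

  bool Decode(const std::vector<uint8_t>& encoded_frame) override {
    if (!use_sw_ && hw_->Decode(encoded_frame))
      return true;
    use_sw_ = true;
    return sw_->Decode(encoded_frame);
  }

 private:
  std::unique_ptr<Decoder> hw_;
  std::unique_ptr<Decoder> sw_;
  bool use_sw_ = false;
};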
Jul 17
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/fbf1006e21d5ec242a5c0cc18d546aa12af214e3

commit fbf1006e21d5ec242a5c0cc18d546aa12af214e3
Author: Dan Sanders <sandersd@chromium.org>
Date: Tue Jul 17 19:56:48 2018

[media] Add media::*Decoder::IsPlatformDecoder().

Add IsPlatformDecoder() API to media::VideoDecoder, and also to
media::AudioDecoder for symmetry. This value will be used by
media::DecoderStream to decide when to fall back to software decode for
low-resolution video.

Bug: 684792
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I778ebdcaa84a9cabfc74be9729969db53b2aad7a
Reviewed-on: https://chromium-review.googlesource.com/1139097
Commit-Queue: Dan Sanders <sandersd@chromium.org>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#575749}

[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/content/browser/media/media_internals.cc
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/base/audio_decoder.cc
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/base/audio_decoder.h
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/base/video_decoder.cc
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/base/video_decoder.h
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/filters/decoder_stream.cc
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/filters/gpu_video_decoder.cc
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/filters/gpu_video_decoder.h
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/mojo/clients/mojo_audio_decoder.cc
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/mojo/clients/mojo_audio_decoder.h
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/mojo/clients/mojo_video_decoder.cc
[modify] https://crrev.com/fbf1006e21d5ec242a5c0cc18d546aa12af214e3/media/mojo/clients/mojo_video_decoder.h
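Per the commit message, IsPlatformDecoder() is meant to let DecoderStream prefer software decode for low-resolution video. A rough sketch of how a selection step could use it; this is not the actual DecoderStream code, and the threshold parameter is a placeholder:

#include <memory>
#include <vector>

#include "media/base/video_decoder.h"
#include "media/base/video_decoder_config.h"

// Hypothetical selection helper: skip platform (hardware) decoders when the
// configured resolution is below the threshold under discussion.
media::VideoDecoder* PickDecoder(
    const std::vector<std::unique_ptr<media::VideoDecoder>>& decoders,
    const media::VideoDecoderConfig& config,
    int min_platform_height) {  // e.g. 360, to be tuned per platform.
  for (const auto& decoder : decoders) {
    if (decoder->IsPlatformDecoder() &&
        config.coded_size().height() < min_platform_height) {
      continue;  // Prefer a software decoder below the threshold.
    }
    return decoder.get();
  }
  return nullptr;
}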
Jul 18
Switching from the SW to the HW decoder mid-stream is usually a costly operation, which may include opening the device nodes, setting up the drivers, negotiating formats and parameters with the drivers, setting up the hardware codec, powering it on, loading and initializing firmware, setting up the working buffer set, programming IOMMUs, clearing caches, etc.

In addition, depending on the resolution threshold and the particular platform, we may still save power and get better smoothness with HW decoders even at low resolutions (e.g. 360p).

Unless we can somehow infer that the stream will never switch to higher resolutions mid-playback (or, worse, change resolutions repeatedly, causing multiple SW-HW decoder switches), which to my knowledge is not possible in most cases, I think we need to be careful before enabling this optimization, particularly on non-desktop.

Would it perhaps be an option to allow the platform decoder to inform the client (e.g. via a static capability call) whether SW fallback should be done at all, and if so, for what resolutions?
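A sketch of what such a static capability call might look like; the struct, function, and values are hypothetical:

#include "build/build_config.h"
#include "ui/gfx/geometry/size.h"

// Hypothetical per-platform policy reported by the platform decoder.
struct PlatformDecoderPolicy {
  bool allow_software_fallback;
  gfx::Size prefer_software_below;  // Empty means "never prefer software".
};

PlatformDecoderPolicy GetPlatformDecoderPolicy() {
#if defined(OS_ANDROID)
  return {true, gfx::Size(640, 360)};  // Matches the ~360p finding above.
#else
  return {false, gfx::Size()};  // e.g. a platform where HW is cheap even at low res.
#endif
}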
Jul 18
Generally agreed with what Pawel said before. I'd also suggest that we don't over-generalize based on some Android testing, because there is big variance between different hardware and software platforms:
- different codec driver interfaces (VAAPI, V4L2, stateless V4L2), with potentially different overheads and initialization times,
- different CPU performance/power characteristics, not only ARM vs. x86, but also variance between particular ARM implementations,
- integration with the graphics stack: is it always possible to composite software-decoded videos directly, without texture uploads, as we do with hardware-decoded videos?
and probably more. If we decide that this optimization is worthwhile on non-desktop (specifically Chrome OS) systems, we would definitely want some knobs to tune the behavior to particular hardware platforms.
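One way to get such knobs would be Finch field-trial parameters; the feature and parameter names below are made up for illustration:

#include "base/feature_list.h"
#include "base/metrics/field_trial_params.h"

// Hypothetical feature with a tunable threshold, so the cutoff can be
// adjusted (or disabled) per hardware platform via Finch.
const base::Feature kPreferSoftwareForLowResVideo{
    "PreferSoftwareForLowResVideo", base::FEATURE_DISABLED_BY_DEFAULT};

const base::FeatureParam<int> kMinHardwareDecodeHeight{
    &kPreferSoftwareForLowResVideo, "min_hw_decode_height", 360};

bool PreferSoftwareDecode(int coded_height) {
  return base::FeatureList::IsEnabled(kPreferSoftwareForLowResVideo) &&
         coded_height < kMinHardwareDecodeHeight.Get();
}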
Jul 19
How many cases of switching resolution mid-stream up from thumbnail size do we see that are not WebRTC use cases? IOW, I think the majority of non-RTC video playback cases would start at 240p at least and move up from there (think YT). Only RTC would progress from a postage-stamp size up to 720p. The GPU could provide the minimum desired resolution for a given HW decoder as a hint; RTC would ignore it and "normal" video playback would honour it. Does that make sense?
Jul 19
It's currently the case that if a stream dips below a hardware decoder's minimum resolution, we fall back to software decode and never transition back to hardware decode, even if the resolution increases. We know that on Mac OS X we don't want to use the hardware decoder below 480p, but actually specifying that in SupportedProfiles would trigger this issue constantly (basically on all YouTube playbacks). While we intend to experiment to determine whether we can set higher thresholds to get power savings without introducing unacceptable latency, the goal of the current work is specifically to implement the 'fall forward' feature.
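A minimal sketch of the 'fall forward' decision, assuming the threshold comes from the platform decoder's reported minimum resolution; names are illustrative only:

#include "ui/gfx/geometry/size.h"

enum class DecoderChoice { kPlatform, kSoftware };

// Hypothetical decision, re-evaluated on every config change: drop to
// software below the platform decoder's minimum, and return to the platform
// decoder once the stream rises back above it ("fall forward").
DecoderChoice ChooseOnConfigChange(const gfx::Size& new_coded_size,
                                   const gfx::Size& platform_minimum) {
  if (new_coded_size.width() < platform_minimum.width() ||
      new_coded_size.height() < platform_minimum.height()) {
    return DecoderChoice::kSoftware;
  }
  return DecoderChoice::kPlatform;
}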
Jul 19
On the latest-gen Intel CrOS devices (soraka, eve, etc.), for 720p VP9 we see that the power consumption of the SW decoder (VpxVideoDecoder) is lower than that of the HW decoder (both with overlays engaged); see the _pkg_pwr metrics in: https://chromeperf.appspot.com/report?sid=578d94b340109e518b217c266f04fde1867ebbf4d8df659f572f67b5e487a527
Jul 22
It looks like HW decode bumps gfx power significantly, which I guess is expected, since the decoder on Intel platforms is part of the GPU. On the other hand, it might be interesting to check whether there is any inefficiency (or a bug) in GPU runtime PM under light (or non-gfx) loads (decode + hw overlays). I tried looking at some ARM boards in that test, but couldn't find any that included power consumption results. I wouldn't be surprised if it was much more efficient there, because codecs on ARM platforms are normally separate IP blocks that don't require the GPU to be running and use V4L2 drivers that are normally designed with proper runtime PM in mind.
Aug 15
Do you want to use this bug for that block? If so, we need to make this Pri-1 M70.
Aug 15
I should probably split this into a new bug, but I didn't want to lose the context in this thread.