We should soon get NV12 framebuffer support in 4.4 kernels for i915 where available.
Once we get NV12 framebuffer support in 4.4 and we land NV12 output for VAAPI decoders with crrev.com/c/569144, we'll be closer to a zero-copy hardware decoded video playback path on VAAPI devices.
The last remaining copy during VAAPI video playback happens in VaapiVideoDecodeAccelerator::OutputPicture, in the DownloadFromSurface call, where a vaSurface is blitted/converted into an output picture.
If the format of the vaSurface produced by the decoder is the same as the one expected by the client of VaapiVideoDecodeAccelerator (likely the case with NV12), we might be able to skip that blit completely.
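To make the idea concrete, here is a minimal sketch of the format check that would decide whether the blit can be skipped. All names in it (PixelFormat, OutputPath, ChooseOutputPath) are hypothetical and made up for illustration; they are not the real libva or Chromium types, and the actual change in VaapiVideoDecodeAccelerator may look quite different.

```cpp
#include <cassert>

// Hypothetical pixel formats for illustration only; these are NOT the
// real libva fourcc values or Chromium's media::VideoPixelFormat enum.
enum class PixelFormat { kNV12, kYV12, kBGRA };

// Hypothetical result of the output step.
enum class OutputPath {
  kZeroCopy,  // Hand the decoded surface to the client as-is.
  kBlit,      // Convert/copy the surface into the client's picture.
};

// Returns kZeroCopy when the decoder's surface format already matches
// what the client expects, so no conversion blit is needed. Any
// mismatch falls back to the existing blit/convert path.
constexpr OutputPath ChooseOutputPath(PixelFormat surface_format,
                                      PixelFormat client_format) {
  return surface_format == client_format ? OutputPath::kZeroCopy
                                         : OutputPath::kBlit;
}

// Small self-check of the two cases discussed above.
inline void CheckOutputPathSelection() {
  // Decoder emits NV12 and the client wants NV12: blit can be skipped.
  assert(ChooseOutputPath(PixelFormat::kNV12, PixelFormat::kNV12) ==
         OutputPath::kZeroCopy);
  // Decoder emits NV12 but the client wants BGRA: blit is still needed.
  assert(ChooseOutputPath(PixelFormat::kNV12, PixelFormat::kBGRA) ==
         OutputPath::kBlit);
}
```

In this sketch the decision is a pure function of the two formats. A real implementation would presumably also have to check that the surface can actually be shared with the client (buffer type, modifiers, lifetime), which is not modeled here.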
Kristian commented out the extra blit to get a rough idea of the achievable power savings; his best-case estimate was on the order of 0.2 W.
An additional benefit of removing the blit, no less important than the power saving, is that the blit currently runs on the GPU main thread and sometimes stalls GPU compositing for large videos (crbug.com/717265).
Comment 1 by posciak@chromium.org, Sep 7 2017