CopyOutputResult should be using SharedBitmapId instead of SkBitmap |
||||||||||||
Issue descriptionRight now, CopyOutputResult has a SkBitmap member which is going to be very costly to serialize and send over IPC. The SkBitmap should be replaced with a SharedBitmapId.
,
Feb 10 2017
,
Feb 10 2017
,
Feb 10 2017
Samans pointed out that we could use TextureMailbox for this except for the fact that it has a SharedBitmap* instead of a SharedBitmapId. So we could maybe consider making that change now as well. I think that was always the intention with TextureMailbox IIRC.
,
Feb 13 2017
,
Oct 24 2017
+miu@ for feedback, is this still an issue with your work?
,
Oct 24 2017
,
Oct 24 2017
the tab capture path doesn't use bitmap readbacks so this looks independent?
,
Oct 26 2017
Correct. I had a discussion with some mojo folks about this. Basically, whatever infrastructure we have for transferring SkBitmap data across processes (shared memory, data pipes, or other) should be used instead. It's probably worth taking a look at the mojo struct traits implementation for SkBitmap, and go from there.
,
Oct 26 2017
,
Feb 12 2018
Also worth considering: There's a mojo utility called "BigBuffer" that automatically chooses whether to send data inside a message or as a shmem handle + separate shmem: https://cs.chromium.org/chromium/src/mojo/public/cpp/base/big_buffer.h
,
Jul 14
Obsolete. The issue was fixed by someone else, and affects all SkBitmaps over mojo.
,
Jul 16
Can you explain how this is fixed or point to where the copy result SkBitmaps are using shared memory to avoid a big memory copy?
,
Jul 16
The CopyOutputResult::bitmap is a skia.mojom.Bitmap[1]. The Bitmap IDL declares that pixel data is contained in a mojo_base.majom.BigBuffer[2]. Then, BigBuffer struct traits will choose, at runtime, whether to transfer the data inline (in the mojo message), or on-the-side using shared memory[3]. 64KB is the threshold. 1: https://cs.chromium.org/chromium/src/services/viz/public/interfaces/compositing/copy_output_result.mojom?rcl=6f231e660a4c0e528d64852eeedb078f05ab8ad1&l=28 2: https://cs.chromium.org/chromium/src/skia/public/interfaces/bitmap.mojom?rcl=6f231e660a4c0e528d64852eeedb078f05ab8ad1&l=14 3: https://cs.chromium.org/chromium/src/mojo/public/cpp/base/big_buffer.cc?rcl=77da697591c660b08da393ed7ee6659752dd3c8e&l=95
,
Jul 16
It sounds like we still get a copy into the shared memory on the sender and a copy out on the receiver, which is not super different than it being part of the IPC message really except that the bytes don't go directly through an IPC pipe?
,
Jul 16
Correct. If >64KB go through the pipe, the concern is that this would affect the latency of all other IPCs too much: Meaning, it would block all other IPCs from VIZ-->browser while the large amount of pixel data is transferred. I'm not aware of the exact cost/performance analysis around all this, however. That'd be something the mojo core team would need to answer.
,
Jul 16
This bug was to have SoftwareRenderer do its "readback" directly into a sharedbitmap instead of an SkBitmap on the heap, so that we don't have to copy the fullscreen at least twice to ship the result back to the client.
,
Jul 17
OIC. So, there were really two issues here: The "too big over IPC" issue has been resolved. However, there is still some general optimization around "removing extra copies." It's helpful to review how things work now, as there are a few different code paths/use cases to consider: 1. GLRendererCopier + Video: GLRendererCopier's CopyOutputResult is already a one-shot "copy." It holds onto the GL pixel buffer object, and only glBind() and reads from it lazily, when downstream code calls ReadI420Planes() to copy the results directly from the PBO into mojo shmem. 2. GLRendererCopier + single snapshot: GLRendererCopier's CopyOutputResult is passed as input to the viz.mojom.CopyOutputResult struct traits. The struct traits call CopyOutputResult::AsSkBitmap(), which forces a "cached SkBitmap" to be created by copying from the PBO. Then, the skia.mojom.SkBitmap struct traits will copy the data AGAIN from the cached SkBitmap into mojo shmem. Potential solution: GLPixelBufferRGBAResult::AsSkBitmap() should be changed to NOT make a SkBitmap copy, but instead return a "facade" SkBitmap that just wraps the PBO data pointers. That way, the skia.mojom.SkBitmap struct traits would be the only thing making an actual copy, which would be directly from the PBO to the mojo shmem. 3. SoftwareRenderer: Whether for the video or single snapshot use case, we always make a SkBitmap by copying from the SkCanvas. Potential solution: We can eliminate the extra SkBitmap by implementing a subclass of CopyOutputResult that references the SkCanvas's pixmap directly (analogous to GLRendererCopier's direct-from-PBO scheme). I'll assign to myself to try-out these ideas and see if a few 10-liner code changes will do the trick. ;-)
,
Jul 17
If it's possible to give a pointer into a base::SharedMemory at the time of readback then we avoid copy into shm, and we can pass a SharedMemoryHandle instead of an SkBitmap in the CopyOutputResult, then wrap that into an SkBitmap on the client side if it wants to.
,
Jul 17
I had a brief discussion with rockot@ a while ago and he wasn't sure that allocating a shared memory would necessarily be better because the allocation has to go through the browser process. He suggested that we use BigBuffer. However, if the browser allocates shared memory and send it along with the CopyOutputRequest, it could be more efficient but I think we don't always know the memory size required for readback beforehand.
,
Jul 17
If my ideas (in #c18) pan out, we'll be able to just use the skia.mojom.SkBitmap struct traits, which automatically use BigBuffer underneath if needed. This will keep both the client- and server-side code simpler, since we won't have to account for two different answers to the "who provides the result memory" question throughout the pipeline. FWIW, in the single-snapshot cases, latency isn't terribly important, just the CPU/power usage aspect. So, it's okay if BigBuffer needs to go to the browser process to get some shmem. For video, we already do shmem pooling for performance.
,
Dec 18
|
||||||||||||
►
Sign in to add a comment |
||||||||||||
Comment 1 by samans@chromium.org
, Feb 10 2017