texImage2D from getUserMedia video much slower than H.264 video file on Android
Reported by
si...@zappar.com,
Jul 25
Issue description

Steps to reproduce the problem: I've uploaded the test case here for easier testing: https://tango-bravo.net/WebGLVideoPerformance/perftest.html

When playing a 720p MP4 on Chrome 67 for Android, I get around 2ms for the texImage2D call on my Samsung Galaxy S6. When switching to getUserMedia with 720p constraints, the upload time jumps to over 18ms. On desktop OS X, Chrome gives similar timings for both cases (roughly 0.6ms from video, and 0.5ms from user media).

What is the expected behavior? What went wrong? I suspect Chrome is using some slow path internally for user-media videos, whereas it has a GPU-only path for the MP4 case.

Did this work before? N/A
Does this work in other browsers? N/A
Chrome version: 67.0.3396.87 Channel: n/a

Some additional background: now that getUserMedia, WebGL and WebAssembly are supported on both Android and iOS, I'm investigating whether the web could be a suitable runtime platform for our Augmented Reality platform. Our image tracking algorithms are relatively efficient and can run in real time on mobile via WebAssembly, but they are designed for greyscale-only images. In a native app we get YUV from the camera and can process the Y plane directly, whereas on the web, as far as I can tell, RGB is the only option. I was hoping texImage2D from a <video> would provide a hardware-accelerated and relatively low-overhead route to getting the latest frame into a texture, which would allow greyscale conversion (and half-sampling) to happen in a shader. I'd be happy to wait a frame before reading back the data on the WebAssembly side, to minimise the GPU-CPU sync point of the readPixels [and with WebGL 2, fence sync objects and a pixel buffer object for the readPixels would also be something I'd look into]. Unfortunately, the overhead of the texImage2D call with user media appears just too high for this to be a viable route.
I'll have to do some tests with the other read-back methods (drawImage to a 2D canvas followed by getImageData, which feels like a very high-overhead method but is perhaps specifically optimised, or the Image Capture grabFrame() that should be supported by Chrome). Either way, the greyscale conversion will be a tradeoff between the lower performance of WebAssembly without SIMD and the overheads, memory bandwidth and CPU/GPU sync points of just uploading the full-res RGB image as an ArrayBuffer from the JS side.
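For reference, the upload cost in a page like the test case above can be measured by timing each blocking texImage2D call. This is a hypothetical sketch only (the real test page may differ); `gl`, `tex` and `video` are assumed to be set up elsewhere, and the WebGL portion is browser-only.

```javascript
// Timing bookkeeping (plain JS, runs anywhere):
function makeUploadTimer() {
  let count = 0, total = 0, max = 0;
  return {
    record(ms) { count++; total += ms; if (ms > max) max = ms; },
    get mean() { return count ? total / count : 0; },
    get max() { return max; },
  };
}

// Browser-only portion; gl, tex and video are assumed to exist:
if (typeof document !== 'undefined') {
  const timer = makeUploadTimer();
  function frame() {
    const t0 = performance.now();
    gl.bindTexture(gl.TEXTURE_2D, tex);
    // Upload the current video frame; only the blocking CPU-side
    // portion of the call is captured by this timing.
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);
    timer.record(performance.now() - t0);
    requestAnimationFrame(frame);
  }
  requestAnimationFrame(frame);
}
```

Note that this only measures the synchronous cost on the main thread; any work deferred to the GPU process is invisible to it.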
Jul 25
...it would also be nice to know if the frame has changed since the last upload to avoid doing pointless work but I can't see any web APIs to access that data.
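Lacking such an API, one workaround (a heuristic only, not a real "new frame" signal) is to compare the video element's currentTime between animation frames and skip the upload when it hasn't advanced. A sketch of the decision logic:

```javascript
// Returns a function that answers "has the frame (probably) changed
// since last time?" based on the video's currentTime.
function makeFrameGate() {
  let lastTime = -1;
  return function shouldUpload(currentTime) {
    if (currentTime === lastTime) return false; // same timestamp: skip work
    lastTime = currentTime;
    return true;
  };
}
```

In the browser this would be called once per requestAnimationFrame tick with `video.currentTime`; it can miss or double-count frames depending on how often the implementation updates currentTime, which is exactly why a real per-frame signal would be nicer.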
Jul 26
Tested this issue on Android and was able to reproduce it. Steps followed:
1. Launched Chrome and navigated to https://tango-bravo.net/WebGLVideoPerformance/perftest.html
2. Clicked play and observed an upload time of 0.57ms (max).
3. Clicked getUserMedia, clicked Allow, and observed an upload time of 4.05ms (max).

Observations: On Mac, seeing 0.3 to 0.5ms whether selecting the .mp4 or getUserMedia; no difference observed.
Chrome versions tested: 60.0.3072.0, 67.0.3396.87 (stable), 70.0.3501.0 (canary)
OS: Android 9.0
Android devices: Pixel 2 XL

This seems to be a non-regression issue, as the same behavior is seen from M-60 builds. Leaving the issue as Untriaged for further input. Please navigate to the link below for logs: go/chrome-androidlogs/867368 Thanks!
Jul 26
Kai, could you build on Android and see which code path is being taken for texImage2D in this case?
Jul 26
Nexus 6P, 8.1.0, Canary 70.0.3503.0: 2.7ms -> 10ms
Pixel 1, 9.0.0 (PPP5), Canary 70.0.3503.0: 2.2ms -> 4.2ms
It's interesting that the difference is so large on the 6P and Pixel 2 but not quite as huge on the Pixel 1. Regardless, it seems to reproduce on both of these.
Jul 27
Oh yeah, forgot to say: Re: comment #2, you can star issue 639174. If you want to try out what we've prototyped there (warning: it won't ship as is!) you can enable chrome://flags/#enable-experimental-canvas-features and you should be able to access the last-uploaded frame's metadata via these properties of the WebGLTexture object: https://chromium.googlesource.com/chromium/src/+/master/third_party/blink/renderer/modules/webgl/webgl_texture.idl If you have any feedback on whether that works or doesn't work for you, please comment on that issue.
Jul 27
Sorry, that's not the right flag. You need to start Chrome with --enable-blink-features=ExtraWebGLVideoTextureMetadata
Jul 27
Thanks for the notes on avoiding re-processing; I've starred that issue and added a couple of comments there. I've taken a quick look at Chrome's Android camera implementations (assuming I've found the right place?): https://chromium.googlesource.com/chromium/src.git/+/master/media/capture/video/android/java/src/org/chromium/media I notice you're using the CPU callbacks for both the legacy Camera and the Camera2 APIs. Is that due to Chrome's multi-process architecture making it impossible to use a SurfaceTexture there? It looks like there's some CPU-side colour conversion too, which is going to add overhead. Is this the same with video files, or do they somehow manage to stay as SurfaceTextures throughout?
Jul 27
Thanks for the pointer, I haven't investigated that area yet. What I have determined so far is basically summarized by the attached screenshot. It looks like time is going into a CPU-side color/format-conversion (I420 to RGB) and subsequent upload to the GPU. Indeed, I don't know why the data starts out on the CPU, but it seems you've found some insight on that.
Jul 31
What appears to be happening here is that the I420 video frames come in on the browser process and are shared with the renderer process via shared memory. The renderer then does a CPU I420-to-RGB decode, uploads the result to a texture, and then does a GPU-GPU copy from that texture to the target WebGL texture.

The first possible solution is to upload the YUV planes into GPU resources first, and then do a GPU decode. Optimally:
- The data is uploaded directly from shared memory in the GPU process, avoiding an extra copy into a transfer buffer.
- The GPU decode is done directly into the WebGL texture.

The second solution would be to have the video frames start out as textures rather than CPU memory. As #9 pointed out, it should be possible to get video frames as GL textures directly from Camera2. However this could be a big project, since the camera would have to be accessed from the GPU process instead of the browser process. I do think this would be the ideal solution, though.

+chfremer, mcasas: do you have any input on this? I don't know for sure that something isn't going wrong - maybe we're not supposed to be hitting the shared-memory path. Would it be feasible to get camera frames as textures in the GPU process? Happy to VC to better understand the situation here - let me know.
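To make the CPU-side cost concrete, here is a minimal sketch of the per-pixel work an I420-to-RGB decode involves (BT.601-style constants; Chromium's real conversion lives in libyuv and is heavily SIMD-optimized, so treat this as an illustration of the work, not the actual implementation):

```javascript
// Convert planar I420 (Y at full res, U/V subsampled 2x2) to packed RGB.
function i420ToRgb(y, u, v, width, height) {
  const rgb = new Uint8ClampedArray(width * height * 3);
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const Y = y[row * width + col];
      // U and V planes are quarter-size: one chroma sample per 2x2 block.
      const ci = (row >> 1) * (width >> 1) + (col >> 1);
      const U = u[ci] - 128, V = v[ci] - 128;
      const o = (row * width + col) * 3;
      rgb[o]     = Y + 1.402 * V;                       // R
      rgb[o + 1] = Y - 0.344136 * U - 0.714136 * V;     // G
      rgb[o + 2] = Y + 1.772 * U;                       // B
    }
  }
  return rgb;
}
```

At 1280x720 this touches ~0.9M luma samples plus chroma per frame, every frame, on the CPU, which is exactly the work a GPU decode would absorb.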
Jul 31
Adding support for capturing directly into textures is definitely something we are interested in. And you are right, this would probably be a bigger effort. A question you already raised is whether the camera would have to be accessed from the GPU process, or whether we could get away with allocating textures in the GPU process and asking the Camera2 API to fill them for us (from a different process). Another issue that needs to be addressed is that the Chromium video capture stack does not currently have the signals or logic for deciding whether to capture into CPU memory, a GpuMemoryBuffer, or GL textures.
Jul 31
FYI, the latest trace:
Jul 31
Thanks Christian, sounds like it would be a pretty big project. I'll keep looking into the first solution; it should be pretty tractable (I have most of a prototype already, though I don't yet know if it works).
Aug 1
A GPU conversion path for texImage2D from I420 CPU data does make sense and should offer a significant win. I feel the SurfaceTexture route is a better long-term solution though, and it seems like some of the plumbing is already in place for hardware video decode.

I did quite a bit of reading around Chrome architecture docs and source at the weekend. See for example this page: https://www.chromium.org/developers/design-documents/video-playback-and-compositor There's a content::StreamTextureProxyImpl on Android that appears to plumb a SurfaceTexture between the required layers. I also read that on Android there isn't a separate GPU process; the threads for the GPU process run in the browser process. Not sure where the capture code runs, though?

#12 mentioned Camera2 can use a SurfaceTexture for output, but just for clarity this is also possible in the old camera API, and you're already setting a "dummy" one (either a SurfaceTexture or SurfaceView is required for the old API): https://chromium.googlesource.com/chromium/src.git/+/master/media/capture/video/android/java/src/org/chromium/media/VideoCaptureCamera.java#389 One other advantage of SurfaceTexture output with the old camera API: you also get timestamps, which are not accessible via the CPU callback route.

I'd expect the path to WebGL to be significantly faster if the camera frames are sent directly to a SurfaceTexture. It would be interesting to see a trace of the texImage2D from the file source, just to see the difference. In terms of other code paths, I suspect the compositor and accelerated 2D canvas would both prefer the data in a SurfaceTexture. I don't know the WebRTC spec well, so I'm not sure what is required there - but if Android hardware encoding of the video is used, there is definitely an interface to pump data in via a SurfaceTexture there too, which should avoid any need for CPU read-back of the pixels.

The only read-back I can think of where CPU data is definitely required is grabFrame() from the MediaStream Image Capture spec. Even then a YUV -> RGB conversion is still required, so CPU data combined with CPU color conversion won't necessarily beat SurfaceTexture data, GPU color conversion, and read-back from the GPU.
Aug 1
Just to chime in: in all likelihood Android captures video into some sort of GPU-side buffer, which we then proceed to download to the CPU [1]. That's not good for performance. The frame is then copied again [2] (but not a third time, phew [3]). So we get a GPU-friendly buffer from capture, which we then download and copy to shared memory, only to re-upload it to the GPU for display (usually we encode it as well).

simon@: please keep in mind also crbug.com/793301 - migrating to use SurfaceLayer instead of VideoLayer. That bug does not apply to MediaStreams (live feeds), but a similar concept will be applied to WebMediaPlayerMS soon afterwards.

[1] https://cs.chromium.org/chromium/src/media/capture/video/android/java/src/org/chromium/media/VideoCaptureCamera2.java?q=videocapturecamera2.java&sq=package:chromium&dr&l=158
[2] https://cs.chromium.org/chromium/src/media/capture/video/android/video_capture_device_android.cc?type=cs&q=libyuv+file:%5Esrc/media/capture/video/+package:%5Echromium$&g=0&l=331
[3] https://cs.chromium.org/chromium/src/media/capture/video/video_capture_device_client.cc?type=cs&q=libyuv+file:%5Esrc/media/capture/video/+package:%5Echromium$&g=0&l=293
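Back-of-envelope arithmetic for that chain at 720p/30fps makes the cost visible. This assumes (per the description above) one GPU-to-CPU download plus one CPU-side copy of each I420 frame, followed by one RGBA re-upload; the exact hop count is an assumption for illustration:

```javascript
// Frame sizes in bytes.
function i420Bytes(w, h) { return w * h + 2 * (w / 2) * (h / 2); } // Y + U + V planes
function rgbaBytes(w, h) { return w * h * 4; }

const w = 1280, h = 720, fps = 30;
// Assumed hops: download I420 + copy I420, then upload as RGBA.
const perFrame = 2 * i420Bytes(w, h) + rgbaBytes(w, h);
const mbPerSec = perFrame * fps / 1e6;
```

That works out to roughly 190-200 MB/s of memory traffic just to move one camera stream around, before any actual processing happens.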
Aug 1
Even without the major improvement of keeping camera data on the GPU, we get a huge win here by moving the YUV decode to the GPU, even with the extraneous copies. It would seem that the slow code path here (CanvasResourceProvider+PaintCurrentFrame+StaticBitmapImage::CopyToTexture), which was supposed to accelerate some video upload cases, may no longer be serving its purpose. It was added in https://crrev.com/565743003 , before the CopyVideoTextureToPlatformTexture path existed at all. I am pretty sure that nowadays, the CanvasResourceProvider path is only ever hitting the non-accelerated case, and therefore isn't providing any benefit over the last fallback case (VideoFrameToImage+TexImageImpl, which is actually a CanvasResourceProvider+PaintCurrentFrame+WebGLRenderingContextBase::TexImageImpl). I'm looking into whether we can remove that path, which would be nice, and add something like the one I prototyped (which does work, by the way).
Aug 6
chfremer/mcasas: I have a WIP CL here: https://chromium-review.googlesource.com/c/chromium/src/+/1161606 However I'm still struggling to figure out how to test it. Do you have pixel tests, layout tests, or anything else which are able to exercise this kind of media stream case (where video frames come in STORAGE_SHMEM)?
Aug 6
+ emircan@ I don't know what test coverage there is for webmediaplayer_ms.
Aug 6
#19: any local video capture and playback would exercise WMPMS using ShMem storage (it's the default, except in code paths/platforms where emircan@ might have connected the GpuMemoryBufferVideoFramePool). LayoutTests probably don't exercise this renderer code, but content_browsertests and/or browser_tests starting with WebRtcGetUserMedia should do it (they draw whatever is produced by a FakeVideoCaptureDevice). You should be able to repro those by running Chrome with --use-fake-device-for-media-stream --use-fake-ui-for-media-stream (the second one avoids the permission prompt), e.g. with the site https://webrtc.github.io/samples/src/content/getusermedia/gum/
Aug 6
mcasas@ thanks for the pointers. Can we ask for some more pointers to what those flags can do? It looks like --use-fake-device-for-media-stream is mainly used with browser tests which mock out the video capture device like https://cs.chromium.org/chromium/src/content/browser/renderer_host/media/video_capture_browsertest.cc . Is it possible (for example) to give the browser a video file and tell it to treat it as though it were input from a camera? Or what else can --use-fake-device-for-media-stream do?
Aug 7
--use-fake-device-for-media-stream replaces any webcams or capture devices in the system with a Chromium one that looks like a rolling pacman with a timer (it also produces a beep every second).

--use-file-for-fake-video-capture=bla.y4m can be used to replace the system capture devices with a file (which plays in a loop, forever). The accepted file format is a subset of the Y4M container [0], which is essentially uncompressed video frames in I420 format. Any of the 420 files in e.g. [1] should work (but not the 422 or 444 ones).

Fake or file, however, the browser converts any incoming data to I420 triplanar (or fully planar, depending on your terminology) before sending the VideoFrames to the renderer, so that's all the latter sees anyway. (*)

(*) Not entirely true: depth capture produces and sends Y16, but that's a smaller use case and not supported by the file flag; there's a way to configure --use-fake... to produce depth-like frames - let me know if you guys want that path.

[0] https://wiki.multimedia.cx/index.php?title=YUV4MPEG2
[1] https://media.xiph.org/video/derf/y4m/
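For reference when preparing --use-file-for-fake-video-capture input, the YUV4MPEG2 header line from [0] can be parsed along these lines (illustrative sketch only; the subset Chromium actually accepts is defined by its own y4m parser, not this):

```javascript
// Parse the first line of a Y4M file, e.g.
// "YUV4MPEG2 W1280 H720 F30:1 Ip A1:1 C420jpeg".
function parseY4mHeader(line) {
  const parts = line.trim().split(' ');
  if (parts[0] !== 'YUV4MPEG2') throw new Error('not a Y4M file');
  const out = {};
  for (const p of parts.slice(1)) {
    const tag = p[0], val = p.slice(1);
    if (tag === 'W') out.width = parseInt(val, 10);        // frame width
    else if (tag === 'H') out.height = parseInt(val, 10);  // frame height
    else if (tag === 'C') out.colourSpace = val;           // e.g. 420jpeg, 422
    // F (framerate), I (interlacing), A (aspect) are ignored here.
  }
  return out;
}
```

So for the example header above, width is 1280, height is 720 and the colour space tag is "420jpeg" (a 4:2:0 variant, i.e. one of the accepted 420 formats).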
Aug 10
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1236a1623e8f1eb638e33d3d145739fe1ca081d4

commit 1236a1623e8f1eb638e33d3d145739fe1ca081d4
Author: Kai Ninomiya <kainino@chromium.org>
Date: Fri Aug 10 20:50:39 2018

Remove qualcomm from ToughWebglPage skipped_gpus

Historically, we haven't been running these WebGL perf tests on our Android perf bots (which are all Qualcomm devices). This change should hopefully allow us to start tracking WebGL perf on mobile.

Bug: 867368
Change-Id: I8e409d649f6238094928dfdac4cc6f7d2c444ca5
Reviewed-on: https://chromium-review.googlesource.com/1170032
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Kai Ninomiya <kainino@chromium.org>
Cr-Commit-Position: refs/heads/master@{#582326}

[modify] https://crrev.com/1236a1623e8f1eb638e33d3d145739fe1ca081d4/tools/perf/page_sets/rendering/tough_webgl_cases.py
Aug 11
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/074cecb79d9d30ec9f2bf6b5b10edb6cd77a8b4f

commit 074cecb79d9d30ec9f2bf6b5b10edb6cd77a8b4f
Author: Kai Ninomiya <kainino@chromium.org>
Date: Sat Aug 11 01:26:13 2018

Normalize rendering_desktop.json, rendering_mobile.json

This is needed because re-recording archives (using record_wpr) causes these files to get re-alphabetized automatically. This commit lets us re-record an archive without producing an unreadable diff. Doing this also removes the duplicate key "androidpolice_mobile_sync_scroll_2018" from rendering_mobile.json. Tangentially, also delete the unused archive rendering_mobile_007.wprgo.sha1.

Bug: 872551, 867368
Change-Id: Ib15c9c0a8600e7fa510a146ad896aa92e4f9de45
Reviewed-on: https://chromium-review.googlesource.com/1171677
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Kai Ninomiya <kainino@chromium.org>
Cr-Commit-Position: refs/heads/master@{#582414}

[modify] https://crrev.com/074cecb79d9d30ec9f2bf6b5b10edb6cd77a8b4f/tools/perf/page_sets/data/rendering_desktop.json
[modify] https://crrev.com/074cecb79d9d30ec9f2bf6b5b10edb6cd77a8b4f/tools/perf/page_sets/data/rendering_mobile.json
[delete] https://crrev.com/6fec3aebc5358345210d473c0ec36fef86ba5937/tools/perf/page_sets/data/rendering_mobile_007.wprgo.sha1
Aug 13
It looks like the aquarium benchmark is running on the Nexus 5X perf bot. See: https://ci.chromium.org/buildbot/chromium.perf/android-nexus5x-perf/199 and in particular shard #1: https://chrome-swarming.appspot.com/task?id=3f4d008f7a799710&refresh=10&request_detail=true&show_raw=1

[ RUN ] rendering.mobile/aquarium
...
[ OK ] rendering.mobile/aquarium (38475 ms)
(INFO) 2018-08-13 17:13:26,608 cloud_storage.Insert:383 Uploading /b/swarming/w/itTAKiK3/tmpZJ4_6J.html to gs://chrome-telemetry-output/aquarium_2018-08-13_17-08-35_86413.html
View generated trace files online at https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/aquarium_2018-08-13_17-08-35_86413.html for story aquarium
...
Uploading logs of page aquarium to https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/39dc0dfa-9f1c-11e8-b390-0242ac110004 (1 out of 1)

but later in the log:

(CRITICAL) 2018-08-13 17:16:38,755 story_runner.RunBenchmark:377 Benchmark execution interrupted by a fatal exception: <class 'devil.android.device_errors.CommandTimeoutError'>(Timed out waiting for 1 of 1 threads.)
[ RUN ] rendering.mobile/aquarium
===== SKIPPING TEST aquarium: Telemetry interrupted =====
[ SKIPPED ] rendering.mobile/aquarium (0 ms)

I don't know how to read these logs. Is this benchmark running or not? Speed team, can you please help?
Aug 13
Note that there was some sort of catastrophic failure which caused a bunch of the benchmarks to fail to upload results. Here's the log excerpt from the point where the aquarium benchmark successfully ran, to the point where the harness claimed that it was skipped.
Aug 14
Ken, the benchmark ran successfully (see the "json.output" link in https://ci.chromium.org/buildbot/chromium.perf/android-nexus5x-perf/199). To double-check the result of the aquarium test, I can find "aquarium" in the "rendering.mobile": {"perf_results"} entry in the "Results Dashboard Upload Failure ..." link: https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2Fandroid-nexus5x-perf%2F199%2F%2B%2Frendering.mobile

Both the "Results Dashboard Upload..." and "Merge script log" links confirm that we failed to upload rendering.mobile. The "Merge script log" link also shows that the failure is due to a 500 error, which is tracked in issue 867379.

I also looked at the SKIPPING TEST issue; it was a known problem, and only "aquarium_20k" was skipped, not the "aquarium" test: ===== SKIPPING TEST aquarium_20k: crbug.com/850295 ===== I cannot find "===== SKIPPING TEST aquarium: Telemetry interrupted ====="; it was probably in another build run?
Aug 14
Also, the uploading error is flaky, so we do have some recent data for the aquarium test: https://chromeperf.appspot.com/report?sid=73fedbd4ee79925016f3988ce8f7118579d82aa48d434dd45efe96f45858dc4e
Aug 17
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8ae85144bca45f0f0897d3deb3fc998e360e439b

commit 8ae85144bca45f0f0897d3deb3fc998e360e439b
Author: Kai Ninomiya <kainino@chromium.org>
Date: Fri Aug 17 04:22:10 2018

Add camera_to_webgl perf test to ToughWebglCases

This adds a new perf test case for camera-to-WebGL uploads. This new case is intended to detect the performance improvements being made in issue 867368.

Bug: 867368
Change-Id: Ia8b05f422e15bb7625491ac10f33903dcc1c55d0
Reviewed-on: https://chromium-review.googlesource.com/1170033
Reviewed-by: Sadrul Chowdhury <sadrul@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Kai Ninomiya <kainino@chromium.org>
Cr-Commit-Position: refs/heads/master@{#583957}

[modify] https://crrev.com/8ae85144bca45f0f0897d3deb3fc998e360e439b/tools/perf/page_sets/data/rendering_desktop.json
[delete] https://crrev.com/10c0c8c01c8eea683478dabc3e3308edc3f698c4/tools/perf/page_sets/data/rendering_desktop_006.wprgo.sha1
[add] https://crrev.com/8ae85144bca45f0f0897d3deb3fc998e360e439b/tools/perf/page_sets/data/rendering_desktop_011.wprgo.sha1
[modify] https://crrev.com/8ae85144bca45f0f0897d3deb3fc998e360e439b/tools/perf/page_sets/data/rendering_mobile.json
[delete] https://crrev.com/10c0c8c01c8eea683478dabc3e3308edc3f698c4/tools/perf/page_sets/data/rendering_mobile_006.wprgo.sha1
[add] https://crrev.com/8ae85144bca45f0f0897d3deb3fc998e360e439b/tools/perf/page_sets/data/rendering_mobile_026.wprgo.sha1
[modify] https://crrev.com/8ae85144bca45f0f0897d3deb3fc998e360e439b/tools/perf/page_sets/rendering/rendering_stories.py
[modify] https://crrev.com/8ae85144bca45f0f0897d3deb3fc998e360e439b/tools/perf/page_sets/rendering/story_tags.py
[modify] https://crrev.com/8ae85144bca45f0f0897d3deb3fc998e360e439b/tools/perf/page_sets/rendering/tough_webgl_cases.py
Aug 20
It looks like camera_to_webgl is running successfully on the android-nexus5x-perf bot: https://chromeperf.appspot.com/report?sid=4149e5ad04d39b7fd1199445a4b80079d147071766b857ea30b776813701d693
Aug 20
The data for avg_surface_fps doesn't seem right (stuck at 1.0), but the frame_times_avg that Ned linked at some point looks good: https://chromeperf.appspot.com/report?sid=527d34bd1ecfaf4012e23f19fc276f8d96ba573794bf7ba6b71d6ec8f820b391 This is probably the graph I'll be watching after I land crrev.com/c/1161606 .
Aug 23
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/321904d3d1c4675cfe9d25b1c30f528efcf869e4

commit 321904d3d1c4675cfe9d25b1c30f528efcf869e4
Author: Kai Ninomiya <kainino@chromium.org>
Date: Thu Aug 23 00:05:49 2018

Add optimized path for YUV-to-WebGL, remove old path

For camera-to-WebGL on Nexus 6P at 720p, this improves blocking texImage2D time from ~12ms to ~4ms (200% speedup). On some other devices and resolutions, I think there can be up to a ~10x speedup.

* Adds an optimized upload path for CPU-side YUV video frames (e.g. those coming from a video camera on Android). This path uploads the individual Y/U/V textures to the GPU, performs a GPU YUV-RGB decode, and copies the result into the WebGL texture. This code path could potentially be further optimized in 2 ways:
  * Avoid the extra copy of the CPU-side YUV data from browser-renderer shared memory (VideoFrame::STORAGE_SHMEM) to renderer-gpu shared memory (transfer buffer, probably).
  * Avoid the extra copy from the decoded image (SkImage) into the WebGL texture, and instead decode directly into the WebGL texture.
* Removes an old GPU-GPU path that was obsoleted by CopyVideoTextureToPlatformTexture. This obsolete path was only handling CPU-GPU uploads instead of GPU-GPU uploads, and it was doing so by performing an expensive YUV-RGB conversion on the CPU.

This also allowed some cleanup in TexImageHelperHTMLVideoElement.

Bug: 867368
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Id25d5dbfc76ec8f9dc606890588a20978f6943f6
Reviewed-on: https://chromium-review.googlesource.com/1161606
Commit-Queue: Kai Ninomiya <kainino@chromium.org>
Reviewed-by: Mounir Lamouri <mlamouri@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Dan Sanders <sandersd@chromium.org>
Cr-Commit-Position: refs/heads/master@{#585322}

[modify] https://crrev.com/321904d3d1c4675cfe9d25b1c30f528efcf869e4/content/renderer/media/stream/webmediaplayer_ms.cc
[modify] https://crrev.com/321904d3d1c4675cfe9d25b1c30f528efcf869e4/content/renderer/media/stream/webmediaplayer_ms.h
[modify] https://crrev.com/321904d3d1c4675cfe9d25b1c30f528efcf869e4/media/renderers/paint_canvas_video_renderer.cc
[modify] https://crrev.com/321904d3d1c4675cfe9d25b1c30f528efcf869e4/media/renderers/paint_canvas_video_renderer.h
[modify] https://crrev.com/321904d3d1c4675cfe9d25b1c30f528efcf869e4/third_party/blink/public/platform/web_media_player.h
[modify] https://crrev.com/321904d3d1c4675cfe9d25b1c30f528efcf869e4/third_party/blink/renderer/core/html/media/html_video_element.cc
[modify] https://crrev.com/321904d3d1c4675cfe9d25b1c30f528efcf869e4/third_party/blink/renderer/core/html/media/html_video_element.h
[modify] https://crrev.com/321904d3d1c4675cfe9d25b1c30f528efcf869e4/third_party/blink/renderer/modules/webgl/webgl_rendering_context_base.cc
Aug 23
Fix is in - I'll check tomorrow to see if it has shown up in the perf results yet.
Aug 23
The NextAction date has arrived: 2018-08-23
Aug 23
📉👍
Aug 24
Excellent work Kai! The performance improvement on your new test is superb!
Aug 24
Thanks for all the work on this; it sounds like a big step forward. Is there a way to opt in to Canary builds on Android so I can try this out on my test case?

Shifting the YUV conversion to the GPU will definitely free up the main JavaScript thread sooner (which is the most important thing for CPU-intensive applications such as mine). The downside is that I imagine it becomes significantly harder to measure the overall overhead, as some of it will now be in the GPU process.

I've been doing a lot of detailed systrace investigation of the Android camera pipeline and noticed that the CPU callbacks are associated with pretty high overhead within the system-level "camera server" process (at least on the Galaxy S8). Even though I need greyscale data on the CPU in my native app, I'm getting significantly better overall performance using the SurfaceTexture camera interface and doing the RGB -> greyscale conversion in a shader followed by a glReadPixels. I'd suspect many Chromium use cases don't actually require the data on the CPU side at all, so a SurfaceTexture-throughout pipeline will almost certainly give the lowest overhead overall and would be worth pursuing IMHO.

Finally, I just want to thank you all for the openness and responsiveness here; it's much appreciated and such a massive difference from how Apple treats bug reports :)
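The shader route described above (RGB -> greyscale in a fragment shader, then reading back only luma) can be sketched as follows. The shader source and its uniform/varying names are hypothetical, not from any real codebase here; the JS function mirrors the same Rec.601 weights so the math is checkable outside a browser:

```javascript
// Fragment shader (GLSL ES 1.0) reducing the camera texture to luma.
// uCamera/vUV are assumed names; compiling it requires a GL context.
const GREY_FRAG = `
precision mediump float;
uniform sampler2D uCamera;
varying vec2 vUV;
void main() {
  vec3 rgb = texture2D(uCamera, vUV).rgb;
  float luma = dot(rgb, vec3(0.299, 0.587, 0.114)); // Rec.601 weights
  gl_FragColor = vec4(vec3(luma), 1.0);
}`;

// The same weighting in plain JS, for normalized [0,1] channel values.
function luma601(r, g, b) {
  return 0.299 * r + 0.587 * g + 0.114 * b;
}
```

Rendering with this shader (optionally into a half-size framebuffer for the half-sampling mentioned earlier in the thread) means the subsequent glReadPixels moves greyscale values rather than full colour data, and the conversion cost stays on the GPU.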
Aug 24
simon@ Chrome Canary is directly available in the Play Store :-) https://play.google.com/store/apps/details?id=com.chrome.canary&hl=en_US

Re: how capture works, you're probably right that, in an ideal world, we should capture video into a SurfaceTexture (generally into an abstract handle representing a platform-dependent thing, e.g. IOSurface or Dma-Buf) and just move that one around. For the main use cases of playback (seeing the cam feed in a <video> or rendering it in a WebGL canvas) or encoding, this should work just fine, right? Since we use Android APIs and those are compatible with SurfaceTextures.

The issue is manifold here: historically, the only capture scenario was WebRTC, and that one used only software encoding, hence the captured pixels were read back and, for good measure, converted to a single transport pixel format, I420. Some time later, we started using platform encoders, which would be happy to take SurfaceTextures, but WebRTC likes to encode the same feed _several_ times at different resolutions >_< -- and, assuming only one platform encoder can be used at once, the others would still be software encoders, for which, you guessed it, the pixels still need to be on the CPU. Even if we had only one encoding at one resolution, WebRTC still likes to have the option of switching to software encoding to tweak parameters, and as a fallback in case the platform encoder bursts into flames, so we still need the pixels on the CPU side. Argh!

I'm not saying this is ideal, far from it; as a matter of fact, in other areas we use GpuMemoryBuffers, which are precisely wrappers around those platform abstractions to some extent. But changing the capture code would mean filing bugs and writing code.