GL_COMMANDS_COMPLETED queries are slow on Windows |
||
Issue descriptionOne copy raster uses GL_COMMANDS_COMPLETED queries to know when it can reuse staging buffers. See attached ETW trace CPU profile screenshots for pinch-zoom on cnn.com on SurfaceBook with Intel HD 620. HandleEndQueryEXT takes up most of the CPU time for GPU main thread, 715ms out of 1143ms for DoCommands and 5s wall time, which is approx 3x the time for the next item, HandleCopySubTextureCHROMIUM. That time is split up ~50-50 between the glFlush (295ms), and creating the sync object (366ms). piman@ suggested experimenting with GL_COMMANDS_ISSUED which should be sufficient for shm staging buffers. See attached trace screenshot for that. Total CPU time of DoCommands is 715ms for 9s wall time, and HandleEndQueryEXT doesn't take up any time. If this is worthwhile, we can switch on staging buffer type, and decide whether to use GL_COMMANDS_COMPLETED or GL_COMMANDS_ISSUED. wdyt reveman@? geofflang@ let me know if you need more detailed stacks for ANGLE.
,
Apr 9 2018
Having a stack trace into ANGLE would be great but I suspect it will mostly be the cost of creating the D3D query objects and then polling them. The command decoders both poll pending queries when receiving certain commands so maybe we could be smarter about scheduling the polling.
,
Apr 9 2018
I think we should be switching to GpuFence framework for improved performance for synchronizing access to GMBs [1],[2]. That is what that framework has been built for. [1] https://cs.chromium.org/chromium/src/ui/gfx/gpu_fence.h [2] https://chromium.googlesource.com/chromium/src/+/master/docs/design/gpu_synchronization.md In an earlier prototype of this framework, I implemented a simple cross platform GpuFence based on shmem. This not only solved the COMMANDS_COMPLETED performance problem but also improved over COMMANDS_ISSUES significantly, as no context switches were needed for the client to detect that a GMB can be reused.
,
Aug 25
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/13a440c5640000
,
Aug 25
📍 Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/13a440c5640000
,
Aug 27
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/12982351640000
,
Aug 28
📍 Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/12982351640000
,
Aug 30
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/370c8bc2082a3665ccc45fef39955fd550170b61 commit 370c8bc2082a3665ccc45fef39955fd550170b61 Author: Sunny Sachanandani <sunnyps@chromium.org> Date: Thu Aug 30 19:18:53 2018 Use GL_COMMANDS_ISSUED_CHROMIUM query for shared memory GpuMemoryBuffers GL_COMMANDS_ISSUED_CHROMIUM is sufficient for shared memory GpuMemoryBuffers because they're uploaded using glTexImage2D (see GLImageMemory::CopyTexImage). This should reduce the number of idle wakeups in the GPU process, and reduce the CPU time spent on such queries which is quite high on Windows. Bug: 830084, 622491 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: I8111b3d7f12abbf900383d45e2a74394848e83fd Reviewed-on: https://chromium-review.googlesource.com/1189126 Commit-Queue: Sunny Sachanandani <sunnyps@chromium.org> Reviewed-by: Eric Karl <ericrk@chromium.org> Cr-Commit-Position: refs/heads/master@{#587695} [modify] https://crrev.com/370c8bc2082a3665ccc45fef39955fd550170b61/cc/raster/one_copy_raster_buffer_provider.cc
,
Sep 1
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6eb9eee331affe92c52a2926f57422bcbf40fa43 commit 6eb9eee331affe92c52a2926f57422bcbf40fa43 Author: Eric Karl <ericrk@chromium.org> Date: Sat Sep 01 06:52:21 2018 Revert "Use GL_COMMANDS_ISSUED_CHROMIUM query for shared memory GpuMemoryBuffers" This reverts commit 370c8bc2082a3665ccc45fef39955fd550170b61. Reason for revert: This is breaking desktop sites (SW raster) on Android when OOP-R is enabled. see crbug.com/879823 Original change's description: > Use GL_COMMANDS_ISSUED_CHROMIUM query for shared memory GpuMemoryBuffers > > GL_COMMANDS_ISSUED_CHROMIUM is sufficient for shared memory > GpuMemoryBuffers because they're uploaded using glTexImage2D (see > GLImageMemory::CopyTexImage). > > This should reduce the number of idle wakeups in the GPU process, and > reduce the CPU time spent on such queries which is quite high on > Windows. > > Bug: 830084, 622491 > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel > Change-Id: I8111b3d7f12abbf900383d45e2a74394848e83fd > Reviewed-on: https://chromium-review.googlesource.com/1189126 > Commit-Queue: Sunny Sachanandani <sunnyps@chromium.org> > Reviewed-by: Eric Karl <ericrk@chromium.org> > Cr-Commit-Position: refs/heads/master@{#587695} TBR=sunnyps@chromium.org,ericrk@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 830084, 622491, 879823 Change-Id: I7c78d2cc92e9ad274e8f443969d2a72d2410d5db Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Reviewed-on: https://chromium-review.googlesource.com/1199629 Reviewed-by: Eric Karl <ericrk@chromium.org> Commit-Queue: Eric Karl <ericrk@chromium.org> Cr-Commit-Position: refs/heads/master@{#588274} [modify] https://crrev.com/6eb9eee331affe92c52a2926f57422bcbf40fa43/cc/raster/one_copy_raster_buffer_provider.cc
,
Sep 5
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5646424ddff0c473c68f8dac3dacc7036bc119bc commit 5646424ddff0c473c68f8dac3dacc7036bc119bc Author: Eric Karl <ericrk@chromium.org> Date: Tue Sep 04 22:22:22 2018 [merge to M70] Revert "Use GL_COMMANDS_ISSUED_CHROMIUM query for shared memory GpuMemoryBuffers" This reverts commit 370c8bc2082a3665ccc45fef39955fd550170b61. Reason for revert: This is breaking desktop sites (SW raster) on Android when OOP-R is enabled. see crbug.com/879823 Original change's description: > Use GL_COMMANDS_ISSUED_CHROMIUM query for shared memory GpuMemoryBuffers > > GL_COMMANDS_ISSUED_CHROMIUM is sufficient for shared memory > GpuMemoryBuffers because they're uploaded using glTexImage2D (see > GLImageMemory::CopyTexImage). > > This should reduce the number of idle wakeups in the GPU process, and > reduce the CPU time spent on such queries which is quite high on > Windows. > > Bug: 830084, 622491 > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel > Change-Id: I8111b3d7f12abbf900383d45e2a74394848e83fd > Reviewed-on: https://chromium-review.googlesource.com/1189126 > Commit-Queue: Sunny Sachanandani <sunnyps@chromium.org> > Reviewed-by: Eric Karl <ericrk@chromium.org> > Cr-Commit-Position: refs/heads/master@{#587695} TBR=sunnyps@chromium.org,ericrk@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 830084, 622491, 879823, 879937 Change-Id: I7c78d2cc92e9ad274e8f443969d2a72d2410d5db Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Reviewed-on: https://chromium-review.googlesource.com/1199629 Reviewed-by: Eric Karl <ericrk@chromium.org> Commit-Queue: Eric Karl <ericrk@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#588274}(cherry picked from commit 6eb9eee331affe92c52a2926f57422bcbf40fa43) Reviewed-on: https://chromium-review.googlesource.com/1204114 Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org> Cr-Commit-Position: refs/branch-heads/3538@{#36} Cr-Branched-From: 79f7c91a2b2a2932cd447fa6f865cb6662fa8fa6-refs/heads/master@{#587811} [modify] https://crrev.com/5646424ddff0c473c68f8dac3dacc7036bc119bc/cc/raster/one_copy_raster_buffer_provider.cc
,
Sep 13
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/80d0a681332ba4c48d10e11f17d833eea89758ce commit 80d0a681332ba4c48d10e11f17d833eea89758ce Author: Sunny Sachanandani <sunnyps@chromium.org> Date: Thu Sep 13 18:57:47 2018 Reland "Use GL_COMMANDS_ISSUED_CHROMIUM query for shared memory GpuMemoryBuffers" This is a reland of 370c8bc2082a3665ccc45fef39955fd550170b61 This is similar to the original change, except in two significant ways: 1. RasterImplementation now supports COMMANDS_ISSUED queries, and 2. The decision whether to use queries is handled in a single place by OneCopyRasterBufferProvider, and communicated to StagingBufferPool by setting |query_id| on the StagingBuffer. Original change's description: > Use GL_COMMANDS_ISSUED_CHROMIUM query for shared memory GpuMemoryBuffers > > GL_COMMANDS_ISSUED_CHROMIUM is sufficient for shared memory > GpuMemoryBuffers because they're uploaded using glTexImage2D (see > GLImageMemory::CopyTexImage). > > This should reduce the number of idle wakeups in the GPU process, and > reduce the CPU time spent on such queries which is quite high on > Windows. > > Bug: 830084, 622491 > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel > Change-Id: I8111b3d7f12abbf900383d45e2a74394848e83fd > Reviewed-on: https://chromium-review.googlesource.com/1189126 > Commit-Queue: Sunny Sachanandani <sunnyps@chromium.org> > Reviewed-by: Eric Karl <ericrk@chromium.org> > Cr-Commit-Position: refs/heads/master@{#587695} Bug: 830084, 622491 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: Ia8bec687f42fef8242689fb014c1d18d2194b1e3 Reviewed-on: https://chromium-review.googlesource.com/1205990 Commit-Queue: Sunny Sachanandani <sunnyps@chromium.org> Reviewed-by: Jonathan Backer <backer@chromium.org> Cr-Commit-Position: refs/heads/master@{#591097} [modify] https://crrev.com/80d0a681332ba4c48d10e11f17d833eea89758ce/cc/DEPS [modify] https://crrev.com/80d0a681332ba4c48d10e11f17d833eea89758ce/cc/raster/one_copy_raster_buffer_provider.cc [modify] https://crrev.com/80d0a681332ba4c48d10e11f17d833eea89758ce/cc/raster/staging_buffer_pool.cc [modify] https://crrev.com/80d0a681332ba4c48d10e11f17d833eea89758ce/cc/raster/staging_buffer_pool.h [modify] https://crrev.com/80d0a681332ba4c48d10e11f17d833eea89758ce/gpu/command_buffer/client/raster_implementation.cc |
||
►
Sign in to add a comment |
||
Comment 1 by sunn...@chromium.org
, Apr 6 2018682 KB
682 KB View Download
541 KB
541 KB View Download