OOP Raster needs better back pressure for raster work. |
||||||||||
Issue descriptionOpen http://jsfiddle.net/xLuvC/1/show/ with Gpu raster and OOP. Notice that the frame rate is much worse in OOP that Gpu. This is because with a lot of the cpu bound work moving from raster to Gpu_main thread in OOP, the renderer side is too fast in submitting raster work and ends up scheduling a lot more prepaint tiles. This extra work eventually blocks display frames later in the pipeline, which keeps the renderer from submitting more frames. Also more cpu_time_per_frame from the extra raster work. Even if the raster work moves to a worker thread in the Gpu process, this back pressure is still necessary.
,
Feb 26 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/10ccfb20440000
,
Feb 27 2018
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/15cf2440440000
,
Feb 27 2018
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/10ccfb20440000
,
Mar 23 2018
Writing down some hallway discussions for posterity:
It seems like the cmd::Token system for knowing when something has been processed out of a single command buffer is a good mechanism for knowing when the gpu service has finished processing a task.
One proposal is that for any task that is speculative (not required for draw or activation) to add additional dependent tasks. These dependent tasks would process some queue of tokens like:
if max number tasks reached:
wait on first inserted token
pop that token off the queue
insert new token in the queue
For example if max=2, then task #3 would wait on the token from task #1 before starting.
These should ideally be separate tasks. There's no way to cancel tasks in progress, and so if it is part of the raster task itself then we'd still have to execute the raster task once the waiting happened.
,
Mar 23 2018
I think this use case may fit with the way we wait for tiles to complete on the gpu service for scheduling. Each tile has a sync token, and ContextSupport::IsSyncTokenSignaled() can be used to poll, ContextSupport::SignalSyncToken() to install a callback on completion. See GpuRasterBufferProvider::IsResourceReadyToDraw() / GpuRasterBufferProvider::SetReadyToDrawCallback().
,
Mar 23 2018
Yeah, Khushal made the exact same point too, that sync token signalling is a much better async mechanism here to add the back pressure. I think that makes a lot of sense, thanks.
,
Mar 28 2018
,
Mar 30 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/10257627440000
,
Mar 30 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/1347b233440000
,
Mar 30 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/1494cfb7440000
,
Mar 30 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/12b9312f440000
,
Mar 30 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14adb88f440000
,
Mar 30 2018
Hmm, local investigation on an n4 shows that prepaint work causes extra thread times in the case listed above, but that this prepaint work does not appear to affect the throughput. It does seem that oop raster just *does* more work in this case, no matter whether prepaint is allowed or not, so maybe there's something else to look into the micro here of whether serialization/deserialization overhead is heavy in this case. Results for posterity: https://docs.google.com/spreadsheets/d/1x--pLNJoU6wMTEPMm0n-pJ1k9nms8_L4h1-dpLt28Zs/edit#gid=256482337 I sent out some pinpoint jobs for (oop|gpu) on (smooth|key silk) thread times with max outstanding tasks = 1 to see what the results are. This is with this patch: https://chromium-review.googlesource.com/c/chromium/src/+/988632
,
Mar 31 2018
📍 Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/10257627440000
,
Mar 31 2018
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/1347b233440000
,
Mar 31 2018
📍 Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/1494cfb7440000
,
Mar 31 2018
📍 Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/12b9312f440000
,
Mar 31 2018
📍 Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/14adb88f440000
,
Apr 3 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/12932a50c40000
,
Apr 3 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/16b25c6f440000
,
Apr 4 2018
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/16b25c6f440000 [wip] allow max outstanding prepaint tasks=1 by enne@chromium.org https://chromium-review.googlesource.com/c/chromium/src/+/988632/4 Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Apr 4 2018
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/12932a50c40000 [wip] allow max outstanding prepaint tasks=1 by enne@chromium.org https://chromium-review.googlesource.com/c/chromium/src/+/988632/4 Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Apr 4 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14bad370c40000
,
Apr 4 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14fc0b70c40000
,
Apr 4 2018
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/14fc0b70c40000 [wip] allow max outstanding prepaint tasks=1 by enne@chromium.org https://chromium-review.googlesource.com/c/chromium/src/+/988632/5 Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Apr 4 2018
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/14bad370c40000 [wip] allow max outstanding prepaint tasks=1 by enne@chromium.org https://chromium-review.googlesource.com/c/chromium/src/+/988632/5 Understanding performance regressions: http://g.co/ChromePerformanceRegressions
,
Apr 5 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/16d6f970c40000
,
Apr 5 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/11c66528c40000
,
Apr 5 2018
😿 Pinpoint job stopped with an error. https://pinpoint-dot-chromeperf.appspot.com/job/16d6f970c40000
,
Apr 5 2018
,
Apr 5 2018
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/11c66528c40000
,
Apr 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b27727abc409de34bbd452c03fe0916551b813de commit b27727abc409de34bbd452c03fe0916551b813de Author: Adrienne Walker <enne@chromium.org> Date: Thu Apr 12 21:10:02 2018 cc: Add backpressure for speculative raster work This adds a limit per raster type of how many outstanding speculative (i.e. non-required) tasks can be in flight at any given time. Not only will the task graph not schedule more than this limit, but it will also wait for these tasks to have been completed on the gpu service side as well. Bug: 816571 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: Iff2cbc5ecaabafa62db6a06c0e8540156bfbe22d Reviewed-on: https://chromium-review.googlesource.com/998682 Commit-Queue: enne <enne@chromium.org> Reviewed-by: Khushal <khushalsagar@chromium.org> Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org> Cr-Commit-Position: refs/heads/master@{#550360} [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/test/fake_tile_manager.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_draw_info.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_draw_info.h [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_manager.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_manager.h [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_manager_unittest.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/trees/layer_tree_host_impl.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/trees/layer_tree_host_unittest.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/trees/layer_tree_settings.h
,
Apr 14 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6 commit dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6 Author: enne <enne@chromium.org> Date: Sat Apr 14 02:03:48 2018 Revert "cc: Add backpressure for speculative raster work" This reverts commit b27727abc409de34bbd452c03fe0916551b813de. Reason for revert: maybe causing hangs, definitely causing cut off tiles when rotating. Bug: 832684, 832633 Original change's description: > cc: Add backpressure for speculative raster work > > This adds a limit per raster type of how many outstanding speculative > (i.e. non-required) tasks can be in flight at any given time. Not only > will the task graph not schedule more than this limit, but it will also > wait for these tasks to have been completed on the gpu service side as > well. > > Bug: 816571 > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel > Change-Id: Iff2cbc5ecaabafa62db6a06c0e8540156bfbe22d > Reviewed-on: https://chromium-review.googlesource.com/998682 > Commit-Queue: enne <enne@chromium.org> > Reviewed-by: Khushal <khushalsagar@chromium.org> > Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org> > Cr-Commit-Position: refs/heads/master@{#550360} TBR=enne@chromium.org,sunnyps@chromium.org,khushalsagar@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 816571 Change-Id: I49b618ea978bd21ac97708a68aeaebee25e7f0b0 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Reviewed-on: https://chromium-review.googlesource.com/1012403 Reviewed-by: enne <enne@chromium.org> Commit-Queue: enne <enne@chromium.org> Cr-Commit-Position: refs/heads/master@{#550857} [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/test/fake_tile_manager.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_draw_info.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_draw_info.h [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_manager.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_manager.h [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_manager_unittest.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/trees/layer_tree_host_impl.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/trees/layer_tree_host_unittest.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/trees/layer_tree_settings.h
,
Apr 16 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8cce509eab48e7427c400c9527fcc409ccaab341 commit 8cce509eab48e7427c400c9527fcc409ccaab341 Author: Adrienne Walker <enne@chromium.org> Date: Mon Apr 16 17:28:58 2018 Revert "cc: Add backpressure for speculative raster work" This reverts commit b27727abc409de34bbd452c03fe0916551b813de. Reason for revert: maybe causing hangs, definitely causing cut off tiles when rotating. Bug: 832684, 832633 Original change's description: > cc: Add backpressure for speculative raster work > > This adds a limit per raster type of how many outstanding speculative > (i.e. non-required) tasks can be in flight at any given time. Not only > will the task graph not schedule more than this limit, but it will also > wait for these tasks to have been completed on the gpu service side as > well. > > Bug: 816571 > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel > Change-Id: Iff2cbc5ecaabafa62db6a06c0e8540156bfbe22d > Reviewed-on: https://chromium-review.googlesource.com/998682 > Commit-Queue: enne <enne@chromium.org> > Reviewed-by: Khushal <khushalsagar@chromium.org> > Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org> > Cr-Commit-Position: refs/heads/master@{#550360} TBR=enne@chromium.org, khushalsagar@chromium.org, sunnyps@chromium.org (cherry picked from commit dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6) Bug: 816571 Change-Id: I49b618ea978bd21ac97708a68aeaebee25e7f0b0 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Reviewed-on: https://chromium-review.googlesource.com/1012403 Reviewed-by: enne <enne@chromium.org> Commit-Queue: enne <enne@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#550857} Reviewed-on: https://chromium-review.googlesource.com/1012309 Cr-Commit-Position: refs/branch-heads/3396@{#20} Cr-Branched-From: 9ef2aa869bc7bc0c089e255d698cca6e47d6b038-refs/heads/master@{#550428} [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/test/fake_tile_manager.cc [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/tiles/tile_draw_info.cc [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/tiles/tile_draw_info.h [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/tiles/tile_manager.cc [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/tiles/tile_manager.h [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/tiles/tile_manager_unittest.cc [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/trees/layer_tree_host_impl.cc [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/trees/layer_tree_host_unittest.cc [modify] https://crrev.com/8cce509eab48e7427c400c9527fcc409ccaab341/cc/trees/layer_tree_settings.h
,
Apr 17 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b27727abc409de34bbd452c03fe0916551b813de commit b27727abc409de34bbd452c03fe0916551b813de Author: Adrienne Walker <enne@chromium.org> Date: Thu Apr 12 21:10:02 2018 cc: Add backpressure for speculative raster work This adds a limit per raster type of how many outstanding speculative (i.e. non-required) tasks can be in flight at any given time. Not only will the task graph not schedule more than this limit, but it will also wait for these tasks to have been completed on the gpu service side as well. Bug: 816571 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Change-Id: Iff2cbc5ecaabafa62db6a06c0e8540156bfbe22d Reviewed-on: https://chromium-review.googlesource.com/998682 Commit-Queue: enne <enne@chromium.org> Reviewed-by: Khushal <khushalsagar@chromium.org> Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org> Cr-Commit-Position: refs/heads/master@{#550360} [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/test/fake_tile_manager.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_draw_info.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_draw_info.h [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_manager.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_manager.h [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/tiles/tile_manager_unittest.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/trees/layer_tree_host_impl.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/trees/layer_tree_host_unittest.cc [modify] https://crrev.com/b27727abc409de34bbd452c03fe0916551b813de/cc/trees/layer_tree_settings.h
,
Apr 17 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6 commit dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6 Author: enne <enne@chromium.org> Date: Sat Apr 14 02:03:48 2018 Revert "cc: Add backpressure for speculative raster work" This reverts commit b27727abc409de34bbd452c03fe0916551b813de. Reason for revert: maybe causing hangs, definitely causing cut off tiles when rotating. Bug: 832684, 832633 Original change's description: > cc: Add backpressure for speculative raster work > > This adds a limit per raster type of how many outstanding speculative > (i.e. non-required) tasks can be in flight at any given time. Not only > will the task graph not schedule more than this limit, but it will also > wait for these tasks to have been completed on the gpu service side as > well. > > Bug: 816571 > Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel > Change-Id: Iff2cbc5ecaabafa62db6a06c0e8540156bfbe22d > Reviewed-on: https://chromium-review.googlesource.com/998682 > Commit-Queue: enne <enne@chromium.org> > Reviewed-by: Khushal <khushalsagar@chromium.org> > Reviewed-by: Sunny Sachanandani <sunnyps@chromium.org> > Cr-Commit-Position: refs/heads/master@{#550360} TBR=enne@chromium.org,sunnyps@chromium.org,khushalsagar@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 816571 Change-Id: I49b618ea978bd21ac97708a68aeaebee25e7f0b0 Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;master.tryserver.blink:linux_trusty_blink_rel Reviewed-on: https://chromium-review.googlesource.com/1012403 Reviewed-by: enne <enne@chromium.org> Commit-Queue: enne <enne@chromium.org> Cr-Commit-Position: refs/heads/master@{#550857} [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/test/fake_tile_manager.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_draw_info.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_draw_info.h [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_manager.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_manager.h [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/tiles/tile_manager_unittest.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/trees/layer_tree_host_impl.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/trees/layer_tree_host_unittest.cc [modify] https://crrev.com/dcddf51ec4ce5a464f901c1cf8fc8cc86c9f59b6/cc/trees/layer_tree_settings.h
,
Apr 23 2018
After a lot of talking to folks, I'm inclined a little to punt this work until it turns out we really need it. Our options look a little bit like this: (1) Backpressure using sync token callbacks (original patch) IPCs for sync token callbacks appear to add ~0.5ms of io thread work which, even if it's not on the fast path, still seems like quite a bit of work for two IPCs. It caused correctness issues (which have been addressed) but also caused renderer hangs (still unknown, unable to repro). So, has a bit of risk and performance issues. (2) Backpressure using client-side polling Sunny suggested polling prepaint tasks on a once per frame basis to see if they were ready. I think if we are only polling once per frame, then that's equivalent to just submitting max N tasks per frame. vmpstr was worried about cases where we had multiple stacked layers that all needed prepaint and not being able to get to them all. There could definitely be some unknown cases where N is not enough and we'd have to rely on checkerboarding metrics to figure this out. (3) Make tasks cancellable It seems a bit backwards to me to submit work that we then ignore, but if we could somehow cancel speculative work from the gpu side. I think this would be a better fit for a future architecture where we sent entire PaintOpBuffers and the raster command was just a "draw this POB with this rect and scale" and not a pile of serialization. (4) Longer-term, reduce the prepaint window We ***still*** haven't fixed the issue where input can be delivered in the middle of a frame, but once we do that, we could reduce the prepaint window, which would in turn reduce the amount of work that could be submitted. This is also maybe a longer-term option. I'm a little bit inclined to punt on backpressure work for OOP-R until we come across something that really needs it. Running the key silk cases, there were 2/25 cases that saved like ~15% raster time (and everything else was a wash). This seems like a minor win on a small number of corner cases.
,
May 1 2018
vmiura, sunnyps: Some of this backpressure discussion came from previous discussions (e.g. during the cdl prototype). Is there a known bad use case that needs backpressure? I'm trying to understand if this is something we can punt on (and assume our current limitations on work per frame via prepaint windows, etc handle this) or if there's a known use case that we really need to solve prior to shipping oop-r.
,
May 1 2018
Re 39: The original observations were made on the cdl prototype, but much has changed with Ganesh implementation and GPU scheduling landing since. I definitely agree we should evaluate if we really need to solve this based on fresh data points. @ericrk: Back-pressure originally came up as a possible area of improvement for Ganesh even without OOP-R, which we thought OOP-R would exacerbate. Do you recall any specific cases?
,
Jan 9
Nobody's come up with any suggestions for where this might be a problem and I don't think we have any good ideas for how to implement this. I'm going to WontFix this for now. |
||||||||||
►
Sign in to add a comment |
||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Feb 26 2018