New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 613695 link

Starred by 3 users

Issue metadata

Status: Started
Owner:
Last visit 24 days ago
Cc:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug

Blocked on:
issue 629052



Sign in to add a comment

Tile iteration order is bad for ganesh tiles.

Project Member Reported by vmp...@chromium.org, May 20 2016

Issue description

The way the spiral iterator is implemented, we iterate tiles in a strict spiral ignoring the fact that some tiles have aspect ratios != 1. 

For ganesh tiles (that are wide but not that tall), this causes us to try and process tiles to the left/right of the viewport before tiles below and above the viewport even if above/below tiles are closer to the viewport. 

Note that this problem is somewhat alleviated by the fact that we still respect the rect that we're iterating. That is, if we're iterating the soon rect and the left/right tiles are outside of it, we will skip them.

In other words, we still iterate only the tiles within a given rect, but the ordering of those tiles in there is not ideal.
 
Cc: prashan...@samsung.com
+prashant, in case you're interested in looking at this. 
Owner: prashan...@samsung.com
If this is not too urgent I'll take a look at this as I'll need some time understand the current solution.
Suppose the example given below. The tiling data as (gfx::Size(10, 10), gfx::Size(30, 30), false) and center at rect (15, 15, 1, 1). Suppose tiles at locations 1, 2, 3, 7 have texture size 8x8 and tiles at locations 4,5,6 have texture size 10*8 (wide tiles).

Then is the sequence expected to be 1, 2, 3, 7, 8, 4, 5, 6?

   x 0 1 2
  y.------
  0| 4 3 2
  1| 5 * 1
  2| 6 7 8
Read the above example as (gfx::Size(10, 8), gfx::Size(30, 30), false), so that all tiles have max texture size as 10*8.

Comment 5 by vmp...@chromium.org, Jun 13 2016

If the tiles are "square", ie the ratio of width/height is 1, then the iteration order in #3 is correct. However, when we have ganesh tiles they are sized as full viewport width, and 1/4 of viewport height so we typically end up with wide tiles, something like 1000x250. In that case, you get the following:

 x  0  1  2  3  4
y.--------------
 0|16 15 14 13 12
 1|17  4  3  2 11
 2|18  5  *  1 10
 3|19  6  7  8  9
 4|20 21 22 23 24

However, for example tiles 17, 18, 19 are much further away than 21, 22, 23, which means we're iterating tiles that are much further away from the center than we want.. the ideal iteration would be something like:

 x  0  1  2  3  4
y.--------------
 0|20 11 10  9 19
 1|21  4  3  2 18
 2|22  5  *  1 17
 3|23  6  7  8 16
 4|24 12 13 14 15

That is, considering that the tiles are smaller in the y direction, we'd consider iterating those tiles before proceeding further out in the x direction. How much further? My initial thought is to use height/width or width/height as that proxy (whichever is bigger). So in the case listed above, you'd get a ratio of 4, so you can go 4 rows in y direction before extending the x direction.

This is probably a fairly complicated patch (but small), but the idea is to retain a roughtly distance increasing iteration as much as possible. We'll need a lot of unittests for this as well.
Thank you vmpstr@ for clear explaination. I'll work on it and add a patch with unittests.
Status: Started (was: Assigned)
Blockedon: 629052
https://codereview.chromium.org/2067213002/ is added for the review.
Project Member

Comment 10 by bugdroid1@chromium.org, Sep 24 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/051927915d98be61b03119b7bce2301d33829bcf

commit 051927915d98be61b03119b7bce2301d33829bcf
Author: prashant.n <prashant.n@samsung.com>
Date: Sat Sep 24 03:38:20 2016

cc: Detach spiral iterator implementation to separate file.

Currently SpiralDifferenceIterator implements logic based on directions
around the center rect. This logic does not take care of aspect ratios
of tile size. To consider the aspect ratios during iteration new logic
has to be implanted. Until the new logic gets tested thoroughly we
need the current logic. This patch detaches the current spiral iterator
to separate classes (SpiralIterator and ReverseSpiralIterator) and
moves them to their own files making SpiralDifferenceIterator thinner.
Now new logic can be added to SpiralDifferenceIterator based on some
runtime switch.

This patch also separates out unit tests related to SpiralIterator from
tiling_data_unittest.cc to its own file.

BUG=613695
TESTS=TilingDataSpiralIteratorTest.*
CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel

Review-Url: https://codereview.chromium.org/2352393002
Cr-Commit-Position: refs/heads/master@{#420823}

[modify] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/BUILD.gn
[modify] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/BUILD.gn
[add] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/reverse_spiral_iterator.cc
[add] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/reverse_spiral_iterator.h
[add] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/spiral_iterator.cc
[add] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/spiral_iterator.h
[add] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/spiral_iterator_unittest.cc
[modify] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/tiling_data.cc
[modify] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/tiling_data.h
[modify] https://crrev.com/051927915d98be61b03119b7bce2301d33829bcf/cc/base/tiling_data_unittest.cc

For considering aspect ratios, https://codereview.chromium.org/2498383002/ is added which implements tiling iterator based on pyramid sequence. The iterator cost results are as given below (measured on Ubuntu 14.04, i7 4 cores machine).

With pyramid sequence

*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 2_16= 137947.578125 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 2_32= 76511.7109375 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 2_64= 47162.99609375 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 2_128= 27428.8203125 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 10_16= 67154.8359375 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 10_32= 56307.26953125 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 10_64= 32311.705078125 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 10_128= 17798.923828125 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 50_16= 16145.591796875 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 50_32= 15190.4052734375 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 50_64= 11827.1142578125 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 50_128= 9451.248046875 runs/s

*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 2_16= 102962.8359375 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 2_32= 62303.69140625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 2_64= 42613.2109375 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 2_128= 23738.669921875 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 10_16= 44010.2890625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 10_32= 36766.83984375 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 10_64= 25923.08203125 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 10_128= 16826.423828125 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 50_16= 11491.5009765625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 50_32= 10769.197265625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 50_64= 9052.876953125 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 50_128= 7199.39892578125 runs/s


With spiral iterator (without pyramid sequence)

*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 2_16= 169157.375 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 2_32= 97946.765625 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 2_64= 57101.88671875 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 2_128= 32033.173828125 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 10_16= 69190.3671875 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 10_32= 58077.4453125 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 10_64= 38567.109375 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 10_128= 22744.51171875 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 50_16= 16227.841796875 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 50_32= 15297.0556640625 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 50_64= 12956.171875 runs/s
*RESULT tile_manager_raster_tile_queue_construct_and_iterate: 50_128= 10680.6796875 runs/s

*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 2_16= 137091.234375 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 2_32= 75950.75 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 2_64= 50263.96875 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 2_128= 27047.78125 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 10_16= 78565.9140625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 10_32= 58623.97265625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 10_64= 37526.19140625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 10_128= 22035.119140625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 50_16= 24779.541015625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 50_32= 21959.9453125 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 50_64= 17531.625 runs/s
*RESULT tile_manager_eviction_tile_queue_construct_and_iterate: 50_128= 12932.736328125 runs/s

The new algorithm is little-bit slower than previous spiral iterator algorithm (for less number of tiles), but it has a lot of flexibility like considering tile aspect ratio, precedence of coverage directions, manhattan distance like iteration order, etc.
It looks like the iteration is 10-20% slower. I don't think that's acceptable for the relatively minor feature set. For context, there is some talk about unifying ganesh tiles to be square as well, which would mean this series of patches would only regress performance.

Is there a way that we can use the current iterator code and only slightly adjust it to consider aspect ratio? I understand that the general solution has added benefits, but I don't think the performance hit is worth it.
The perf tests mentioned in comment #11 checks the algorithmic time for pyramid sequence over spiral iterator.

I've improved this by 5% by sharing the iterator object throughout iteration. Now it is ~ 5%-15% slower.

Even tough pyramid sequence is slower, the iteration order returned by it is better than spiral iterator. I triggered few smoothness tests and on those pyramid sequence looks to be little better. Give below are the results for smoothness tests.

1. smoothness.key_silk_cases

https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-02-21_04-03-35

2. smoothness.tough_scrolling_cases

https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-02-21_04-06-38

3. smoothness.tough_scrolling_cases

https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-02-21_04-03-31
In above comment, the smoothness tests are triggered on https://codereview.chromium.org/2067213002 , patch set 77 (rebase -> used for smoothness tests).
The results for smoothness.key_silk_cases are as given below.

                                        TOT             Patch
avg_surface_fps                         52.607          53.444
first_gesture_scroll_update_latency     10.678 ms       10.647 ms
frame_lengths                           1.082           1.09
frame_time_discrepancy                  141.217 ms      151.576 ms
frame_times                             18.041 ms       18.159 ms
input_event_latency_discrepancy         111.290 ms      103.630 ms
jank_count                              4               4
main_thread_scroll_latency_discrepancy  307.816 ms      282.097 ms
max_frame_delay                         6.571           7.111
mean_frame_time                         20.720 ms       20.173 ms
mean_input_event_latency                29.581 ms       24.644 ms
mean_main_thread_scroll_latency         18.000 ms       17.626 ms
mean_pixels_approximated                16.782%         17.633%
mean_pixels_checkerboarded              0.350%          0.196%
percentage_smooth                       92.046          91.835
queueing_durations                      0.213 ms        0.204 ms
I triggered few more tests and results are as given below. (All the above and below tests are done android-nexus6 device.) For many of the histograms, patch is better than TOT. Even if there is not much significant improvement, at least patch is not worsening the perf. With added advantages of the pyramid sequence it would outweigh the spiral iterator, IMO.

P.S. * means patch is worse than TOT for the given histogram.

1. smoothness.sync_scroll.key_mobile_sites_smooth

https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-02-23_00-52-44

                                        TOT         Patch
avg_surface_fps                         56.641      56.692
first_gesture_scroll_update_latency     45.342 ms   42.528 ms
frame_lengths                           1.044       1.043
frame_time_discrepancy                  88.571 ms   87.743 ms
frame_times                             17.402 ms   17.384 ms
input_event_latency_discrepancy         434.652 ms  452.524 ms  *
jank_count                              3           2
max_frame_delay                         3.487       3.538       *
mean_frame_time                         17.821 ms   17.851 ms   *
mean_input_event_latency                13.952 ms   13.883 ms
mean_pixels_approximated                0.000%      0.000%
mean_pixels_checkerboarded              0.264%      0.256%
percentage_smooth                       94.953      95.264
queueing_durations                      0.603 ms    0.594 ms


2. smoothness.key_silk_cases

https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-02-22_06-19-45

                                            TOT         Patch
avg_surface_fps                             52.179      53.321
first_gesture_scroll_update_latency         10.371 ms   11.224 ms  *
frame_lengths                               1.087       1.084
frame_time_discrepancy                      142.465 ms  139.433 ms
frame_times                                 18.110 ms   18.073 ms
input_event_latency_discrepancy             172.892 ms  91.092 ms
jank_coun                                   4           4
main_thread_scroll_latency_discrepancy      627.640 ms  283.048 ms
max_frame_delay                             6.357       6.357
mean_frame_time                             20.725 ms   19.928 ms
mean_input_event_latency                    27.873 ms   29.870 ms  *
mean_main_thread_scroll_latency             18.889 ms   17.752 ms
mean_pixels_approximated                    17.154%     17.004%
mean_pixels_checkerboarded                  0.196%      0.161%
percentage_smooth                           90.835      91.367
queueing_durations                          0.208 ms    0.206 ms

3. smoothness.gpu_rasterization.top_25_smooth

https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-02-22_06-13-37

                                            TOT             Patch
avg_surface_fps                             57.174          57.696
first_gesture_scroll_update_latency         14.597 ms       17.657 ms    *
frame_lengths                               1.045           1.04
frame_time_discrepancy                      75.393 ms       67.510 ms
frame_times                                 17.417 ms       17.340 ms
input_event_latency_discrepancy             262.444 ms      227.050 ms
jank_count                                  2               1
main_thread_scroll_latency_discrepancy      2,467.497 ms    2,266.334 ms
max_frame_delay                             3               2.87
mean_frame_time                             17.471 ms       17.356 ms
mean_input_event_latency                    9.701 ms        9.732 ms     *
mean_main_thread_scroll_latency             79.559 ms       81.632 ms    *
mean_pixels_approximated                    0.000%          0.000%
mean_pixels_checkerboarded                  5.643%          5.287%
percentage_smooth                           96.446          96.615
queueing_durations                          2.505 ms        2.493 ms

Sign in to add a comment