Oop raster needs some metrics to demonstrate its performance improvements. I see a few different categories.
(1) Thread time measurements.
oop raster moves cpu work from the renderer process to the gpu process and has a different set of data to transport. I think the clearest way to measure that the system is now doing less work would be to measure both the renderer and gpu process cpu costs, with the assumption that the gpu costs will end up being the same.
https://chromium-review.googlesource.com/#/c/chromium/src/+/772823 is a hacky microbenchmark approach on the renderer side that renders through both paths and measures the total cpu time each one takes.
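As a rough illustration of the renderer-side half of that, here's a sketch of timing software raster of a recorded picture with per-thread cpu time; the SkPicture input, surface setup, and function name are stand-ins, and the CL above wires this up differently:

// Hedged sketch: measure cpu time spent rastering a recorded picture on the
// current thread. Only the timing pattern is shown; the picture source and
// the "render through both paths" plumbing from the CL are omitted.
#include "base/time/time.h"
#include "third_party/skia/include/core/SkCanvas.h"
#include "third_party/skia/include/core/SkPicture.h"
#include "third_party/skia/include/core/SkSurface.h"

base::TimeDelta TimeSoftwareRaster(const sk_sp<SkPicture>& picture,
                                   int width,
                                   int height) {
  sk_sp<SkSurface> surface = SkSurface::MakeRasterN32Premul(width, height);
  const base::ThreadTicks start = base::ThreadTicks::Now();
  surface->getCanvas()->drawPicture(picture);
  return base::ThreadTicks::Now() - start;
}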
I think gpu time is a little bit trickier (especially if we want to compare tasks directly instead of in aggregate), but one possibility might be to add a start/end timing command buffer command that returns the cpu time between two tasks.
This would maybe make it possible to compare total time on a per-task basis.
Another approach could just be to measure all the times independently and then compare numbers in aggregate.
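To make the command buffer idea concrete, here's a sketch of what the service side of a hypothetical begin/end timing command pair might do; the command names and class are invented for illustration, and nothing like this exists in the GLES2 interface today:

// Hypothetical service-side handling of a Begin/EndRasterTaskTiming command
// pair. Names are made up; real commands would need to be plumbed through the
// decoder and the result returned to the client.
#include "base/time/time.h"

class RasterTaskTimer {
 public:
  // Would run when the hypothetical BeginRasterTaskTiming command is decoded.
  void OnBeginTiming() { start_ = base::ThreadTicks::Now(); }

  // Would run when the hypothetical EndRasterTaskTiming command is decoded.
  // The returned gpu-main-thread cpu time between the two commands could be
  // reported back for per-task comparison against the renderer-side numbers.
  base::TimeDelta OnEndTiming() const {
    return base::ThreadTicks::Now() - start_;
  }

 private:
  base::ThreadTicks start_;
};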
(2) Integration test: throughput/parallelization
A big scheduling concern is that moving work from the renderer process (where raster workers do their work in parallel) to the gpu main thread could overload that thread and reduce parallelism and throughput in the system.
I think some thought needs to be put into how to measure this. The oop raster prototype used MotionMark to test throughput, but maybe there are other things that should be tested here.
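Whatever drives the workload, one coarse way to put a number on parallelism is to compare summed per-thread busy time against wall-clock time for the same interval; a sketch, with the busy-time collection (trace events, telemetry thread times, etc.) left out:

// Hedged sketch: a coarse parallelism metric. If this ratio drops after moving
// raster work to the gpu main thread, the system has become more serialized.
#include <vector>

#include "base/time/time.h"

double ComputeParallelism(const std::vector<base::TimeDelta>& thread_busy_times,
                          base::TimeDelta wall_time) {
  base::TimeDelta total_busy;
  for (const base::TimeDelta& busy : thread_busy_times)
    total_busy += busy;
  // ~1.0 means effectively serial; higher values mean more threads overlapping.
  return total_busy.InMillisecondsF() / wall_time.InMillisecondsF();
}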
(3) Serialization/Deserialization microbenchmarks
This is initially less of a priority than the above (though if it turns out that serialization is not cheap and oop raster ends up costing more cpu work than the current path, then it's a good follow-up to that). It'd be good to have some microbenchmark framework for evaluating modifications to serialization and its data structures with respect to performance.
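The shape of such a microbenchmark could be as simple as timing round trips over a recorded op buffer; in this sketch the OpBuffer type and the SerializeOps/DeserializeOps entry points are placeholders, since the real paint op serialization interface is exactly what the benchmark would be evaluating:

// Hedged sketch: time N serialize/deserialize round trips of a recorded op
// buffer. OpBuffer, SerializeOps(), and DeserializeOps() are placeholders for
// whatever the real paint op types and serialization entry points end up being.
#include <cstdint>
#include <vector>

#include "base/time/time.h"
#include "base/timer/elapsed_timer.h"

struct OpBuffer {};  // placeholder for a recorded paint op buffer
std::vector<uint8_t> SerializeOps(const OpBuffer& ops);      // placeholder
OpBuffer DeserializeOps(const std::vector<uint8_t>& bytes);  // placeholder

struct RoundTripTiming {
  base::TimeDelta serialize;
  base::TimeDelta deserialize;
};

RoundTripTiming TimeRoundTrips(const OpBuffer& ops, int iterations) {
  RoundTripTiming timing;
  for (int i = 0; i < iterations; ++i) {
    base::ElapsedTimer serialize_timer;
    std::vector<uint8_t> bytes = SerializeOps(ops);
    timing.serialize += serialize_timer.Elapsed();

    base::ElapsedTimer deserialize_timer;
    OpBuffer round_tripped = DeserializeOps(bytes);
    timing.deserialize += deserialize_timer.Elapsed();
  }
  return timing;
}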
See also: issue 785434.
Comment 1 by enne@chromium.org, Nov 17 2017