
Issue 786582


Issue metadata

Status: Fixed
Owner: ----
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocking:
issue 757605




Add metrics for oop raster

Project Member Reported by enne@chromium.org, Nov 17 2017

Issue description

Oop raster needs some metrics to demonstrate its performance improvements.  I see a few different categories.

(1) Thread time measurements.

Oop raster moves cpu work from the renderer process to the gpu process and changes the set of data that has to be transported between them.  I think the clearest way to show that the system is now doing less work overall would be to measure the cpu cost in both the renderer and the gpu process, with the assumption that the gpu costs will end up being the same.

https://chromium-review.googlesource.com/#/c/chromium/src/+/772823 is a hacky microbenchmark approach to the renderer side: it renders through both paths and measures the total cpu time each one takes.
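As a rough illustration of that approach (not the code in the linked change), a minimal sketch could time two code paths by thread cpu time rather than wall time. The two `*_path` functions below are hypothetical stand-ins for the two raster paths, and `cpu_time_of` is a helper name invented for this example.

```python
import time


def cpu_time_of(fn, repeats=5):
    """Return the smallest thread cpu time (in seconds) over `repeats` runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.thread_time()
        fn()
        best = min(best, time.thread_time() - start)
    return best


# Hypothetical stand-ins for the two raster paths; these are not Chromium code.
def gpu_raster_path():
    return sum(i * i for i in range(200_000))


def oop_raster_path():
    return sum(i * i for i in range(100_000))


def compare_paths():
    """Run both paths and report the cpu time each one used on this thread."""
    return {
        "gpu_raster": cpu_time_of(gpu_raster_path),
        "oop_raster": cpu_time_of(oop_raster_path),
    }


if __name__ == "__main__":
    for name, seconds in compare_paths().items():
        print(f"{name}: {seconds * 1000:.3f} ms cpu")
```

Taking the minimum over several runs is one common way to reduce noise; a real benchmark would also need to account for work done on other threads and in other processes, which this sketch ignores.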

I think gpu time is a little bit trickier (especially if we want to compare tasks directly instead of in aggregate), but one possibility might be to add a start/end timing command buffer command that returns the cpu time between two tasks.

This might make it possible to compare total time on a per-task basis.

Another approach could just be to measure all the times independently and then compare numbers in aggregate.
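The difference between the two comparison styles can be sketched as follows. This assumes each raster task can be identified by some shared key across both runs (the `tile_*` ids here are made up, as are all the timings); when no such key exists, only the aggregate comparison is available.

```python
from statistics import mean


def per_task_deltas(baseline, candidate):
    """Pair timings by task id and return (candidate - baseline) for each shared task."""
    shared = baseline.keys() & candidate.keys()
    return {task: candidate[task] - baseline[task] for task in sorted(shared)}


def aggregate_delta(baseline, candidate):
    """Compare mean timings without pairing; works even when task ids differ."""
    return mean(candidate.values()) - mean(baseline.values())


# Made-up total cpu times in ms (renderer + gpu process) per raster task.
gpu_raster = {"tile_a": 5.0, "tile_b": 4.0, "tile_c": 6.0}
oop_raster = {"tile_a": 3.5, "tile_b": 4.5, "tile_c": 4.0}

if __name__ == "__main__":
    print(per_task_deltas(gpu_raster, oop_raster))
    print(aggregate_delta(gpu_raster, oop_raster))
```

The per-task view shows that one task got slower even though the aggregate improved, which is the kind of detail the aggregate number hides.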


(2) Integration test: throughput/parallelization

A big scheduling concern is that moving work from the renderer process (where the raster worker is doing work in parallel) to the gpu main thread could overload the gpu main thread and cause a reduction in parallelism and throughput in the system.

I think some thought needs to be put into how to measure this.  The oop raster prototype used motionmark to test this throughput, but maybe there are other things that should be tested here.
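To make the scheduling concern concrete, here is a toy model (not a measurement, and not how Chromium schedules work): tasks are assigned greedily to whichever thread frees up first, and throughput is tasks divided by the finish time. The task costs and thread counts are invented for illustration.

```python
import heapq


def makespan(task_costs, num_threads):
    """Greedy schedule: each task goes to the thread that frees up first."""
    free_at = [0.0] * num_threads
    heapq.heapify(free_at)
    for cost in task_costs:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cost)
    return max(free_at)


def throughput(task_costs, num_threads):
    """Tasks completed per unit of time under the greedy schedule."""
    return len(task_costs) / makespan(task_costs, num_threads)


if __name__ == "__main__":
    tasks = [2.0] * 8  # eight tasks of 2 ms each (made-up numbers)
    print(makespan(tasks, 4))  # spread across four parallel workers
    print(makespan(tasks, 1))  # the same tasks serialized on one thread
```

In this model, moving the same work onto a single thread only keeps up if the per-task cost drops by roughly the factor of lost parallelism, which is the question an integration test would need to answer with real data.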


(3) Serialization/Deserialization microbenchmarks

This is initially lower priority than the above.  (If it turns out that serialization is not cheap and oop raster ends up costing more cpu work than the existing path, then this becomes a good followup.)  It'd be good to have some microbenchmark framework to evaluate how modifications to serialization and data structures affect performance.
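A minimal sketch of what such a microbenchmark harness might look like is below. The op layout (`OP_FORMAT`) is a hypothetical five-field record chosen for this example and has nothing to do with the real paint op format; the point is only the shape of the harness: check the round trip first, then time each direction separately.

```python
import struct
import timeit

# Hypothetical op layout: (op type, x, y, width, height). Not the real format.
OP_FORMAT = struct.Struct("<Iffff")


def serialize(ops):
    """Pack every op into one contiguous byte buffer."""
    return b"".join(OP_FORMAT.pack(*op) for op in ops)


def deserialize(buf):
    """Unpack fixed-size records from the buffer back into tuples."""
    return [
        OP_FORMAT.unpack_from(buf, offset)
        for offset in range(0, len(buf), OP_FORMAT.size)
    ]


def bench(fn, arg, number=200):
    """Average wall-clock seconds per call of fn(arg)."""
    return timeit.timeit(lambda: fn(arg), number=number) / number


ops = [(i % 4, float(i), float(i + 1), 16.0, 16.0) for i in range(1000)]

if __name__ == "__main__":
    buf = serialize(ops)
    assert deserialize(buf) == ops  # round trip must be lossless before timing
    print(f"serialize:   {bench(serialize, ops) * 1e6:.1f} us")
    print(f"deserialize: {bench(deserialize, buf) * 1e6:.1f} us")
```

Keeping the data set fixed and timing each direction on its own makes it easier to see which side a change to the format actually affects.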

See also: issue 785434.
 

Comment 1 by enne@chromium.org, Nov 17 2017

Components: Internals>Compositing>Rasterization

Comment 2 by enne@chromium.org, Apr 5 2018

Cc: khushals...@chromium.org
Status: Fixed (was: Available)
I feel like we're in a good place with metrics here.  These have all been added.

(1) ChromiumPerfFyi/android-n5x-perf-fyi/smoothness.oop_rasterization.top_25_smooth / frame_times
(2) ChromiumPerf/android-nexus5X/thread_times.key_mobile_sites_smooth
(3) PaintOpPerfTest

See: https://chromeperf.appspot.com/report?sid=ac3054f461aea181a7eb30e850a53f2f4cbec1f5378d0298cf6ce06c17321b90

Thread times also look good there.  Comparing average (raster + gpu cpu time), OOP-R is 3.43 ms and GPU-R is 4.76 ms.  Frame times are roughly the same, some slightly better, some slightly worse.
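For reference, the relative size of that thread time difference can be worked out from the two averages quoted above. The helper name is made up for this example; the inputs are the numbers from this comment.

```python
def relative_reduction(baseline_ms, candidate_ms):
    """Fraction by which candidate is lower than baseline."""
    return (baseline_ms - candidate_ms) / baseline_ms


gpu_r_ms = 4.76  # average raster + gpu cpu time with GPU-R
oop_r_ms = 3.43  # average raster + gpu cpu time with OOP-R

if __name__ == "__main__":
    print(f"{relative_reduction(gpu_r_ms, oop_r_ms):.1%}")  # prints "27.9%"
```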

Obviously these need to be rechecked once fonts have landed, but I think we're in a good place and no more implementation work is required on the metrics side.
