Run dEQP, piglit and glbench on crosvm
Issue description: Run dEQP and piglit test suites on crosvm.
Oct 19
Preliminary glbench results: https://docs.google.com/spreadsheets/d/1soaP06c8DJ01uLV8eaATjh4q_J8gcjPe1dl3KmKri7I/edit#gid=1487276513
Oct 19
Piglit instructions:
  sudo apt-get install piglit
  mkdir piglit-results
  cd piglit-results
  piglit run sanity results/sanity
  piglit run gpu results/gpu

dEQP instructions:
  git clone https://android.googlesource.com/platform/external/deqp
  cd deqp
  python external/fetch_sources.py
  curl -L https://android-review.googlesource.com/tools/hooks/commit-msg -o .git/hooks/commit-msg && chmod a+x .git/hooks/commit-msg
  sudo apt-get install cmake ant
  sudo apt install gcc-multilib g++-multilib
  mkdir build-glx
  cd build-glx
  cmake .. -DCMAKE_BUILD_TYPE=Debug -DDEQP_TARGET=x11_glx
  make -j
  cd modules/gles2
  ./deqp-gles2 --deqp-log-images=enable --deqp-watchdog=disable --deqp-surface-width=256 --deqp-surface-height=256 --deqp-log-filename=/tmp/gles2.qpa

glbench instructions:
  sudo apt-get install libgflags-dev
  git clone https://github.com/waffle-gl/waffle
  cd waffle
  mkdir build
  cd build
  cmake ..
  make -j
  sudo make install
  cd ../..
  git clone https://chromium.googlesource.com/chromiumos/third_party/autotest
  cd autotest/client/deps/glbench/src
  USE="X" graphics_backend="OPENGL" make
  cd ..
  ./glbench -notemp
Oct 19
Oct 19
Re c#2: A few potential differences, things to try:
- z840 vs eve
- qemu vs crosvm
- guest mesa ToT vs 13.0.6

I was also running crosvm with a non-release build. The results are marginally better with a release build. Will retry with ToT guest mesa next week.
Oct 19
Here are some results with Nocturne as well (ToT guest Mesa): https://gitlab.freedesktop.org/virgl/virglrenderer/issues/46
Oct 20
Oct 22
Switching from the default stretch mesa (13.0.6) to ToT doesn't seem to make much difference in glbench performance.

Some profiling results (out.svg) of the glbench texture_update tests with perf/FlameGraph:

guest:
  ./glbench -tests texture_update -notemp

host:
  perf record -F 120 -p 3300 -g -- sleep 60
  perf script > out.perf

desktop:
  git clone https://github.com/brendangregg/FlameGraph
  ./stackcollapse-perf.pl /tmp/out.perf > /tmp/out.folded
  ./flamegraph.pl /tmp/out.folded > /tmp/out.svg
Oct 23
Investigated texture upload a bit, specifically texture_upload_rgba_textimage2d_32: 10.23 mtexel_sec with eve/crosvm/tatl/stretch (debug, 13.0.6 mesa) vs 259.56 mtexel_sec on eve.
I modified glbench to reduce the run to just that specific test and ran it for 10k iterations in the guest, along with some instrumented logging at the completion of every decode+process_gpu_command within process_descriptor. I then ran a script (delta-time.py) to prepend each log line with the delta from the previous log line. (glbench-crosvm-commands.txt)
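The attached delta-time.py isn't reproduced here; a minimal sketch of that kind of helper, assuming each raw log line starts with the ISO-8601 timestamp visible in the excerpts below (the attachment's exact behavior is not confirmed), could look like:

#!/usr/bin/env python3
# Prepend each log line with the time elapsed since the previous line.
# Assumes the first whitespace-separated field is an ISO-8601 timestamp,
# e.g. 2018-10-22T17:16:44.978809-07:00 (the UTC offset is ignored here).
import sys
from datetime import datetime

prev = None
for line in sys.stdin:
    fields = line.split()
    if not fields:
        continue
    try:
        ts = datetime.strptime(fields[0][:26], "%Y-%m-%dT%H:%M:%S.%f")
    except ValueError:
        sys.stdout.write(line)  # pass through lines without a leading timestamp
        continue
    delta_ms = (ts - prev).total_seconds() * 1000.0 if prev is not None else 0.0
    prev = ts
    sys.stdout.write("%.3fms %s" % (delta_ms, line))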
During that sequence, virtio gpu reported 8595 descriptors being processed, with 8435 of them being TransferToHost3d. I extracted a run of 3588 consecutive TransferToHost3d commands with no other commands processed in between (consecutive-transfers.txt).
Here are some typical sequences from that:
1.261ms 2018-10-22T17:16:44.978809-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (70.385µs)
0.816ms 2018-10-22T17:16:44.979625-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (73.654µs)
0.808ms 2018-10-22T17:16:44.980433-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (73.674µs)
0.771ms 2018-10-22T17:16:44.981204-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (73.031µs)
1.866ms 2018-10-22T17:16:44.983070-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (92.89µs)
1.171ms 2018-10-22T17:16:44.984241-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (91.938µs)
1.070ms 2018-10-22T17:16:44.985311-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (63.891µs)
snip
0.235ms 2018-10-22T17:16:45.132587-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (13.787µs)
0.230ms 2018-10-22T17:16:45.132817-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (13.686µs)
0.230ms 2018-10-22T17:16:45.133047-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (13.713µs)
0.223ms 2018-10-22T17:16:45.133270-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (12.83µs)
0.230ms 2018-10-22T17:16:45.133500-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (13.425µs)
0.237ms 2018-10-22T17:16:45.133737-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (13.567µs)
0.227ms 2018-10-22T17:16:45.133964-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (13.677µs)
snip
0.242ms 2018-10-22T17:16:45.700502-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (17.311µs)
0.077ms 2018-10-22T17:16:45.700579-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (12.017µs)
1.175ms 2018-10-22T17:16:45.701754-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (14.477µs)
0.238ms 2018-10-22T17:16:45.701992-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (18.31µs)
0.077ms 2018-10-22T17:16:45.702069-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (11.934µs)
1.193ms 2018-10-22T17:16:45.703262-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (13.968µs)
0.304ms 2018-10-22T17:16:45.703566-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (18.147µs)
0.096ms 2018-10-22T17:16:45.703662-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (15.203µs)
1.134ms 2018-10-22T17:16:45.704796-07:00 DEBUG localhos[16972]: crosvm[97]: [devices/src/virtio/gpu/mod.rs:384] Some(TransferToHost3d) -> OkNoData (15.196µs)
The entire elapsed wall-clock time from the first to the last transfer was 1011.936 ms. The total cumulative time spent decoding and processing gpu commands was 63.469 ms. As a fraction of the overall wall-clock time, only 6.27% was spent actually decoding, processing, or calling into virgl_renderer_transfer_write_iov (i.e. very little).
➜ awk '{print $NF}' /tmp/consecutive-transfers.txt | perl -ne 's/^\D+(\d+\.\d+).*/\1/; $total += $1; printf "%s: %s\n", $1, $total' | tail -1
26.146: 63469.584
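For reference, the 6.27% figure is just the summed per-command time divided by the elapsed wall-clock span. A small sketch of the same arithmetic (a hypothetical helper, assuming the trailing "(NN.NNNµs)" field in consecutive-transfers.txt is the per-command duration):

#!/usr/bin/env python3
# Sum the per-command durations and express them as a fraction of the
# wall-clock span (hypothetical cross-check of the numbers quoted above).
import re

total_us = 0.0
with open("/tmp/consecutive-transfers.txt", encoding="utf-8") as f:
    for line in f:
        m = re.search(r"\(([\d.]+)µs\)\s*$", line)
        if m:
            total_us += float(m.group(1))

wall_clock_ms = 1011.936      # first-to-last transfer span reported above
busy_ms = total_us / 1000.0   # expected ~63.469 ms
print("busy: %.3f ms (%.2f%% of %.3f ms)" % (busy_ms, 100.0 * busy_ms / wall_clock_ms, wall_clock_ms))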
Distribution of processing times, in microseconds, 10µs buckets. Fairly small variance; most of what variance there is occurs right at the beginning, just after the previous non-TransferToHost3d command:
➜ maphimbu -s 1 </tmp/tt2 -d 10
10 3115
20 268
30 108
40 13
50 40
60 17
70 17
80 5
90 2
100 1
170 1
240 1
Distribution of invocation-to-invocation intervals, in microseconds, 100µs buckets. A lot more variation, including many in the 1000µs range (when the actual work being done is <100µs).
➜ maphimbu -s 1 </tmp/tt3 -d 100
0 96
100 1453
200 1443
300 187
400 74
500 68
600 17
700 18
800 21
900 15
1000 11
1100 98
1200 79
1300 6
1400 1
1800 1
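(maphimbu is not a publicly available tool; for context, the two tables above are plain fixed-width histograms of "bucket floor, count". A sketch of equivalent bucketing, assuming one microsecond value per input line and a bucket width matching -d:)

#!/usr/bin/env python3
# Bucket one numeric value (microseconds) per stdin line into fixed-width bins
# and print "bucket_floor count" pairs, like the histograms above.
import sys
from collections import Counter

width = int(sys.argv[1]) if len(sys.argv) > 1 else 10  # e.g. 10 or 100 us buckets
counts = Counter()
for line in sys.stdin:
    line = line.strip()
    if line:
        counts[int(float(line) // width) * width] += 1

for bucket in sorted(counts):
    print(bucket, counts[bucket])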
My interpretation of this data is that the majority of the time is being spent outside of virgl and the specific TransferToHost3d logic: either in the crosvm virtio queue or main virtio gpu loop, the guest kernel, or guest userspace.
Nov 2
Nov 9
Another datapoint: I ran WebGL Aquarium in Chrome launched from the container and was seeing about 13-15 fps with 500 fish, vs ~55-60 fps when run from the Chrome OS instance of Chrome. There's obviously a hit here, but it's nothing like the slowdown experienced in supertuxkart. Texture upload speed has also been split off into another bug: crbug.com/898725.
Nov 14
I filed a bug on the WebGL Aquarium slowness (it probably affects all GPU-intensive apps): https://gitlab.freedesktop.org/virgl/virglrenderer/issues/57
Dec 5
Comment 1 by davidri...@chromium.org, Oct 18