Add tool to measure zlib inflate / deflate performance
Issue description

We have image_decode_bench to help assess PNG decode/encode perf when making changes to third_party/zlib. However, there's no similar tool to measure GZIP/ZLIB wrapped DEFLATE stream decode/encode perf. We should add one to help vet changes to zlib perf.

Could use the snappy test harness [1] as a basis, and make it a self-contained binary: one that any chromium developer could compile from zlib/BUILD.gn for their chrome build target (e.g., Android), and run the resulting binary on a target device.

[1] https://gist.github.com/nigeltao/a3ef1761ad7339fac70a12a802467d56

Idea from issue crbug.com/796178#c4
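For context, the operations such a tool would need to time are roughly the following (a minimal sketch, not the eventual tool; zlib selects the stream wrapper via windowBits: 15 + 16 for GZIP, 15 for ZLIB, -15 for raw DEFLATE, and error handling here is reduced to asserts):

// Sketch of gzip/zlib/raw DEFLATE encode and decode with the zlib API.
#include <cassert>
#include <cstring>
#include <vector>
#include "zlib.h"

using Data = std::vector<unsigned char>;

// window_bits selects the wrapper: 15 + 16 = GZIP, 15 = ZLIB, -15 = raw.
Data Compress(const Data& in, int window_bits) {
  z_stream s;
  memset(&s, 0, sizeof(s));
  int ret = deflateInit2(&s, Z_DEFAULT_COMPRESSION, Z_DEFLATED, window_bits,
                         8, Z_DEFAULT_STRATEGY);
  assert(ret == Z_OK);
  Data out(deflateBound(&s, in.size()));  // Worst-case output size.
  s.next_in = const_cast<unsigned char*>(in.data());
  s.avail_in = static_cast<uInt>(in.size());
  s.next_out = out.data();
  s.avail_out = static_cast<uInt>(out.size());
  ret = deflate(&s, Z_FINISH);
  assert(ret == Z_STREAM_END);
  out.resize(s.total_out);
  deflateEnd(&s);
  return out;
}

Data Uncompress(const Data& in, size_t original_size, int window_bits) {
  z_stream s;
  memset(&s, 0, sizeof(s));
  int ret = inflateInit2(&s, window_bits);
  assert(ret == Z_OK);
  Data out(original_size);  // The benchmark knows the original size.
  s.next_in = const_cast<unsigned char*>(in.data());
  s.avail_in = static_cast<uInt>(in.size());
  s.next_out = out.data();
  s.avail_out = static_cast<uInt>(out.size());
  ret = inflate(&s, Z_FINISH);
  assert(ret == Z_STREAM_END);
  inflateEnd(&s);
  return out;
}

The benchmark proper would wrap timing around these calls and verify the round trip.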
Jan 4 2018
Jan 4 2018
Patch uploaded https://chromium-review.googlesource.com/850652 and an example:

% ninja -C out/Release zlib_bench
% ./out/Release/zlib_bench gzip ../../snappy/testdata/*
../../snappy/testdata/alice29.txt : GZIP: [b 1M] bytes 152089 -> 54397 35.8% comp 17.1 MB/s uncomp 370.8 MB/s
../../snappy/testdata/asyoulik.txt : GZIP: [b 1M] bytes 125179 -> 48927 39.1% comp 15.5 MB/s uncomp 337.3 MB/s
../../snappy/testdata/baddata1.snappy : GZIP: [b 1M] bytes 27512 -> 22920 83.3% comp 40.5 MB/s uncomp 155.1 MB/s
../../snappy/testdata/baddata2.snappy : GZIP: [b 1M] bytes 27483 -> 23000 83.7% comp 40.6 MB/s uncomp 160.0 MB/s
../../snappy/testdata/baddata3.snappy : GZIP: [b 1M] bytes 28384 -> 23705 83.5% comp 39.4 MB/s uncomp 159.4 MB/s
../../snappy/testdata/fireworks.jpeg : GZIP: [b 1M] bytes 123093 -> 122927 99.9% comp 45.6 MB/s uncomp 1046.9 MB/s
../../snappy/testdata/geo.protodata : GZIP: [b 1M] bytes 118588 -> 15124 12.8% comp 97.6 MB/s uncomp 925.7 MB/s
../../snappy/testdata/html : GZIP: [b 1M] bytes 102400 -> 13707 13.4% comp 68.7 MB/s uncomp 801.5 MB/s
../../snappy/testdata/html_x_4 : GZIP: [b 1M] bytes 409600 -> 53285 13.0% comp 65.8 MB/s uncomp 795.8 MB/s
../../snappy/testdata/kppkn.gtb : GZIP: [b 1M] bytes 184320 -> 38727 21.0% comp 10.1 MB/s uncomp 487.4 MB/s
../../snappy/testdata/lcet10.txt : GZIP: [b 1M] bytes 426754 -> 144867 33.9% comp 17.3 MB/s uncomp 427.2 MB/s
../../snappy/testdata/paper-100k.pdf : GZIP: [b 1M] bytes 102400 -> 81276 79.4% comp 46.3 MB/s uncomp 355.7 MB/s
../../snappy/testdata/plrabn12.txt : GZIP: [b 1M] bytes 481861 -> 195102 40.5% comp 14.3 MB/s uncomp 379.1 MB/s
../../snappy/testdata/urls.10K : GZIP: [b 1M] bytes 702087 -> 222358 31.7% comp 39.0 MB/s uncomp 395.0 MB/s
Jan 21 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/875ad5e3c08831f1efa91adade7fc350b1ef45bc

commit 875ad5e3c08831f1efa91adade7fc350b1ef45bc
Author: Noel Gordon <noel@chromium.org>
Date: Sun Jan 21 11:19:12 2018

zlib_bench: measure zlib encode and decode performance

Add a tool for measuring encode/decode performance of gzip, zlib, and raw data encoded in DEFLATE compressed format. Given a file containing any data, encode (compress) it into gzip, zlib, or raw DEFLATE format (selected from the command line) then decode (uncompress) the DEFLATE data. Verify that the file data and the uncompressed data match. Output the median and maximum encoding and decoding rates in MB/s.

Bug: 798943
Change-Id: I6729a8e875452c6656bd16d5c798f5d1f3c12689
Reviewed-on: https://chromium-review.googlesource.com/850652
Commit-Queue: Noel Gordon <noel@chromium.org>
Reviewed-by: Chris Blume <cblume@chromium.org>
Cr-Commit-Position: refs/heads/master@{#530780}

[modify] https://crrev.com/875ad5e3c08831f1efa91adade7fc350b1ef45bc/third_party/zlib/BUILD.gn
[add] https://crrev.com/875ad5e3c08831f1efa91adade7fc350b1ef45bc/third_party/zlib/contrib/bench/zlib_bench.cc
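For the curious, the median/maximum reporting the commit message describes can be done with a simple loop over repeated runs. A sketch (RateMBs, BenchFile, and kRuns are illustrative names, with Compress/Uncompress as in the sketch in the description above; this is not the tool's actual code):

#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdio>
#include <vector>

// Time one call of |fn| over |bytes| of input; return the rate in MB/s.
template <typename Fn>
double RateMBs(Fn fn, size_t bytes) {
  auto start = std::chrono::steady_clock::now();
  fn();
  std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return (bytes / (1024.0 * 1024.0)) / elapsed.count();
}

// Compress/uncompress |data| several times; print median and max rates.
void BenchFile(const std::vector<unsigned char>& data, int window_bits) {
  constexpr int kRuns = 9;
  std::vector<double> comp, uncomp;
  std::vector<unsigned char> packed, unpacked;
  for (int i = 0; i < kRuns; ++i) {
    comp.push_back(
        RateMBs([&] { packed = Compress(data, window_bits); }, data.size()));
    uncomp.push_back(RateMBs(
        [&] { unpacked = Uncompress(packed, data.size(), window_bits); },
        data.size()));
    assert(unpacked == data);  // Verify the round trip, per the commit.
  }
  std::sort(comp.begin(), comp.end());
  std::sort(uncomp.begin(), uncomp.end());
  printf("comp median %.1f (max %.1f) MB/s, uncomp median %.1f (max %.1f) MB/s\n",
         comp[kRuns / 2], comp.back(), uncomp[kRuns / 2], uncomp.back());
}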
Jan 23 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/df50cac817b0e344c35aa3bbb8426cc65816fba9

commit df50cac817b0e344c35aa3bbb8426cc65816fba9
Author: Noel Gordon <noel@chromium.org>
Date: Tue Jan 23 13:58:48 2018

zlib bench: use fstream operator bool for failure detection

watk@ suggested off-line that the best way to test for an fstream file failure is to use its operator bool: so use that. Also #include <time.h> is not being used: so ditch it.

Bug: 798943
Change-Id: I04fb23c0041da51face99055a2f482ba26bd23ca
Reviewed-on: https://chromium-review.googlesource.com/879861
Reviewed-by: Chris Watkins <watk@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Cr-Commit-Position: refs/heads/master@{#531222}

[modify] https://crrev.com/df50cac817b0e344c35aa3bbb8426cc65816fba9/third_party/zlib/contrib/bench/zlib_bench.cc
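For reference, the operator bool pattern being discussed looks like this (a generic illustration with a made-up ReadFileOrExit helper, not the file's exact code):

#include <cstdlib>
#include <fstream>
#include <string>

// Read an entire file, exiting on failure. An fstream's operator bool
// converts to false if any prior operation on the stream failed.
std::string ReadFileOrExit(const char* name) {
  std::ifstream file(name, std::ios::binary | std::ios::ate);
  if (!file)  // operator bool: the open failed.
    exit(1);
  std::string data(static_cast<size_t>(file.tellg()), '\0');
  file.seekg(0);
  file.read(&data[0], static_cast<std::streamsize>(data.size()));
  if (!file)  // operator bool again: the read failed or was short.
    exit(1);
  return data;
}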
Jan 30 2018
So how to use this tool. To compile in a chrome checkout run:

% ninja -C out/Release zlib_bench

and then run it with some test data:

% out/Release/zlib_bench gzip|zlib|raw ../../snappy/testdata/*

To run on Android, cross-compile from linux with gn target_os="android" (see the Android chrome build instructions). Push the binary to your Android device:

% adb push out/Release/zlib_bench data/local/tmp

and push the test data:

% adb push ../../snappy/testdata/* data/local/tmp/snappy

Shell into the Android device and run:

% adb shell
dragon: ./data/local/tmp/zlib_bench gzip|zlib|raw ./data/local/tmp/snappy/*
Jan 31 2018
This tool also resolves various measurement issues that came up on issue 760853:

- setting LD_LIBRARY_PATH to point at chrome's zlib was error-prone due to the way chrome name mangles zlib symbols
- the snappy benchmark, https://github.com/google/snappy, was hard to drive: developers reported snappy results, rather than zlib results

The tool incorporates the core snappy benchmark verbatim, for comparison purposes with that benchmark, and builds against chrome's zlib so that we can compare zlib changes before/after. No special scripts are needed (see #8), esp. for Android.
Jan 31 2018
Interesting paper to read [1] if you have time. Nice stuff, but I'm not sure the conclusions presented therein still hold, given the speed improvements we have made to chromium's zlib. We have shown that DEFLATE can be a lot faster (like 2x faster for content-encoding: gzip decoding) on modern CPUs.

[1] https://cran.r-project.org/web/packages/brotli/vignettes/brotli-2015-09-22.pdf
Jan 31 2018
Quite an interesting paper, thanks for sharing it.
Jan 31 2018
Yes, interesting. zlib uses Huffman encoding, and it's an optimal method (it achieves the Shannon entropy bound, meaning you cannot do better) when the source data symbol probabilities are negative powers of two: P(source data symbol) = 1 / 2^k, for k = 1, 2, 3, ...

Huffman encoding is also pretty good for sources with other symbol probabilities, though it won't be Shannon optimal. To get there, other methods are needed. One is Arithmetic encoding: Shannon optimal, but it failed to gain traction in practice for other reasons (slow encoding speed when compared to Huffman, patent concerns, etc).

More recently, Finite State Entropy (FSE) encoders have appeared. They combine the optimality of Arithmetic encoding for arbitrary source probabilities with a Huffman-like compression/decompression stage. Speed is very good as a result, and they are now starting to appear in the public domain. FSE decoding is very similar to Huffman decoding when implementations of both are compared. A reasonable question to ask is: could Huffman decoding be tweaked to match or better FSE decoding speed? The answer appears to be yes [1] [2].

[1] http://fastcompression.blogspot.com.au/2015/07/huffman-revisited-part-1.html
[2] http://fastcompression.blogspot.com.au/2015/07/huffman-revisited-part-2-decoder.html

Since the core algorithm of zlib DEFLATE is Huffman, these results suggest good performance gains could be had by similarly tweaking zlib's Huffman encode/decode implementations. One possible downside of such improvements is they might also require a change in the DEFLATE format specification (RFC 1951). Probably not too hard to overcome, but something to consider.

Anyhow, one good place to compare all these fancy new compressors seems to be https://quixdb.github.io/squash-benchmark and, of course, zlib DEFLATE always features there, and is compared against, due to its Huffman encoding optimality guarantees, industry acceptance, etc. However, the zlib used is a "vanilla" zlib -- chrome's zlib is not "vanilla", and our results https://docs.google.com/spreadsheets/d/15b0-iT0sXB5_d8yN--y48dRhySkeFe3frjWIv7_Vivw/edit#gid=595603569 over the snappy corpus (it's similar to the Canterbury corpus) show a 2.17x decode speed improvement for content-encoding: gzip in the median. If chrome's zlib was included at https://quixdb.github.io/squash-benchmark, I wonder where it would sit in all those comparison graphs ...
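To make the power-of-two point concrete: for symbol probabilities {1/2, 1/4, 1/8, 1/8}, Huffman assigns code lengths {1, 2, 3, 3}, and the average code length hits the entropy bound exactly. A tiny check:

#include <cmath>
#include <cstdio>

// For probabilities that are negative powers of two, Huffman code lengths
// equal -log2(p), so the average code length meets the Shannon entropy.
int main() {
  const double p[] = {0.5, 0.25, 0.125, 0.125};
  const int huffman_len[] = {1, 2, 3, 3};  // Lengths Huffman assigns here.
  double entropy = 0, avg_len = 0;
  for (int i = 0; i < 4; ++i) {
    entropy += p[i] * -std::log2(p[i]);
    avg_len += p[i] * huffman_len[i];
  }
  printf("entropy = %.2f bits/symbol, Huffman average = %.2f bits/symbol\n",
         entropy, avg_len);  // Both print 1.75: the bound is met exactly.
  return 0;
}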
Feb 5 2018
> Since the core algorithm of zlib DEFLATE is Huffman, ...

I had a good look over the zlib Huffman code, and I didn't see anything in the code where I thought we could do better. The improvements already made to the code (chunk copy, which speeds up writing Huffman decoded data to the output), and in progress (Nigel's idea: improve speed when reading data into the Huffman decoder), tell us that zlib Huffman decoding on modern CPUs is I/O bound. Improve that, and very good decode speed improvements result, it seems.
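To illustrate the chunk copy idea (a simplified sketch of the general technique, not zlib's actual patch): when inflate emits an LZ77 match, copying in fixed 8-byte stores instead of byte-at-a-time cuts the number of memory operations, which is exactly where an I/O-bound loop spends its time. The sketch assumes the caller guarantees at least 7 bytes of writable slack past the match end:

#include <cstddef>
#include <cstring>

// Simplified chunk copy for an LZ77 match during decoding. |out| points
// one past the last byte written; the match starts |dist| bytes back and
// is |len| bytes long. The caller must guarantee >= 7 writable bytes of
// slack past out + len.
unsigned char* CopyMatch(unsigned char* out, size_t dist, size_t len) {
  const unsigned char* from = out - dist;
  unsigned char* const end = out + len;
  if (dist >= 8) {
    // The 8-byte chunks never overlap the bytes being read: copy wide.
    // This may write up to 7 bytes past |end|; the slack covers that.
    while (out < end) {
      memcpy(out, from, 8);
      out += 8;
      from += 8;
    }
  } else {
    // Overlapping match (e.g. dist == 1 replicates a single byte): a
    // wide copy would read bytes not yet written, so copy bytewise.
    while (out < end)
      *out++ = *from++;
  }
  return end;
}

The real chunk copy work also handles the small-distance cases with wider/SIMD stores; the sketch just shows why fewer, wider stores help a decode loop that is limited by memory traffic rather than by Huffman symbol decoding itself.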