
Issue 798943


Issue metadata

Status: Verified
Owner:
Closed: Jan 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , Mac , Fuchsia
Pri: 3
Type: Bug

Blocking:
issue 796178




Add tool to measure zlib inflate / deflate performance

Project Member Reported by noel@chromium.org, Jan 4 2018

Issue description

We have image_decode_bench to help assess PNG decode/encode perf when making changes to third_party/zlib. However, there's no similar tool to measure the decode/encode perf of GZIP/ZLIB-wrapped DEFLATE streams. We should add one to help vet changes to zlib perf.

We could use the snappy test harness [1] as a basis, and make it a self-contained binary, one that any Chromium developer could compile from zlib/BUILD.gn for their Chrome build target (e.g., Android), and run the resulting binary on a target device.

[1] https://gist.github.com/nigeltao/a3ef1761ad7339fac70a12a802467d56

Idea from issue  crbug.com/796178#c4  
 

Comment 1 Deleted

Comment 2 Deleted

Comment 3 Deleted

Comment 4 by noel@chromium.org, Jan 4 2018


Comment 5 by noel@chromium.org, Jan 4 2018

Status: Started (was: Untriaged)
Patch uploaded https://chromium-review.googlesource.com/850652 and an example:

% ninja -C out/Release zlib_bench
% ./out/Release/zlib_bench gzip ../../snappy/testdata/*
../../snappy/testdata/alice29.txt        :
GZIP: [b 1M] bytes 152089 ->  54397 35.8% comp  17.1 MB/s uncomp 370.8 MB/s
../../snappy/testdata/asyoulik.txt       :
GZIP: [b 1M] bytes 125179 ->  48927 39.1% comp  15.5 MB/s uncomp 337.3 MB/s
../../snappy/testdata/baddata1.snappy    :
GZIP: [b 1M] bytes  27512 ->  22920 83.3% comp  40.5 MB/s uncomp 155.1 MB/s
../../snappy/testdata/baddata2.snappy    :
GZIP: [b 1M] bytes  27483 ->  23000 83.7% comp  40.6 MB/s uncomp 160.0 MB/s
../../snappy/testdata/baddata3.snappy    :
GZIP: [b 1M] bytes  28384 ->  23705 83.5% comp  39.4 MB/s uncomp 159.4 MB/s
../../snappy/testdata/fireworks.jpeg     :
GZIP: [b 1M] bytes 123093 -> 122927 99.9% comp  45.6 MB/s uncomp 1046.9 MB/s
../../snappy/testdata/geo.protodata      :
GZIP: [b 1M] bytes 118588 ->  15124 12.8% comp  97.6 MB/s uncomp 925.7 MB/s
../../snappy/testdata/html               :
GZIP: [b 1M] bytes 102400 ->  13707 13.4% comp  68.7 MB/s uncomp 801.5 MB/s
../../snappy/testdata/html_x_4           :
GZIP: [b 1M] bytes 409600 ->  53285 13.0% comp  65.8 MB/s uncomp 795.8 MB/s
../../snappy/testdata/kppkn.gtb          :
GZIP: [b 1M] bytes 184320 ->  38727 21.0% comp  10.1 MB/s uncomp 487.4 MB/s
../../snappy/testdata/lcet10.txt         :
GZIP: [b 1M] bytes 426754 -> 144867 33.9% comp  17.3 MB/s uncomp 427.2 MB/s
../../snappy/testdata/paper-100k.pdf     :
GZIP: [b 1M] bytes 102400 ->  81276 79.4% comp  46.3 MB/s uncomp 355.7 MB/s
../../snappy/testdata/plrabn12.txt       :
GZIP: [b 1M] bytes 481861 -> 195102 40.5% comp  14.3 MB/s uncomp 379.1 MB/s
../../snappy/testdata/urls.10K           :
GZIP: [b 1M] bytes 702087 -> 222358 31.7% comp  39.0 MB/s uncomp 395.0 MB/s

Project Member Comment 6 by bugdroid1@chromium.org, Jan 21 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/875ad5e3c08831f1efa91adade7fc350b1ef45bc

commit 875ad5e3c08831f1efa91adade7fc350b1ef45bc
Author: Noel Gordon <noel@chromium.org>
Date: Sun Jan 21 11:19:12 2018

zlib_bench: measure zlib encode and decode performance

Add a tool for measuring encode/decode performance of gzip, zlib,
and raw data encoded in DEFLATE compressed format.

Given a file containing any data, encode (compress) it into gzip,
zlib, or raw DEFLATE format (selected from the command line) then
decode (uncompress) the DEFLATE data.

Verify that the file data and the uncompressed data match. Output
the median and maximum encoding and decoding rates in MB/s.

Bug:  798943 
Change-Id: I6729a8e875452c6656bd16d5c798f5d1f3c12689
Reviewed-on: https://chromium-review.googlesource.com/850652
Commit-Queue: Noel Gordon <noel@chromium.org>
Reviewed-by: Chris Blume <cblume@chromium.org>
Cr-Commit-Position: refs/heads/master@{#530780}
[modify] https://crrev.com/875ad5e3c08831f1efa91adade7fc350b1ef45bc/third_party/zlib/BUILD.gn
[add] https://crrev.com/875ad5e3c08831f1efa91adade7fc350b1ef45bc/third_party/zlib/contrib/bench/zlib_bench.cc

Project Member Comment 7 by bugdroid1@chromium.org, Jan 23 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/df50cac817b0e344c35aa3bbb8426cc65816fba9

commit df50cac817b0e344c35aa3bbb8426cc65816fba9
Author: Noel Gordon <noel@chromium.org>
Date: Tue Jan 23 13:58:48 2018

zlib bench: use fstream operator bool for failure detection

watk@ suggested off-line that the best way to test for an fstream
file failure is to use its operator bool: so use that.

Also #include <time.h> is not being used: so ditch it.

Bug:  798943 
Change-Id: I04fb23c0041da51face99055a2f482ba26bd23ca
Reviewed-on: https://chromium-review.googlesource.com/879861
Reviewed-by: Chris Watkins <watk@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Cr-Commit-Position: refs/heads/master@{#531222}
[modify] https://crrev.com/df50cac817b0e344c35aa3bbb8426cc65816fba9/third_party/zlib/contrib/bench/zlib_bench.cc

Comment 8 by noel@chromium.org, Jan 30 2018

Status: Fixed (was: Started)
So, how to use this tool. To compile in a Chrome checkout, run:

  % ninja -C out/Release zlib_bench

and then run it with some test data:

  % out/Release/zlib_bench gzip|zlib|raw ../../snappy/testdata/*

To run on Android, cross-compile from Linux with the gn arg target_os = "android" (see the Android Chrome build instructions). Push the binary to your Android device:

  % adb push out/Release/zlib_bench /data/local/tmp

and push the test data:

  % adb push ../../snappy/testdata/* /data/local/tmp/snappy

Shell into the Android device and run:

  % adb shell
  dragon: ./data/local/tmp/zlib_bench gzip|zlib|raw ./data/local/tmp/snappy/*

Comment 9 by noel@chromium.org, Jan 31 2018

Cc: cavalcantii@chromium.org
Status: Started (was: Fixed)
This tool also resolves various measurement issues that came up on issue 760853:

 - setting LD_LIBRARY_PATH to point at Chrome's zlib was error-prone
    due to the way Chrome name-mangles zlib symbols
 - the snappy benchmark, https://github.com/google/snappy, was hard to
    drive: developers reported snappy results, rather than zlib results

The tool incorporates the core snappy benchmark verbatim, for comparison purposes with that benchmark, and builds against Chrome's zlib so that we can compare zlib changes before/after. No special scripts are needed (see #8), especially for Android.

Comment 10 by noel@chromium.org, Jan 31 2018

Status: Fixed (was: Started)
Interesting paper to read [1] if you have time. Nice stuff, but I'm not sure if the conclusions presented therein still hold, given the speed improvements we have made to Chromium's zlib. We have shown that DEFLATE can be a lot faster (like 2x faster for content-encoding: gzip decoding) on modern CPUs.

[1] https://cran.r-project.org/web/packages/brotli/vignettes/brotli-2015-09-22.pdf

Comment 11

Quite an interesting paper, thanks for sharing it.

Comment 12 by noel@chromium.org, Jan 31 2018

Yes, interesting. zlib uses Huffman encoding, and it's an optimal method (it achieves the Shannon entropy bound, meaning you cannot do better) when the source data symbol probabilities are powers of 1/2:

   P(source data symbol) = 1 / 2^k, for k = 1, 2, 3, ...

Huffman encoding is also pretty good for sources with other symbol probabilities, though it won't achieve the entropy bound. To get there, other methods are needed.

One is Arithmetic encoding: entropy-optimal, but it failed to gain traction in practice for other reasons (slow encoding speed compared to Huffman, patent concerns, etc).

More recently, Finite State Entropy (FSE) encoders have appeared. They combine the optimality of Arithmetic encoding for arbitrary source probabilities with a Huffman-like compression/decompression stage. Speed is very good as a result, and they are now starting to appear in the public domain.

FSE decoding is very similar to Huffman decoding when implementations of both are compared. A reasonable question to ask is: could Huffman decoding be tweaked to match or better FSE decoding speed? The answer appears to be yes [1][2].

[1] http://fastcompression.blogspot.com.au/2015/07/huffman-revisited-part-1.html
[2] http://fastcompression.blogspot.com.au/2015/07/huffman-revisited-part-2-decoder.html

Since the core algorithm of zlib DEFLATE is Huffman coding, these results suggest good performance gains could be had by similarly tweaking zlib's Huffman encode/decode implementations. One possible downside is that such improvements might also require a change to the DEFLATE format specification (RFC 1951). Probably not too hard to overcome, but something to consider.

Anyhow, one good place to compare all these fancy new compressors seems to be https://quixdb.github.io/squash-benchmark and, of course, zlib DEFLATE is always featured and compared against, due to its Huffman optimal encoding guarantees, industry acceptance, etc. However, the zlib used there is a "vanilla" zlib -- Chrome's zlib is not "vanilla", and our results

  https://docs.google.com/spreadsheets/d/15b0-iT0sXB5_d8yN--y48dRhySkeFe3frjWIv7_Vivw/edit#gid=595603569

over the snappy corpus (it's similar to the Canterbury corpus) show a 2.17x decode speed improvement for content-encoding: gzip in the median. If Chrome's zlib were included in https://quixdb.github.io/squash-benchmark, I wonder where it would sit in all those comparison graphs ...

Comment 13 by noel@chromium.org, Feb 1 2018

Blocking: 796178

Comment 14 by noel@chromium.org, Feb 5 2018

Status: Verified (was: Fixed)
> Since the core algorithm of zlib DEFLATE is Huffman, ...

I had a good look over the zlib Huffman code, and I didn't see anything where I thought we could do better. The improvements already made to the code (chunk copy, which speeds up writing Huffman-decoded data to the output), and in progress (Nigel's idea: improve speed when reading data into the Huffman decoder), tell us that zlib Huffman on modern CPUs is I/O bound. Improve that, and very good decode speed improvements result, it seems.
