Investigate zstd for JS source string compression
Issue description: zstd is faster and/or smaller than zlib, per online benchmarks (see https://quixdb.github.io/squash-benchmark/unstable/ for instance). Investigate it for JS source string compression.
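The zlib side of such a comparison can be reproduced locally. A minimal sketch using Python's stdlib zlib as a stand-in for Chromium's copy (the JS source below is hypothetical filler, and absolute numbers will not match Chrome's optimized builds):

```python
import time
import zlib

# Hypothetical stand-in for a JS source string; any large, repetitive text works.
js_source = ("function add(a, b) { return a + b; }\n" * 2000).encode("utf-8")

start = time.perf_counter()
compressed = zlib.compress(js_source, level=6)  # level 6 is zlib's default
elapsed_ms = (time.perf_counter() - start) * 1000

ratio = len(js_source) / len(compressed)
print(f"{len(js_source)} B -> {len(compressed)} B "
      f"(ratio {ratio:.1f}x) in {elapsed_ms:.2f} ms")

# Round-trip check: decompression must restore the original source.
assert zlib.decompress(compressed) == js_source
```

The same loop can be repeated with a zstd binding at an equivalent level to get a like-for-like ratio/speed comparison.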
Dec 12
zstd is pretty awesome and should have outstanding performance. My 2 cents on the discussion: a) Just ensure that you are validating the results against Chrome's zlib (as that has had extensive optimization work on both ARM and x86). b) It may be interesting to *also* validate the results on ARM devices (especially since they are the ones where memory pressure may be more significant).
Dec 13
Yes, absolutely :) Before making any decision, we will likely reproduce the brotli evaluation with zstd, that is, using the same parameters we use in Chrome, and testing on both ARM and x86. See https://bugs.chromium.org/p/chromium/issues/detail?id=907489 for details regarding brotli.
Dec 13
Nice! Just double-checking: when you build zstd 'with the default options', which zlib will it link against? The system's zlib or Chromium's zlib?

Another consideration: contributing to zstd requires signing the Facebook CLA (https://github.com/facebook/zstd/blob/dev/CONTRIBUTING.md). There is a patent clause in the license* that is a major blocker for us at ARM to be able to contribute to it (unlike zlib or brotli). Maybe that is not a concern for Google or the Chromium project, but I thought it would be worth pointing out.

*The patent clause: "You hereby grant to Facebook and to recipients of software distributed by Facebook a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import..."
Jan 17 (5 days ago)
Did the same benchmarking done for zlib and brotli with zstd as well. See the full data and results attached.

For Googlers, link to the full data: https://colab.corp.google.com/drive/177uyTNTlpx3H0MoLuqTmC0tsbp_kQ-01#scrollTo=q_RUrVD0AuH3
For others, the same data is attached to this update.

tl;dr: With ~the same compression ratio as zlib, zstd is:
- Faster for compression and decompression on Linux x86_64
- Faster for compression, and slower for decompression, on Android (Pixel 3XL)

The compression speed difference is large on both Android and Linux, but decompression is significantly slower on Android (283 MB/s vs. 198 MB/s for a 128 kB chunk size, for instance). This is likely due to the optimization work done for zlib on ARM, as we used Chromium's defaults for all libraries.

Together with the binary size cost of zstd (which needs to be evaluated in more detail, but is non-trivial), zstd is not necessarily attractive on Android, but may be on desktop for JS string compression.
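The per-chunk setup that the 128 kB figure refers to can be sketched as follows. This is a minimal illustration assuming fixed-size chunks compressed independently (function names are illustrative, not V8's actual API); independent chunks are what allow decompressing a single piece of source on demand:

```python
import zlib

CHUNK_SIZE = 128 * 1024  # 128 kB, one of the chunk sizes measured above

def compress_in_chunks(data: bytes, level: int = 6) -> list[bytes]:
    """Compress each chunk independently, so any chunk can later be
    decompressed on its own without touching the others."""
    return [zlib.compress(data[i:i + CHUNK_SIZE], level)
            for i in range(0, len(data), CHUNK_SIZE)]

def decompress_chunk(chunks: list[bytes], index: int) -> bytes:
    """Restore a single chunk on demand."""
    return zlib.decompress(chunks[index])

# Filler data standing in for ~1.1 MB of JS source.
source = b"console.log('hello');\n" * 50000
chunks = compress_in_chunks(source)
restored = b"".join(decompress_chunk(chunks, i) for i in range(len(chunks)))
assert restored == source
```

The chunk size trades memory savings against decompression latency: smaller chunks mean less work per on-demand access, but a slightly worse ratio, since each chunk starts with an empty dictionary.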
Jan 17 (5 days ago)
Fascinating data, thanks for sharing it. One question: for zlib compression, which compression level are you using? I noticed in my tests that the sweet spot seems to be level 3, when factoring in compression speed vs. compression ratio.
Jan 17 (5 days ago)
Never mind, I just read it in the report: "Zlib: level 6 (default)". Would it be possible to repeat the experiment with level 3?
Jan 17 (5 days ago)
> This is likely due to the optimization work done for zlib on ARM, as we used Chromium's defaults for all libraries.

Yes, all the patches contributed by ARM pretty much doubled the performance of zlib in decompression. For further details:
a) Zlib only: https://goo.gl/vaZA9o
b) PNG decoding: https://docs.google.com/presentation/d/1vX3Ue2RRLM4Wuopx4QuFJyvSDuxCNHJ-7ZxJpqwdeHQ/edit#slide=id.g36f45ded64_0_0
Jan 17 (5 days ago)
For compression, I only worked on it for 3 weeks, but was able to improve it on average by 36% compared to vanilla zlib. Data: https://goo.gl/qLVdvh
Jan 18 (4 days ago)
re: #8: Attached are results with various compression levels. Indeed, level 3 is much faster to compress with than 6, but at the cost of a lower compression ratio. We may use 3 on low-end Android then, thanks for the tip!
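The level trade-off described above is easy to see with stock zlib. A rough sketch with stdlib zlib and filler data (absolute numbers are illustrative only, and Chromium's ARM-optimized zlib will shift them further):

```python
import time
import zlib

# Hypothetical filler standing in for JS source.
payload = ("var x = 0; // filler\n" * 5000).encode("utf-8")

for level in (1, 3, 6, 9):
    start = time.perf_counter()
    out = zlib.compress(payload, level)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"level {level}: {len(out):6d} B in {elapsed_ms:.2f} ms")
```

Higher levels spend more time searching for longer back-references, so the output shrinks (or stays equal) as the level rises while compression time grows.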
Jan 18 (4 days ago)
@lizeb: thanks for repeating the experiment and sharing the data, I really appreciate it. The reason the suggested compression level is so much faster is that some of the optimizations I've implemented for ARM (i.e. insert_string_arm()) are in the fast path (i.e. deflate_fast), which is used for the lower compression levels. I haven't looked in depth at the slow path; my feeling is that it could potentially be improved. To be quite honest, I've asked myself whether it was worth the effort, given that most of the time the browser uses zlib the other way around (i.e. decompressing gzipped webpages and decoding PNGs), which is where I dedicated more time to improvements. It is quite interesting to learn that compression may be a relevant use case for Chromium's usage of zlib.

A few questions:
a) May I ask how much RAM you are able to save by compressing JS source strings?
b) Does it allow more tabs to be kept open?
c) How does it interact with the use of zram on CrOS and Android?
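To illustrate the fast-path mechanism mentioned above: deflate's insert_string hashes the next few input bytes into a head table so that earlier occurrences can be found quickly, and insert_string_arm() speeds this up by hashing 4 bytes at once with a CRC32 instruction. A toy Python sketch of that idea (names, table size, and use of a dict are illustrative, not zlib's actual internals):

```python
import zlib
from typing import Optional

HASH_BITS = 15
HASH_MASK = (1 << HASH_BITS) - 1

def insert_string(head: dict, window: bytes, pos: int) -> Optional[int]:
    """Hash the 4 bytes at `pos`, record `pos` under that hash, and
    return the previous position that hashed there (a match candidate)."""
    h = zlib.crc32(window[pos:pos + 4]) & HASH_MASK
    prev = head.get(h)
    head[h] = pos
    return prev

window = b"abcdXXXXabcd"
head: dict = {}
assert insert_string(head, window, 0) is None  # first time we see "abcd"
assert insert_string(head, window, 8) == 0     # later "abcd" points back to pos 0
```

Since this runs once per input position in deflate_fast, shaving cycles here (as the CRC32-based ARM version does) directly speeds up the lower compression levels.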
Jan 18 (4 days ago)
Another (obvious) reason for level 3 being way faster is that it simply does less work during compression than, say, level 6.
Today (9 hours ago)
Another question: while running inside the browser, what is the priority of the compression task? I'm not familiar with how background tasks are scheduled, but if they are low priority, the kernel may schedule them on a little core (in a big.LITTLE system). If that is the case, the compression speed numbers will be lower (e.g. around 1.5x to 1.9x slower) than the data you observed.
Comment 1 by lizeb@chromium.org, Dec 7