Discovered while doing some size measurements that gzip is particularly bad at compressing small amounts of data. E.g. if you compress 1000 15-byte strings separately vs. compressing them all as one string, the difference in total size is 30-40%.
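A minimal sketch of how this kind of measurement could be reproduced with Python's standard gzip module. The random lowercase strings and the helper names (make_strings, separate_size, combined_size) are assumptions for illustration, not the data or tooling used for the numbers above, so the exact ratio will differ.

```python
import gzip
import random
import string


def make_strings(count=1000, length=15, seed=0):
    """Build `count` pseudo-random lowercase ASCII strings (hypothetical test data)."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase
    return [
        "".join(rng.choice(alphabet) for _ in range(length)).encode("ascii")
        for _ in range(count)
    ]


def separate_size(strings):
    """Total bytes when every string is gzipped as its own stream."""
    return sum(len(gzip.compress(s)) for s in strings)


def combined_size(strings):
    """Bytes when all strings are concatenated and gzipped as one stream."""
    return len(gzip.compress(b"".join(strings)))


if __name__ == "__main__":
    data = make_strings()
    print("raw bytes:        ", sum(len(s) for s in data))
    print("gzipped separately:", separate_size(data))
    print("gzipped combined:  ", combined_size(data))
```

Note that each gzip stream also carries a fixed header and trailer (at least 18 bytes), so part of any gap measured this way is framing overhead rather than compression quality.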
My theory is that this happens because gzip does not have a predefined dictionary to draw references from, and so has to build up a new dictionary for each string.
Brotli, on the other hand, has a web-optimized predefined dictionary, so it should be quite good at compressing shorter strings.
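To get a feel for how much a predefined dictionary can help on short inputs, here is a rough sketch using the preset-dictionary support (zdict) in Python's zlib module. This is not Brotli and not gzip framing: it uses the zlib container, which allows a preset dictionary, as a stand-in for the idea. The dictionary contents, the sample string and the helper names are made up for illustration.

```python
import zlib

# Hypothetical dictionary of byte sequences we expect to see in the inputs.
SAMPLE_DICT = b"content-type: text/html; charset=utf-8"


def zlib_size(data, zdict=None):
    """Compressed size of `data` in zlib format, optionally with a preset dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return len(comp.compress(data) + comp.flush())


def roundtrip_with_dict(data, zdict):
    """Compress and decompress with the same dictionary; both sides must share it."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    packed = comp.compress(data) + comp.flush()
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(packed) + decomp.flush()


if __name__ == "__main__":
    short = b"text/html; char"  # 15 bytes, all present in SAMPLE_DICT
    print("without dictionary:", zlib_size(short))
    print("with dictionary:   ", zlib_size(short, SAMPLE_DICT))
```

The catch with this approach is that the decompressor needs the exact same dictionary, which Brotli sidesteps by shipping its dictionary as part of the format.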
One of the arguments for using gzip rather than Brotli was that gzip might diff better for patch updates, thanks to its --rsyncable flag. In practice, though, I don't think we have any resources that are large enough for that to matter.
Comment 1 by benhenry@chromium.org, Aug 1