Optimize Adler-32 checksum
Issue description
Loading PNGs requires verifying the uncompressed data with an Adler-32 checksum (in zlib). This check could be made much faster using SIMD instructions.
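For readers unfamiliar with the checksum, here is a minimal Python sketch of Adler-32 as defined in RFC 1950, plus a block-wise reformulation that illustrates why the loop lends itself to SIMD. It is not the code from the patches linked in this issue: the names `adler32_scalar` and `adler32_blocked` and the 16-byte block size are assumptions chosen for illustration. Python's built-in `zlib.adler32` serves as the reference.

```python
import zlib

MOD = 65521  # largest prime below 2**16, per RFC 1950


def adler32_scalar(data: bytes, adler: int = 1) -> int:
    """Byte-at-a-time reference: s1 is a running byte sum, s2 sums every s1."""
    s1 = adler & 0xFFFF
    s2 = (adler >> 16) & 0xFFFF
    for byte in data:
        s1 = (s1 + byte) % MOD
        s2 = (s2 + s1) % MOD
    return (s2 << 16) | s1


def adler32_blocked(data: bytes, adler: int = 1, block: int = 16) -> int:
    """Same checksum, processed one block at a time (hypothetical block size).

    For a block of n bytes b[0..n-1]:
        s1' = s1 + sum(b)
        s2' = s2 + n * s1 + sum((n - i) * b[i])
    Both sums are independent per byte, which is the shape a SIMD
    multiply-accumulate can compute in a few instructions.
    """
    s1 = adler & 0xFFFF
    s2 = (adler >> 16) & 0xFFFF
    for start in range(0, len(data), block):
        chunk = data[start:start + block]
        n = len(chunk)
        # First byte of the block is counted n times in s2, the last byte once.
        weighted = sum((n - i) * b for i, b in enumerate(chunk))
        s2 = (s2 + n * s1 + weighted) % MOD
        s1 = (s1 + sum(chunk)) % MOD
    return (s2 << 16) | s1


if __name__ == "__main__":
    sample = bytes(range(256)) * 4
    print(hex(adler32_scalar(sample)), hex(adler32_blocked(sample)),
          hex(zlib.adler32(sample)))
```

All three values printed by the usage lines should agree. A real vectorized implementation would also defer the modulo across many blocks, which this sketch does not attempt.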
,
Feb 4 2017
,
Feb 4 2017
,
Feb 4 2017
I'm using the test case available at: http://codepen.io/Savago/pen/VPeQaX
Below is a screenshot of a trace collected on a Nexus 4.
,
Feb 4 2017
The trace files.
,
Feb 4 2017
Disclaimers: this is only 1 data point (on 1 device) for 1 large PNG file (850KB). YMMV.
Comparing the CPU self time:
>>> 1 - (188.463/232.320)
0.18877840909090904
or roughly an 18.9% improvement.
Comparing the 'Wall Duration':
>>> 1 - (191.821/277.552)
0.30888265982590657
or roughly a 30.9% improvement.
The result will certainly also vary with how the image was encoded (i.e. how many times the checksum is called and the length of the byte array being checked), not to mention whether the code is running on a big or little core, the CPU frequency, thermal state, the behavior of EAS (Energy Aware Scheduler), etc.
,
Feb 4 2017
Ideally I would like to repeat the test on an ARMv8 device (e.g. Pixel).
,
Feb 4 2017
Initial patch at: https://codereview.chromium.org/2676493007/
,
Feb 6 2017
,
Mar 18 2017
,
Apr 3 2017
For reference, applying all 3 patches linked to this issue yields a performance boost of around 40% in PNG image decoding speed (on a Nexus 6). The image shows traces for a test page (http://codepen.io/Savago/pen/VPeQaX) comparing vanilla Chromium m59 against a patched build (with the 3 optimizations that came out of the PNG investigation: Adler32, inflate_fast, palette). The time spent decoding the image dropped from 116.187ms to 73.844ms (a reduction of about 36%, i.e. roughly 1.57x faster). Interestingly, GPUImageDecodeCache::DecodeImage() now takes longer to execute (94.025ms) than the actual image decode in ImageFrameGenerator::decode(). Is the image cache compressed? I wonder if we could make it faster. The Adler32 optimization was submitted to zlib-ng and is awaiting review/merge.
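To make the percentages in this thread easy to re-check, here is a small sketch that applies the same formula used in the earlier REPL session (1 - after/before) to the timings quoted so far. The helper name `relative_reduction` and the dictionary labels are illustrative assumptions. The timings themselves are copied from the comments above.

```python
def relative_reduction(before_ms: float, after_ms: float) -> float:
    """Fraction of time saved: 1 - after/before (same formula as the REPL session)."""
    return 1.0 - after_ms / before_ms


# (before, after) pairs in milliseconds, copied from the comments in this issue.
timings = {
    "Nexus 4, CPU self time": (232.320, 188.463),
    "Nexus 4, wall duration": (277.552, 191.821),
    "Nexus 6, image decode (all 3 patches)": (116.187, 73.844),
}

if __name__ == "__main__":
    for label, (before, after) in timings.items():
        saved = relative_reduction(before, after)
        speedup = before / after
        print(f"{label}: {saved:.1%} less time, {speedup:.2f}x faster")
```

Reporting both the time saved and the speedup factor avoids ambiguity, since the same pair of timings gives different percentages depending on which one is used as the baseline.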
,
Apr 12 2017
Update: the optimization was submitted to canonical zlib at https://github.com/madler/zlib/pull/251
,
Apr 25 2017
zlib-ng has merged the optimization in their development branch: https://github.com/Dead2/zlib-ng/commit/ec02ecf104e1d3f1836a908a359f20aa93494df5
,
Aug 11 2017
,
Aug 11 2017
,
Aug 11 2017
Traces collected on a Pixel (Snapdragon 821).
,
Aug 11 2017
All patches combined (Pixel, Snapdragon 821).
,
Aug 12 2017
Updated patch: https://chromium-review.googlesource.com/c/611492
,
Sep 11 2017
,
Sep 12 2017
,
Oct 3 2017
Merged with: https://bugs.chromium.org/p/chromium/issues/detail?id=762564 Code landed on: https://chromium.googlesource.com/chromium/src/+/09b784fd12f255a9da38107ac6e0386f4dde6d68
,
Oct 3 2017
Comment 1 by cavalcantii@chromium.org, Feb 4 2017
Owner: cavalcantii@chromium.org
Status: Started (was: Untriaged)