tune mojo datapipe used for blob and service worker script reading
Issue description

We have some reports that reading a response provided from cache_storage is slower than a response loaded from networking's http cache. I asked mek@ about it today and he pointed out that networking tunes its mojo datapipe to a 512KB buffer, while blob reading appears to be using the default 64KB buffers. Also, it looks like the blob reading code fills the entire pipe buffer before notifying the renderer that any data is available. This could result in some ping-pong behavior between the processes with reduced parallelism. To address this we can also tune the blob reading code to read in chunk sizes smaller than the pipe buffer. This bug will involve adding code to tune these values for blob reading and then running some finch experiments to see if they improve performance measures.
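For reference, a minimal sketch of the kind of tuning being discussed: create the data pipe with a larger capacity than the 64KB default, and copy blob bytes into it in chunks smaller than the pipe so the consumer can start reading before the whole buffer is full. This is illustrative only; the helper names are made up and the mojo C++ signatures have shifted between Chromium revisions, so do not treat it as the exact code.

#include <algorithm>
#include <cstddef>
#include <cstdint>

#include "mojo/public/cpp/system/data_pipe.h"

// Create a byte-stream data pipe with a tuned capacity (e.g. 512KB)
// instead of relying on the 64KB default.
bool CreateTunedDataPipe(mojo::ScopedDataPipeProducerHandle* producer,
                         mojo::ScopedDataPipeConsumerHandle* consumer) {
  MojoCreateDataPipeOptions options;
  options.struct_size = sizeof(MojoCreateDataPipeOptions);
  options.flags = 0;                        // no special flags
  options.element_num_bytes = 1;            // byte stream
  options.capacity_num_bytes = 512 * 1024;  // tuned up from the 64KB default
  return mojo::CreateDataPipe(&options, producer, consumer) == MOJO_RESULT_OK;
}

// Push |size| bytes into the pipe in |chunk_size| pieces (e.g. 32KB) so the
// renderer is signalled as soon as the first chunk lands, rather than only
// after the whole pipe buffer has been filled.
void WriteInChunks(mojo::DataPipeProducerHandle producer,
                   const char* data,
                   size_t size,
                   size_t chunk_size) {
  size_t offset = 0;
  while (offset < size) {
    uint32_t num_bytes =
        static_cast<uint32_t>(std::min(chunk_size, size - offset));
    MojoResult result = producer.WriteData(data + offset, &num_bytes,
                                           MOJO_WRITE_DATA_FLAG_NONE);
    if (result == MOJO_RESULT_SHOULD_WAIT)
      continue;  // real code would wait on a mojo::SimpleWatcher here
    if (result != MOJO_RESULT_OK)
      return;    // peer closed or error
    offset += num_bytes;
  }
}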
Sep 6
This synthetic stress test: https://cache-api-perf.glitch.me/parallel.html?jobs=20&entries=100&type=path&bodies=20&reports=20

Shows this result for the current default on my VM:
  matchesPerSec - min: 1737 mean: 1972 max: 2264
  bodyMegaBytesPerSec - min: 110 mean: 124 max: 137

With the 512KB pipe buffer and 32KB chunk size:
  matchesPerSec - min: 503 mean: 628 max: 670
  bodyMegaBytesPerSec - min: 169 mean: 209 max: 223

It seems we are able to better saturate the read rate with the tuned values. In this test that leaves less time for cache matching, so the matchesPerSec drops.
Sep 7
This same problem seems to affect service worker script loading as well:

https://cs.chromium.org/chromium/src/content/browser/service_worker/service_worker_installed_script_reader.cc?l=121&rcl=b1b8eb459ad1d34ffe35818f1229c14dc0018628

I'll look at tuning that here as well unless the service worker team would prefer to do it in a separate bug. Loading these scripts should be very similar to reading scripts out of cache_storage blobs.
Sep 8
Another optimization we can make here is to create and start filling the response DataPipe immediately on the service worker thread: https://chromium-review.googlesource.com/c/chromium/src/+/1214684

In the current code the service worker sends the GotResponse() mojo message back to the main thread, and the main thread then starts to load the blob. If the main thread is busy running JS, the start of the blob read can be delayed for long periods. By starting the read immediately we can at least fill the DataPipe while the main thread is janking.
Sep 8
Note the CL in #c5 basically disables side data reading. I'm not sure of the best way to support that. It would be a lot easier if the side data was read via a DataPipe instead of a separate message that produces a buffer callback.
Sep 10
(re-posting from my chromium account) re #c5: I'm a bit worried that reading the blob data in the worker thread potentially makes the service worker thread busier. Instead of reading it on the service worker thread, ServiceWorkerSubresourceLoader (which dispatches a FetchEvent to the service worker thread from a background thread) might be able to call ReadAll on the background thread and send the datapipe back to the main thread.
Sep 10
I split #c5 out into a separate bug. See bug 882425.
Sep 13
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8

commit 25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8
Author: Ben Kelly <wanderview@chromium.org>
Date: Thu Sep 13 20:29:39 2018

Tune the mojo::DataPipe capacity and read chunk size when reading blobs.

We have reports from some sites that data loaded from cache_storage via a service worker is slower than when loaded from http cache. One possible explanation is that the mojo::DataPipes used for reading the BlobDataHandle from cache_storage are not tuned like the DataPipes used by the network loading path. In the network loading path they use 512KB pipes and read in 32KB chunks.

This CL attempts to tune the various DataPipes used to read BlobDataHandle objects. It also includes some DataPipes in the service worker code that are read in a similar way; for loading scripts, ReadableStreams, etc.

This CL uses 512KB pipes and 64KB read chunks as the default as those were the best values in local testing. It also adds features to control these values in a future finch trial. We can compare different values against heartbeat metrics and site-specific metrics.

R=kinuko@chromium.org, mek@chromium.org, rkaplow@chromium.org

Bug: 881141
Change-Id: If8d956ad195c6cf85547f1c7172f4a35972513e4
Reviewed-on: https://chromium-review.googlesource.com/1211739
Reviewed-by: Robert Kaplow <rkaplow@chromium.org>
Reviewed-by: Marijn Kruisselbrink <mek@chromium.org>
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Commit-Queue: Ben Kelly <wanderview@chromium.org>
Cr-Commit-Position: refs/heads/master@{#591135}

[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/content/browser/cache_storage/cache_storage_blob_to_disk_cache.cc
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/content/browser/service_worker/service_worker_installed_script_reader.cc
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/content/common/service_worker/service_worker_loader_helpers.cc
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/storage/DEPS
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/storage/browser/blob/mojo_blob_reader.cc
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/testing/variations/fieldtrial_testing_config.json
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/third_party/blink/common/blob/blob_utils.cc
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/third_party/blink/public/common/blob/blob_utils.h
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/third_party/blink/renderer/core/fetch/form_data_bytes_consumer.cc
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/third_party/blink/renderer/core/fileapi/file_reader_loader.cc
[modify] https://crrev.com/25b9a8e66efcad49b254d4c0a1b09cb9a9c8a6b8/third_party/blink/renderer/modules/service_worker/fetch_respond_with_observer.cc
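For anyone following along, the finch-controllable knobs a CL like this adds are typically declared as a base::Feature plus base::FeatureParam entries. A rough sketch of that pattern follows; the feature and parameter names below are made up for illustration, the real ones live in the files touched by the CL above.

#include <cstdint>

#include "base/feature_list.h"
#include "base/metrics/field_trial_params.h"

// Hypothetical feature + params mirroring the kind of knobs described above.
const base::Feature kTunedBlobDataPipe{"TunedBlobDataPipe",
                                       base::FEATURE_DISABLED_BY_DEFAULT};

constexpr base::FeatureParam<int> kBlobPipeCapacityBytes{
    &kTunedBlobDataPipe, "pipe_capacity_bytes", 512 * 1024};
constexpr base::FeatureParam<int> kBlobReadChunkBytes{
    &kTunedBlobDataPipe, "read_chunk_bytes", 64 * 1024};

// Callers consult this instead of hard-coding 64KB, so a finch config can
// try other values without shipping a new binary.
uint32_t GetBlobPipeCapacity() {
  if (!base::FeatureList::IsEnabled(kTunedBlobDataPipe))
    return 64 * 1024;  // pre-existing default
  return static_cast<uint32_t>(kBlobPipeCapacityBytes.Get());
}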
Sep 14
Note, these changes mainly help in the SW s13n mode. The legacy mode uses a different blob loading approach which gets the larger data pipes already.
Sep 14
I need to investigate the performance regression next week. Maybe the test is doing something unusual, or maybe there is really just increased overhead from creating larger data pipes. If it's the latter, maybe I can make a CL that creates the data pipes with either exactly the right capacity for the target blob, or the feature-configured capacity if the blob is too big. This would be better for resource usage as well, so I should probably do this anyway.
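To make the "size the pipe to the blob" idea concrete, the clamp would look roughly like this (hypothetical helper, not the landed code):

#include <algorithm>
#include <cstdint>

// Use exactly the blob's size as the pipe capacity when the blob is small,
// and fall back to the feature-configured capacity (e.g. 512KB) when the
// blob is too big to fit in one pipe buffer.
uint32_t ChoosePipeCapacity(uint64_t blob_size, uint32_t configured_capacity) {
  if (blob_size >= configured_capacity)
    return configured_capacity;
  // Avoid a zero-capacity pipe for empty blobs.
  return static_cast<uint32_t>(std::max<uint64_t>(blob_size, 1u));
}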
Sep 18
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14d78b1f640000
Sep 18
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/14d78b1f640000
Sep 24
Marking this fixed for now. If we do a finch trial to further tune the values I will open a new launch bug for the trial.
Comment 1 by wanderview@chromium.org, Sep 6

I did some initial experiments. I populated a cache_storage entry with a Response of different sizes and then ran this in the console:

var c = await caches.open("foo");
var results = [];
async function doTest() {
  var r = await c.match("/");
  var start = performance.now();
  var t = await r.text();
  results.push(performance.now() - start);
}
for (let i = 0; i < 20; ++i) {
  await doTest();
}
results.join("\n")

I then put the results in a spreadsheet for both our current default settings and a datapipe tuned to a 512KB buffer with 32KB chunks: https://docs.google.com/spreadsheets/d/12VVrsY8zWVNhgbHo1C8-7QDMu6Y_sxvH4RElA4OIluA/edit?usp=sharing

On my VM instance there is no statistically significant difference for smaller response bodies. At larger sizes like 255MB, though, there is a very strong improvement: read times drop from a median of 2.4 seconds to 1.5 seconds. Seems worth pursuing.