Fetch large files in parallel via multiple TCP streams
Issue description

There were multiple reports of extremely slow CIPD file fetches from non-ideal networks, while gsutil worked pretty well on the same networks. There are two significant differences between gsutil and cipd:

1. gsutil opens N TCP streams and fetches the file from multiple offsets in parallel (CIPD doesn't).
2. gsutil uses Python's TCP library (and likely HTTP 1.1); cipd is in Go (and uses HTTP 2.0).

Ideally, we should learn to reproduce this behavior (perhaps using a crappy network simulator) and then try to fix it. One concern is that with HTTP 2.0 there's a high chance "multiple" HTTP streams will in fact share one TCP connection :)
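For reference, the gsutil-style approach boils down to issuing one HTTP Range request per chunk and reassembling the chunks in order. A minimal Go sketch of what that could look like in a client like cipd — function names are made up, and retries, resume, and servers that ignore Range are left out; the httptest server stands in for a GCS signed URL:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// chunkBounds splits [0, size) into up to n contiguous [start, end] byte ranges.
func chunkBounds(size int64, n int) [][2]int64 {
	chunk := (size + int64(n) - 1) / int64(n)
	var out [][2]int64
	for start := int64(0); start < size; start += chunk {
		end := start + chunk - 1
		if end >= size {
			end = size - 1
		}
		out = append(out, [2]int64{start, end})
	}
	return out
}

// fetchRanged downloads url with one Range request per chunk, all in
// parallel, writing each chunk directly into its slot of the output buffer.
func fetchRanged(url string, size int64, streams int) ([]byte, error) {
	out := make([]byte, size)
	var wg sync.WaitGroup
	errs := make(chan error, streams)
	for _, r := range chunkBounds(size, streams) {
		wg.Add(1)
		go func(start, end int64) {
			defer wg.Done()
			req, _ := http.NewRequest("GET", url, nil)
			req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
			resp, err := http.DefaultClient.Do(req)
			if err != nil {
				errs <- err
				return
			}
			defer resp.Body.Close()
			if _, err := io.ReadFull(resp.Body, out[start:end+1]); err != nil {
				errs <- err
			}
		}(r[0], r[1])
	}
	wg.Wait()
	close(errs)
	if err := <-errs; err != nil { // first error, or nil if the channel is empty
		return nil, err
	}
	return out, nil
}

func main() {
	blob := make([]byte, 1<<20)
	for i := range blob {
		blob[i] = byte(i)
	}
	// Local stand-in for the signed URL; http.ServeContent honors Range.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeContent(w, r, "blob", time.Time{}, bytes.NewReader(blob))
	}))
	defer srv.Close()
	got, err := fetchRanged(srv.URL, int64(len(blob)), 4)
	if err != nil || !bytes.Equal(got, blob) {
		panic("parallel fetch mismatch")
	}
	fmt.Println("fetched", len(got), "bytes in 4 streams")
}
```

Note that if the transport negotiates HTTP/2, these four requests may well be multiplexed over a single TCP connection, which is exactly the concern above.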
Sep 26
Downloading files using Chrome from the exact same signed URLs as used by CIPD produces better speed. (Presumably Chrome doesn't parallelize requests.) CIPD uses io.Copy(...) when reading from the TCP connection. Maybe the 32 KB buffer used by io.Copy is the limiting factor here.
Sep 26
This is reproducible using e.g. the "Very Bad Network" profile of OSX's Network Link Conditioner. Fetching 8 MB via Chrome takes 1:30; via CIPD it takes more than twice as long. I'll try different experiments now.
Sep 27
curl is as slow as CIPD (i.e. 2x slower than Chrome). I now suspect Chrome does a parallel download too (with 2 streams).
Sep 27
The following revision refers to this bug:
https://chromium.googlesource.com/infra/luci/luci-go.git/+/c1d2d308e2ee636d2ddd5d2f09f888c02b65e22a

commit c1d2d308e2ee636d2ddd5d2f09f888c02b65e22a
Author: Vadim Shtayura <vadimsh@chromium.org>
Date: Thu Sep 27 00:48:47 2018

[cipd] Add simple download speed reporter.

Looks like this:

[P50311 17:31:45.574 client.go:1380 I] cipd: resolving fetch URL for ...
[P50311 17:31:45.637 storage.go:308 I] cipd: initiating the fetch
[P50311 17:31:46.223 storage.go:275 I] cipd: about to fetch 935.2 MB
[P50311 17:31:46.223 storage.go:256 I] cipd: fetching - 0% (?? KB/s)
[P50311 17:31:51.225 storage.go:256 I] cipd: fetching - 2% (5524.9 KB/s)
[P50311 17:31:56.227 storage.go:256 I] cipd: fetching - 6% (6502.5 KB/s)
[P50311 17:32:01.228 storage.go:256 I] cipd: fetching - 9% (6469.5 KB/s)
...

R=iannucci@chromium.org, nodir@chromium.org
BUG=889653
Change-Id: Ifa441d5c6d768c617095840928210e3aa0a6fa28
Reviewed-on: https://chromium-review.googlesource.com/1247581
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Commit-Queue: Vadim Shtayura <vadimsh@chromium.org>

[modify] https://crrev.com/c1d2d308e2ee636d2ddd5d2f09f888c02b65e22a/cipd/client/cipd/client.go
[modify] https://crrev.com/c1d2d308e2ee636d2ddd5d2f09f888c02b65e22a/cipd/client/cipd/storage.go
Sep 27
I've confirmed with the aria2c tool that downloading a file in multiple (4) streams indeed speeds up the download 2x on the "Very Bad Network" profile, compared to a single stream (like cipd does). On a good WiFi network it has no effect (slightly negative in fact). Haven't tried a high-speed ethernet connection yet.

One huge downside of the parallel fetch is that we'll have to do a separate pass over the file to calculate its SHA256, since we won't be able to do it on the fly while fetching. For large files this may be quite noticeable (reading 1 GB from disk is no joke). So on fast networks it would be more efficient to stick with 1 stream and on-the-fly hash calculation, while on slow networks fetching in ~4 streams is faster.

Detecting network speed is not really trivial and takes time (if using some probes). One option is to start fetching in 1 stream, and then spawn more if the speed is less than X. This is a bit complicated to implement though :(
Sep 27
Another avenue for parallelization is to fetch different packages in parallel. They are currently fetched sequentially, under the assumption that a single stream is able to saturate the network link (which turned out to be wrong on slow networks). Downsides:

1. Won't help when fetching 1 package. This happens quite often (e.g. we often update only 1 software component, not all of them at once).
2. (Speculation) Writing to multiple files in parallel may decrease disk IO.
Oct 30
I hit this again this evening and did some more investigation. From home I get better performance with a wired machine than a WiFi machine, but in both cases they achieve well below line rate. Chrome achieves line rate against the same downloads. curl with TCP Fast Open and HTTP/2 forced on achieves line rate about 10% of the time. Starting to suspect there's some configuration issue on the GCS side that is making this issue more pronounced. Good runs are >300 Mbps; bad runs are <500 Kbps.
Comment 1 by vadimsh@chromium.org, Sep 26