Possible performance limitation for uploads over QUIC |
||
Issue descriptionLack of buffering QuicChromiumClientStream::WriteStreamData() might be hurting uploads performance As discussed with rch@. Contrasting with TCP which has a kernel socket buffer, and associated thresholds for accepting new data into the buffer. Chromium QUIC code has a pretty minimal amount of buffering. Specifically QuicChromiumClientStream::WriteStreamData() returns ERR_IO_PENDING the first time WriteOrBufferData() indicates data was buffered. It's quite likely that the size of data blocks supplied to WriteStreamData() are small relative to a typical TCP kernel socket buffer. In the TCP case, the buffer serves to reduce user/kernel mode switches, saving CPU. In the QUIC case, there is an analog in thread switches between source of the data (e.g. UploadDataStream) and the network thread. Also since UploadDataStream uses a fairly naive scheme of alternating between reads and writes, the lack of buffering in QUIC could limit the ability of UploadDataStream to amortize the impact of slow reads from storage.
,
Apr 6 2017
For multiple reasons, I've been wanting to move data ownership(the equivalent of a kernel send buffer) into the QUIC streams. This would allow larger writes from applications, which are typically more efficient, and could even enable zero-copy( prior to crypto). This would mean StreamFrames no longer own data, which would remove one extra copy and allocation from the send path and would be great for the future of multipath. So ideally I'd like to fix this on the client and server side, since I believe it can benefit both.
,
Apr 6 2017
For buffering, I was thinking the overall buffer would be enough for any outstanding unacked data + the size of the last application write. ie: we'd consume whatever data the application had for us all at once, and then tell it we're full. I think this is a bit like TCP lowat, but with a very small low water mark?
,
Apr 6 2017
As rch@ pointed out, this line of code is pertinent: https://codesearch.chromium.org/chromium/src/net/quic/chromium/quic_http_stream.cc?rcl=e6a1fd7a7f2fc50b1fa4311ebd75d43195d69d00&l=294 So, it would seem that ~14KB is the upper bound currently on data buffered in the QUIC write path.
,
Apr 7 2017
Early results, it seems like this can improve throughput of CPU bound uploads by about 20%. Based on uploading a 5GiB file to drive from my workstation, 5 uploads each at buffer sizes of 10, 128, 256, 512, 1024 * kMaxSizePacket. In these tests, the best improvement was reached with 256 * kMaxSizePacket. The same code space likely has another issue if/when there are storage latencies that would benefit from more overlap with network writes. Adding a kind of watermark to trigger stream writes earlier would help there.
,
Apr 7 2017
Excellent progress! As a straw-man for the overlap, I would suggest we invoke callback_ from OnWrite() when the buffer drops below 50%. (Not necessarily 50% of capacity, but 50% of whatever it was at when it was last filled)
,
Apr 7 2017
Awesome find!
,
Apr 7 2017
Sounds like great progress. Can you summarize what the differences in bandwidth were at different sizes? Can you include 64 as a size as well?
,
Apr 7 2017
Average Bandwidths (Mbps) were: 286, 337, 356, 339, 339 I'll get numbers for 64x tomorrow.
,
Apr 7 2017
Did some more runs, including 64x. 64x did quite well, managing average 343 Mpbs, and 17% speedup over 10x. 64x and above consistently outperform the status quo 10x by at least 15%. The gains beyond 64x have a good deal of variance. 256 looks like the sweet spot, but not with a lot of confidence.
,
Apr 11 2017
,
Jul 21 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d4619709908bcfb3746813b656ff80dc242a9388 commit d4619709908bcfb3746813b656ff80dc242a9388 Author: Charles 'Buck' Krasic <ckrasic@chromium.org> Date: Fri Jul 21 21:52:55 2017 QUIC - increase buffer size to improve upload performance. Increase the size of QuicHttpStream::raw_request_body_buf_ based on the size of the body. For large bodies and fast networks, this give a nice reduction in CPU (or boost throughput if CPU bound) (~10-20%). R=rch@chromium.org Bug: 709144 Change-Id: If4fc69f9f89866b75b6d09d728e53a7bfa2054e4 Reviewed-on: https://chromium-review.googlesource.com/581877 Reviewed-by: Buck Krasic <ckrasic@chromium.org> Reviewed-by: Ryan Hamilton <rch@chromium.org> Commit-Queue: Buck Krasic <ckrasic@chromium.org> Cr-Commit-Position: refs/heads/master@{#488765} [modify] https://crrev.com/d4619709908bcfb3746813b656ff80dc242a9388/net/quic/chromium/quic_http_stream.cc |
||
►
Sign in to add a comment |
||
Comment 1 by ckrasic@chromium.org
, Apr 6 2017