New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 761483 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Sign in to add a comment

Improve QUIC Delay TCP logic to account for DNS resolution

Project Member Reported by xunji...@chromium.org, Sep 1 2017

Issue description

DNS resolution can take a long time. QUIC's "Delay TCP logic" delays a TCP job using past SRTT estimates, which do not account for DNS resolution time.
This leads to spurious TCP/TLS handshakes as seen in this NetLog (https://drive.google.com/a/google.com/file/d/0B_9WseIqRoaFZzFQSC1MZGpmb1k/view?usp=sharing)


Since TCP and QUIC jobs have their own DNS resolution paths, I wonder whether we could unifying those two. If that's not feasible, we can look into only resuming TCP job after QUIC job has finished DNS resolution.

 
Summary: Improve QUIC Delay TCP logic to account for DNS resolution (was: Improve QUIC Delay TCP logic to take account of DNS resolution)
This is filed by a Cronet consumer. The behavior is unexpected and I think it's worth looking into.

Comment 2 by rch@chromium.org, Sep 1 2017

Nice find! We (well, you, iirc) unified the proxy resolution logic into the JobController so that neither job starts until proxy resolution is complete. It seems plausible to me that we could do something similar here for host resolution. I think plumbing the host resolution results down into the socket pool layer might be tricky, but plumbing it into QUIC should be easy. So we could possibly resolve it there, and then start the jobs.

Alternatively, the transport connect job has a host_resolution_callback() that it invokes (on the Job, iirc) when it resolves the host. This callback currently checks to see if there is a SpdySession which has been created since the Transport job was created. We could extend/change this callback to also check QuicSessions, though I've not thought too deeply about how challenging this might be to implement.

We could probably plumb a similar callback down the the QuicStreamFactory Job to call back up when the host is resolved, and start/un-pause the TCP job then. (The Job would probably ping the N Requests, and each Request could ping the associated Job)

Of course, the simplest thing would be to increase the delay to compensate for the DNS resolution time. Though the attached net-internals the DNS resolution takes 6 seconds which is probably larger than any value we'd be likely to pick.
Owner: xunji...@chromium.org
Status: Assigned (was: Untriaged)
Thanks, Ryan. I am also in favor of separating out DNS resolution of TCP ConnectJob. This will help us to address Issue 761487 (where the TCP ConnectJob's 250ms timeout doesn't account for DNS resolution).

Since we are on the same page, I will mark this as triaged and assign myself. I will see if we can do approach #1 without requiring too much modification on socket pools.

I started a discussion thread on net-dev@: https://groups.google.com/a/chromium.org/d/msg/net-dev/PwMhnUDK8CE/qBh-FfA0BgAJ

Comment 5 by rch@chromium.org, Nov 10 2017

xunjieli: Can you summarize the conclusion of that thread here on the bug? Do you think it's in a state where there is a straightforward implementation?
Cc: xunji...@chromium.org
Owner: ----
Status: Available (was: Assigned)
Thanks, Ryan. Yes, in the discussion thread, we arrived at a more-or-less straightforward implementation. To summarize, making QUIC's "delay TCP" logic aware of DNS is worth investigating. The underlying problem is that DNS is slow, which is a problem that the network stack team is currently addressing (e.g. through DNS over HTTP, AsyncDNS). So this bug is of lower priority.

To investigate further, the approach is have a signaling interface for Quic to unpause TCP job if DNS is taking a long time (This is also the idea proposed by rch@ in #2). Gather UMA data to see how much a difference this improvement makes, and whether to roll out this new behavior by default.

(Un-assigning myself. I didn't find time to work on this)

Comment 7 by rch@chromium.org, Nov 13 2017

Owner: wangyix@chromium.org
wangix: this would be great to get working
Status: Assigned (was: Available)
Status: Started (was: Assigned)
Project Member

Comment 10 by bugdroid1@chromium.org, Dec 21 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/7c5d11a8ac216b6a139914029edf0a8b338b65d8

commit 7c5d11a8ac216b6a139914029edf0a8b338b65d8
Author: Yixin Wang <wangyix@chromium.org>
Date: Thu Dec 21 19:50:54 2017

Improve QUIC/TCP race logic to not start TCP job until after QUIC host resolution

Currently, HttpStreamFactoryImpl::JobController races a QUIC and a TCP job.

Modify HttpStreamFactoryImpl::JobController to only starts the TCP job after the QUIC job
finishes DNS resolution.

If the QUIC job doesn't succeed/fail immediately, then HttpStreamFactoryImpl::JobController
will start the TCP job after a set delay.However, if the QUIC job is taking too long because
DNS resolution is slow, then it makes little sense to start the TCP job since it too has to
do DNS resolution. Now, JobController will wait for the QUIC job to finish DNS resolution
before starting the delay timer, after which the TCP job will start.

Bug:  761483 
Change-Id: I61cc77073e7ed8f9261ac7fed0f5e0510bbb4aad
Reviewed-on: https://chromium-review.googlesource.com/821851
Reviewed-by: Ryan Hamilton <rch@chromium.org>
Commit-Queue: Yixin Wang <wangyix@chromium.org>
Cr-Commit-Position: refs/heads/master@{#525777}
[modify] https://crrev.com/7c5d11a8ac216b6a139914029edf0a8b338b65d8/net/http/http_stream_factory_impl_job.cc
[modify] https://crrev.com/7c5d11a8ac216b6a139914029edf0a8b338b65d8/net/http/http_stream_factory_impl_job.h
[modify] https://crrev.com/7c5d11a8ac216b6a139914029edf0a8b338b65d8/net/http/http_stream_factory_impl_job_controller_unittest.cc
[modify] https://crrev.com/7c5d11a8ac216b6a139914029edf0a8b338b65d8/net/quic/chromium/quic_network_transaction_unittest.cc
[modify] https://crrev.com/7c5d11a8ac216b6a139914029edf0a8b338b65d8/net/quic/chromium/quic_stream_factory.cc
[modify] https://crrev.com/7c5d11a8ac216b6a139914029edf0a8b338b65d8/net/quic/chromium/quic_stream_factory.h
[modify] https://crrev.com/7c5d11a8ac216b6a139914029edf0a8b338b65d8/net/quic/chromium/quic_stream_factory_test.cc

Status: Fixed (was: Started)

Sign in to add a comment