Issue metadata
Sign in to add a comment
|
Migrate after successful probing may cause a connection being closed with INTERNAL_ERROR |
||||||||||||||||||||||
Issue description
Recent data collected from cronet applications indicates that there's higher rate of INTERNAL_ERROR in experiment turning on connection migration on path degrading.
This is caused by QuicChromiumClientSession::OnProbeNetworkSucceeded which is called on the stack of QuicConnection::ProcessUdpPacket() when processing the probing response packet on the new path. QCCS::OnProbeNetworkSucceeded will call QCCS::MigrateToSocket, which eventually calls QuicChromiumPacketReader::StartReading. If there are synchronous reads, this will then run into a corner case: the outmost flusher is living on the stack of ProcessUdpPacket for the initial probing response packet, and there are new packets being process before the stack returns. If an ACK is queued, by some of the synchronous read, and a subsequent synchronous read will blow up in QuicConnection::OnUnauthenticatedHeader()[1].
So the stack looks like follows:
QuicChromiumPacketReader::ProcessReadResult() for packet 1: probing response in probing reader
QuicConnection::ProcessUdpPacket() for packet 1
ScopedPacketFlusher flusher <------ outmost flusher
QuicFramer::ProcessPacket for packet 1
QuicChromiumClientSession::OnConnectivityProbeReceived
...
QuicChromiumClientSession::OnProbeNetworkSucceeded
QuicChromiumClientSession::MigrateToSocket()
QuicChromiumPacketReader::StartReading return sync'ly
QuicChromiumPacketReader::ProcessReadResult for packet 2
...
ScopedPacketFlusher flusher <--- inner flusher, no flush
.... cause ACK frame queued
QuicChromiumPacketReader::ProcessReadResult retunr for packet 2
QuicChromiumPacketReader::StartReading return sync'ly
QuicChromiumPacketReader::ProcessReadResult for packet 3
...
QuicConnection::OnUnauthenticatedHeader blows up for packet 3
< return Quic
[1]https://cs.chromium.org/chromium/src/net/third_party/quic/core/quic_connection.cc?gsn=ProcessUnauthenticatedHeader&g=0&l=684
,
Aug 10
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d3d5f5023f75dfda721f62fb40e5ee80e3ce16e3 commit d3d5f5023f75dfda721f62fb40e5ee80e3ce16e3 Author: Zhongyi Shi <zhongyi@chromium.org> Date: Fri Aug 10 00:22:22 2018 Do not force a started reader to start reading when migrate to a probing socket. A QuicChromiumPacketReader::StartReading() should only be called once. It will read again automatically when the previous read result has been processed. Calling StartReading on a started reader may break the invariant and cause bugs like http://crbug.com/872011 Bug: 872011 Change-Id: I6dde0a3daacb9ed9b6d7d4781de3c132e0a6aad7 Reviewed-on: https://chromium-review.googlesource.com/1166340 Commit-Queue: Zhongyi Shi <zhongyi@chromium.org> Reviewed-by: Ryan Hamilton <rch@chromium.org> Cr-Commit-Position: refs/heads/master@{#581977} [modify] https://crrev.com/d3d5f5023f75dfda721f62fb40e5ee80e3ce16e3/net/quic/quic_chromium_client_session.cc [modify] https://crrev.com/d3d5f5023f75dfda721f62fb40e5ee80e3ce16e3/net/quic/quic_chromium_client_session.h [modify] https://crrev.com/d3d5f5023f75dfda721f62fb40e5ee80e3ce16e3/net/quic/quic_chromium_client_session_test.cc [modify] https://crrev.com/d3d5f5023f75dfda721f62fb40e5ee80e3ce16e3/net/quic/quic_stream_factory_test.cc
,
Aug 14
Change on Comment 2 landed in 70.0.3518.0. The fix will be picked up by Cronet release.
,
Aug 16
The NextAction date has arrived: 2018-08-16
,
Aug 16
Should we merge back to M69?
,
Aug 16
Good point! Cronet releases really frequently, but not the gmscore?
,
Aug 17
,
Aug 17
This bug requires manual review: M69 has already been promoted to the beta branch, so this requires manual review Please contact the milestone owner if you have questions. Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Aug 17
Pls apply appropriate OSs label. And also provide merge justification, why it is needed and how safe it is to merge to M69 this late in release cycle.
,
Aug 17
The code technically only affects Android as Android is the only platform supports connection migration. And this change only potentially affects Cronet users who are enrolled in the experiment. This change addresses a corner case and should not cause new crash as QUIC connection will close the connection if there're errors during process read result. We'd like to request a fix to be merged to M69 as we expect this feature will be released to GMSCore users where gmscore release cycle is relatively long.
,
Aug 18
Approving for merge to M69, branch 3497.
,
Aug 21
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible! If all merges have been completed, please remove any remaining Merge-Approved labels from this issue. Thanks for your time! To disable nags, add the Disable-Nags label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Aug 21
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/527f9e15f6942c29270888f5cf415911917c6df2 commit 527f9e15f6942c29270888f5cf415911917c6df2 Author: Zhongyi Shi <zhongyi@chromium.org> Date: Tue Aug 21 23:53:50 2018 Do not force a started reader to start reading when migrate to a probing socket. A QuicChromiumPacketReader::StartReading() should only be called once. It will read again automatically when the previous read result has been processed. Calling StartReading on a started reader may break the invariant and cause bugs like http://crbug.com/872011 (cherry picked from commit d3d5f5023f75dfda721f62fb40e5ee80e3ce16e3) Bug: 872011 Change-Id: I6dde0a3daacb9ed9b6d7d4781de3c132e0a6aad7 Reviewed-on: https://chromium-review.googlesource.com/1166340 Commit-Queue: Zhongyi Shi <zhongyi@chromium.org> Reviewed-by: Ryan Hamilton <rch@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#581977} Reviewed-on: https://chromium-review.googlesource.com/1184305 Reviewed-by: Zhongyi Shi <zhongyi@chromium.org> Cr-Commit-Position: refs/branch-heads/3497@{#758} Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753} [modify] https://crrev.com/527f9e15f6942c29270888f5cf415911917c6df2/net/quic/chromium/quic_chromium_client_session.cc [modify] https://crrev.com/527f9e15f6942c29270888f5cf415911917c6df2/net/quic/chromium/quic_chromium_client_session.h [modify] https://crrev.com/527f9e15f6942c29270888f5cf415911917c6df2/net/quic/chromium/quic_chromium_client_session_test.cc [modify] https://crrev.com/527f9e15f6942c29270888f5cf415911917c6df2/net/quic/chromium/quic_stream_factory_test.cc
,
Oct 24
|
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by zhongyi@chromium.org
, Aug 7