Issue 271766

Issue metadata

Status: Verified
Owner:
Closed: Aug 2013
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



tcp fast open causes hang on ChromeOS

Project Member Reported by sonnyrao@chromium.org, Aug 12 2013

Issue description

Chrome Version: R30
Chrome OS Version: R30
Chrome OS Platform: all

Note: Fast Open must be enabled with a flag; it is not on by default.

From email:

Hi Chrome Folks,

I think I found the root cause of Chrome hanging on ChromeOS with Fast
Open when SYN-data is dropped in the network. Because ChromeOS
configures NAT locally, a kernel bug causes a SYN-data followed by a
SYN retransmit to be treated as an invalid sequence, so the kernel
drops the connection (and hence the SYN retransmit). This causes all
Fast Open connections to time out and hangs the browser.

I am still in the process of pinning down the exact buggy code, but I
want to file a CrOS bug to track this. How do I do that?


 
Labels: Iteration-88
Status: Started
Project Member

Comment 2 by bugdroid1@chromium.org, Aug 13 2013

Project: chromiumos/third_party/kernel-next
Branch : chromeos-3.8
Author : Sonny Rao <sonnyrao@chromium.org>
Commit : 771fe3c60b7eead2d30eb693714c2d0662da2f2c

Code Review +1: Yuchung Cheng
Code Review +2: Paul Stewart
Verified    +1: Sonny Rao
Change-Id     : Ib6519bc715d7db114d8294c34ee4f94a548aa322
Reviewed-at   : https://gerrit.chromium.org/gerrit/65637

BACKPORT: netfilter: nf_conntrack: fix tcp_in_window for Fast Open

Currently conntrack checks whether the ending sequence of a packet
falls within the observed receive window. However, it does so even
if it has not observed any packet from the remote yet, and so uses an
uninitialized receive window (td_maxwin).

If a connection uses Fast Open to send a SYN-data packet that is
subsequently dropped in the network, the following SYN retransmits
will all fail this check and be discarded, leading to a connection
timeout. This is because the SYN retransmit does not contain a data
payload, so

end == initial sequence number (isn) + 1
sender->td_end == isn + syn_data_len
receiver->td_maxwin == 0

The fix is to only apply this check after td_maxwin is initialized.
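
A minimal sketch of the failing check and the fix, using the values listed above. It assumes the receive-window test has the form "end is after td_end - td_maxwin - 1" with 32-bit wraparound comparison; the function names and the sample numbers (isn = 1000, syn_data_len = 100) are illustrative, not taken from the kernel source.

```python
MASK32 = 0xFFFFFFFF


def seq_after(a, b):
    """True if 32-bit sequence number a is later than b (wraparound-safe)."""
    # Equivalent to interpreting (b - a) as a signed 32-bit value and testing < 0.
    return ((b - a) & MASK32) >= 0x80000000


def in_recv_window_old(end, td_end, td_maxwin):
    """Assumed pre-fix check: applied even when td_maxwin is still 0."""
    return seq_after(end, (td_end - td_maxwin - 1) & MASK32)


def in_recv_window_fixed(end, td_end, td_maxwin):
    """Post-fix behaviour: skip the check until td_maxwin is initialized."""
    return td_maxwin == 0 or in_recv_window_old(end, td_end, td_maxwin)


# Hypothetical values for a SYN retransmit after a dropped SYN-data packet.
isn = 1000
syn_data_len = 100
end = isn + 1                 # the retransmitted SYN carries no payload
td_end = isn + syn_data_len   # sender state recorded from the SYN-data
td_maxwin = 0                 # nothing seen from the remote yet

old_result = in_recv_window_old(end, td_end, td_maxwin)
fixed_result = in_recv_window_fixed(end, td_end, td_maxwin)
print(old_result, fixed_result)
```

With these values the old check rejects the retransmitted SYN and the fixed check accepts it, which matches the timeout described above.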

Reported-by: Michael Chan <mcfchan@stanford.edu>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

currently accepted on patchwork:
http://patchwork.ozlabs.org/patch/266243/

BUG= chromium:271766 
TEST=enable tcp-fast open in about:flags and verify there aren't hangs
on google.com

Commit-Queue: Sonny Rao <sonnyrao@chromium.org>

M  net/netfilter/nf_conntrack_proto_tcp.c
Status: Fixed

Comment 4 by krisr@chromium.org, Aug 13 2013

Status: Verified
Project Member

Comment 5 by bugdroid1@chromium.org, Nov 21 2013

Project: chromiumos/third_party/kernel-next
Branch : chromeos-3.8
Author : Yuchung Cheng <ycheng@google.com>
Commit : 40711f2c7e71720ef2af81f9d4b7c5aec56f5f45

Code-Review  0 : Sonny Rao, Yuchung Cheng, chrome-internal-fetch
Code-Review  +2: Paul Stewart
Commit-Queue 0 : Paul Stewart, Yuchung Cheng, chrome-internal-fetch
Commit-Queue +1: Sonny Rao
Verified     0 : Paul Stewart, Yuchung Cheng, chrome-internal-fetch
Verified     +1: Sonny Rao
Change-Id      : Ic039e2545810ffae77f67da543bf541c4de9f36b
Reviewed-at    : https://chromium-review.googlesource.com/176673

UPSTREAM: tcp: temporarily disable Fast Open on SYN timeout

Fast Open currently has a fallback feature to address SYN-data being
dropped, but it requires the middle-box to pass on the regular SYN
retry after SYN-data. This is implemented in commit aab487435
("net-tcp: Fast Open client - detecting SYN-data drops").

However, some NAT boxes drop all subsequent packets after the first
SYN-data and black-hole the entire connection. An example is in
commit 356d7d8 "netfilter: nf_conntrack: fix tcp_in_window for Fast
Open".

The sender should note such incidents and temporarily fall back to the
regular TCP handshake on subsequent attempts as well: after the second
SYN times out, the original Fast Open SYN is most likely lost. When
such an event recurs, Fast Open is disabled for a period that grows
exponentially with the number of recurrences.
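
The back-off described above could be sketched roughly as follows. This is not the kernel implementation: the class, the 60-second base interval, and the "more than one loss" threshold are assumptions chosen for illustration, and times are plain seconds rather than jiffies.

```python
from dataclasses import dataclass

BASE_BACKOFF_SECONDS = 60  # assumed base interval, for illustration only


@dataclass
class FastOpenMetrics:
    """Hypothetical per-destination record of Fast Open SYN losses."""
    syn_loss: int = 0            # number of recurring SYN timeouts
    last_syn_loss: float = 0.0   # time of the most recent SYN timeout

    def record_syn_timeout(self, now):
        """Note that a Fast Open attempt ended in a SYN timeout."""
        self.syn_loss += 1
        self.last_syn_loss = now

    def record_success(self):
        """Reset the counter once a handshake completes."""
        self.syn_loss = 0

    def fast_open_allowed(self, now):
        """Allow Fast Open unless losses recurred within the back-off window."""
        if self.syn_loss <= 1:
            return True
        # The window doubles with every additional recorded loss.
        backoff = BASE_BACKOFF_SECONDS << self.syn_loss
        return now >= self.last_syn_loss + backoff


metrics = FastOpenMetrics()
metrics.record_syn_timeout(now=0)     # first loss: Fast Open still allowed
first = metrics.fast_open_allowed(now=1)
metrics.record_syn_timeout(now=10)    # second loss: back off for 60 << 2 seconds
blocked = metrics.fast_open_allowed(now=100)
resumed = metrics.fast_open_allowed(now=250)
print(first, blocked, resumed)
```

A destination that keeps losing the Fast Open SYN is therefore retried with Fast Open less and less often, while a single isolated loss does not disable the feature.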

BUG= chromium:271766 
TEST=boot, log in, enable TCP fast open in flags, restart browser
go to google.com and other sites, no crashes

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit c968601d174739cb1e7100c95e0eb3d2f7e91bc9)
from https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git

M  net/ipv4/tcp_metrics.c
M  net/ipv4/tcp_timer.c
Project Member

Comment 6 by bugdroid1@chromium.org, Nov 21 2013

Project: chromiumos/third_party/kernel-next
Branch : chromeos-3.8
Author : Eric Dumazet <edumazet@google.com>
Commit : ac502c6aa7d4fddf53aef3ce1ace65bbfb691416

Code-Review  0 : Eric Dumazet, Sonny Rao, Yuchung Cheng, chrome-internal-fetch
Code-Review  +2: Paul Stewart
Commit-Queue 0 : Eric Dumazet, Paul Stewart, Yuchung Cheng, chrome-internal-fetch
Commit-Queue +1: Sonny Rao
Verified     0 : Eric Dumazet, Paul Stewart, Yuchung Cheng, chrome-internal-fetch
Verified     +1: Sonny Rao
Change-Id      : I6baa3a70403e520a817d455167486b8222eabf25
Reviewed-at    : https://chromium-review.googlesource.com/176846

UPSTREAM: net-tcp: fix panic in tcp_fastopen_cache_set()

We had some reports of crashes using TCP fastopen, and Dave Jones
gave a nice stack trace pointing to the error.

The issue is that tcp_get_metrics() should not be called with a NULL dst.

BUG= chromium:271766 
TEST=boot, log in, enable TCP fast open in flags, restart browser
go to google.com and other sites, no crashes

(accepted upstream http://patchwork.ozlabs.org/patch/291068/ )
Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>

M  net/ipv4/tcp_metrics.c
Project Member

Comment 7 by bugdroid1@chromium.org, Dec 14 2013

Project: chromiumos/third_party/kernel-next
Branch : chromeos-3.10
Author : Eric Dumazet <edumazet@google.com>
Commit : 7865f566cfbe71803235c3a8020121ab5755ce07

Code-Review  0 : Benson Leung, Sonny Rao, chrome-internal-fetch
Code-Review  +2: Paul Stewart
Commit-Queue 0 : Paul Stewart, Sonny Rao, chrome-internal-fetch
Commit-Queue +1: Benson Leung
Verified     0 : Paul Stewart, Sonny Rao, chrome-internal-fetch
Verified     +1: Benson Leung
Change-Id      : I6d0b622341e28c03167ce3b4b582d0a03fa75eef
Reviewed-at    : https://chromium-review.googlesource.com/178652

UPSTREAM: net-tcp: fix panic in tcp_fastopen_cache_set()

We had some reports of crashes using TCP fastopen, and Dave Jones
gave a nice stack trace pointing to the error.

The issue is that tcp_get_metrics() should not be called with a NULL dst.

BUG= chromium:271766 
TEST=boot, log in, enable TCP fast open in flags, restart browser
go to google.com and other sites, no crashes

(accepted upstream

net/ipv4/tcp_metrics.c
Project Member

Comment 8 by bugdroid1@chromium.org, Dec 14 2013

Project: chromiumos/third_party/kernel-next
Branch : chromeos-3.10
Author : Yuchung Cheng <ycheng@google.com>
Commit : 2207f189b0e5121249b0e78c5585869314a7d7d6

Code-Review  0 : Benson Leung, Sonny Rao, Yuchung Cheng, chrome-internal-fetch
Code-Review  +2: Paul Stewart
Commit-Queue 0 : Paul Stewart, Sonny Rao, Yuchung Cheng, chrome-internal-fetch
Commit-Queue +1: Benson Leung
Verified     0 : Paul Stewart, Sonny Rao, Yuchung Cheng, chrome-internal-fetch
Verified     +1: Benson Leung
Change-Id      : I537eb5c758bee5bda720d9381e0f8fae340c2e8f
Reviewed-at    : https://chromium-review.googlesource.com/178651

UPSTREAM: tcp: temporarily disable Fast Open on SYN timeout

Fast Open currently has a fallback feature to address SYN-data being
dropped, but it requires the middle-box to pass on the regular SYN
retry after SYN-data. This is implemented in commit aab487435
("net-tcp: Fast Open client - detecting SYN-data drops").

However, some NAT boxes drop all subsequent packets after the first
SYN-data and black-hole the entire connection. An example is in
commit 356d7d8 "netfilter: nf_conntrack: fix tcp_in_window for Fast
Open".

The sender should note such incidents and temporarily fall back to the
regular TCP handshake on subsequent attempts as well: after the second
SYN times out, the original Fast Open SYN is most likely lost. When
such an event recurs, Fast Open is disabled for a period that grows
exponentially with the number of recurrences.

BUG= chromium:271766 
TEST=boot, log in, enable TCP fast open in flags, restart browser
go to google.com and other sites, no crashes

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit c968601d174739cb1e7100c95e0eb3d2f7e91bc9)
from

net/ipv4/tcp_metrics.c
net/ipv4/tcp_timer.c