Issue 604361

Starred by 2 users

Issue metadata

Status: WontFix
Owner: ----
Closed: Jun 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Many net::ERR_CONNECTION_TIMED_OUT errors when accessing the local resource http://pre-test-online.sbis.ru/

Reported by mikhail....@gmail.com, Apr 18 2016

Issue description

UserAgent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.7 Safari/537.36

Example URL:

Steps to reproduce the problem:
1. 
2. 
3. 

What is the expected behavior?

What went wrong?
Many net::ERR_CONNECTION_TIMED_OUT errors when accessing the local resource http://pre-test-online.sbis.ru/

Did this work before? N/A 

Chrome version: 51.0.2704.7  Channel: dev
OS Version: 6.1 (Windows 7, Windows Server 2008 R2)
Flash Version: Shockwave Flash 21.0 r0
 
net-internals-log.7z
611 KB Download
In Firefox 48.0a1 I didn't see this problem.
Cc: rdsmith@chromium.org
Looks like the timeouts are caused by an os error of 10060 (WSAETIMEDOUT). This could either be caused by us connecting and not sending any request, or the server hanging.

+rdsmith@ for socket pool debugging. 
Cc: mmenke@chromium.org
Labels: Needs-Feedback
I believe the timeout is in the OS connect call. Specifically, I think we asynchronously dispatch a connect in TCPSocketWin::DoConnect(), and when that eventually completes, it completes with an error (judging by the Begin/End events for TCP_CONNECT_ATTEMPT). I don't think this could be caused by us connecting and not sending a request (we never complete the OS-level connection), and I don't think it could be caused by the server hanging (unless it's hanging in the kernel while completing the initial TCP connection handshake). It almost has to be that the local kernel is unable to connect to the remote kernel, but without being sure that it can't (otherwise it would be a "host unreachable" error).

Any chance you could get a tcpdump or wireshark trace and see what the actual network traffic looks like?  That would help narrow the problem further, though at this point I don't really see how it could be in Chrome.

Ccing mmenke@ anyway, for socket pool expertise (though I don't think that's needed) and Windows expertise.  

Attaching the uncompressed net-internals file (since I had a problem uncompressing it, others might too). The specific example of this problem in the net-internals file I was looking at was the connect job at event 17649 and the matching socket at 17705.

net-internals-log-locla.json
12.6 MB View Download
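
To illustrate the distinction drawn above between a connect that times out and one that is refused or reported unreachable, here is a minimal Python sketch. It is not Chrome's code: `classify_connect_error` and `try_connect` are hypothetical helper names, and the only fact taken from this thread is that the OS error was 10060 (WSAETIMEDOUT). The other two Winsock codes are standard values included for contrast.

```python
import errno
import socket

# Winsock error codes; POSIX systems report the errno equivalents instead.
WSAETIMEDOUT = 10060      # connect attempt got no response in time
WSAECONNREFUSED = 10061   # remote host actively refused the connection
WSAEHOSTUNREACH = 10065   # no route to the remote host


def classify_connect_error(code):
    """Map an OS error code from connect() to a coarse category."""
    if code in (WSAETIMEDOUT, errno.ETIMEDOUT):
        return "timed out"
    if code in (WSAECONNREFUSED, errno.ECONNREFUSED):
        return "refused"
    if code in (WSAEHOSTUNREACH, errno.EHOSTUNREACH):
        return "host unreachable"
    return "other"


def try_connect(host, port, timeout=30.0):
    """Attempt a TCP connection; return 'connected' or an error category."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # The Python-level timeout expired before the OS gave up.
        return "timed out"
    except OSError as exc:
        return classify_connect_error(exc.errno)
```

A "timed out" result here means no SYN-ACK came back at all, which is why it points at the network path or the remote kernel rather than at the application on either end.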

Comment 4 by ssdd98...@gmail.com, Apr 21 2016

indonesia
On 20 Apr 2016 23:59, "rdsmith@chromium.org via Monorail" <monorail@chromium.org> wrote:
>windump -w c:\capture2.cap -i 1 host 10.76.164.245
chrome_screen.png
300 KB View Download
capture2.cap
1.7 MB Download
Project Member

Comment 6 by sheriffbot@chromium.org, Apr 21 2016

Labels: -Needs-Feedback Needs-Review
Owner: rdsmith@chromium.org
Thank you for providing more feedback. Adding requester "rdsmith@chromium.org" for another review and adding "Needs-Review" label for tracking.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Needs-Review
Owner: ----
Thanks for the packet capture.  What is the IP address of pre-test-online.sbis.ru, just so I can scan for TCP connection attempts to it in the capture?

(Fixing sheriffbot's incorrect auto-assigns.)

rdsmith: You can find the IP in the net-internals log.

I do see a bunch of retransmitted SYNs which we never get a response from. Though in that time, we are still getting other responses from the server, which is odd. I'm not sure what to make of it.

What's the server running? I wonder if it can only service so many sockets at a time or something weird like that.
> What's the server running?
Yes, I am sure.
Labels: Needs-Feedback
I believe #8 meant, what type of web server are you running? 
Nginx
I'm attaching a packet capture taken while loading this site in Firefox, so you can compare its behaviour with Chrome's.
firefox.cap
2.4 MB Download
Project Member

Comment 13 by sheriffbot@chromium.org, May 12 2016

Labels: -Needs-Feedback Needs-Review
Owner: jkarlin@chromium.org
Thank you for providing more feedback. Adding requester "jkarlin@chromium.org" for another review and adding "Needs-Review" label for tracking.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Owner: ----
my.har
4.0 MB Download
Both the Chrome and Firefox captures show the same thing: the browsers are often unable to establish connections. The browser sends 3 SYNs and never hears back (see, e.g., "tcp.stream eq 19" in the Chrome capture and "tcp.stream eq 230" in the Firefox one). Firefox may be using the fact that some connection requests succeed to avoid failing requests, instead having them wait behind other requests that got sockets.

I'm not convinced this is a common enough case for us to care about.  Something is clearly borked at another layer there.
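
The pattern described above (SYNs retransmitted on a flow with no SYN-ACK ever coming back) can be found mechanically in a capture. The sketch below assumes the packets have already been parsed into plain dicts. That record format and the function name `find_unanswered_syns` are hypothetical, and the addresses in the usage example are documentation placeholders rather than values from the attached captures.

```python
from collections import defaultdict


def find_unanswered_syns(packets):
    """Return {flow: syn_count} for flows whose SYNs never got a SYN-ACK.

    Each packet is a dict with keys src, sport, dst, dport and flags,
    where flags is a set of strings such as {"SYN"} or {"SYN", "ACK"}.
    """
    syn_counts = defaultdict(int)
    answered = set()
    for pkt in packets:
        flags = pkt["flags"]
        if "SYN" in flags and "ACK" not in flags:
            # Client SYN; retransmissions reuse the same 4-tuple.
            flow = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])
            syn_counts[flow] += 1
        elif "SYN" in flags and "ACK" in flags:
            # A SYN-ACK travels in the reverse direction, so flip the tuple.
            answered.add((pkt["dst"], pkt["dport"], pkt["src"], pkt["sport"]))
    return {flow: n for flow, n in syn_counts.items() if flow not in answered}


# Usage with made-up packets: one flow retries three times, one succeeds.
example = [
    {"src": "192.0.2.1", "sport": 50000, "dst": "192.0.2.10", "dport": 80, "flags": {"SYN"}},
    {"src": "192.0.2.1", "sport": 50000, "dst": "192.0.2.10", "dport": 80, "flags": {"SYN"}},
    {"src": "192.0.2.1", "sport": 50000, "dst": "192.0.2.10", "dport": 80, "flags": {"SYN"}},
    {"src": "192.0.2.1", "sport": 50001, "dst": "192.0.2.10", "dport": 80, "flags": {"SYN"}},
    {"src": "192.0.2.10", "sport": 80, "dst": "192.0.2.1", "dport": 50001, "flags": {"SYN", "ACK"}},
]
unanswered = find_unanswered_syns(example)
```

A flow that shows up here with a count of 3 matches the "sends 3 SYNs and never hears back" case, while flows that did get a SYN-ACK are filtered out.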
Labels: -Pri-2 Pri-3
Decreasing priority based on the identified packet loss. It's working as intended (WAI) for more packet loss to result in more timeouts.
Cc: bmcquade@chromium.org
Status: WontFix (was: Unconfirmed)
I'm going to go ahead and WontFix this, though if we notice similar situations occurring with some frequency (say, in emerging markets, on oversaturated cell connections), we may want to investigate further.

If we wanted to somewhat improve the case, we could just not fail socket requests if we have live connections to a server, and/or wait for all socket requests for a group to fail before failing requests.

That would have a cost in terms of making failures slower, in some cases.  We'd have to think carefully about tradeoffs there.
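
As a rough illustration of the idea floated above, the following Python sketch combines both suggestions into one decision rule. It is not Chromium's socket pool: `GroupState` and `should_fail_request` are hypothetical names, and the three counters are an assumed simplification of whatever state a real pool tracks.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Hypothetical per-server socket group state."""
    live_connections: int   # established sockets to this server
    pending_connects: int   # connect attempts still in flight
    failed_connects: int    # connect attempts that timed out or errored


def should_fail_request(group):
    """Decide whether a request waiting on a failed connect should fail now."""
    if group.live_connections > 0:
        # A usable socket exists: queue behind it instead of failing.
        return False
    if group.pending_connects > 0:
        # Another attempt may still succeed: wait for the whole group.
        return False
    # Nothing live and nothing pending: fail only if an attempt did fail.
    return group.failed_connects > 0
```

The tradeoff mentioned above is visible in the second branch: a request whose own connect has already failed keeps waiting for as long as any other attempt in the group is still pending, so a genuinely unreachable server is reported later than it is today.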