A lot of errors net::ERR_CONNECTION_TIMED_OUT when accessed to local resource http://pre-test-online.sbis.ru/
Reported by
mikhail....@gmail.com,
Apr 18 2016
Issue description

UserAgent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.7 Safari/537.36

Example URL:

Steps to reproduce the problem:
1.
2.
3.

What is the expected behavior?

What went wrong?
A lot of errors net::ERR_CONNECTION_TIMED_OUT when accessed to local resource http://pre-test-online.sbis.ru/

Did this work before? N/A

Chrome version: 51.0.2704.7 Channel: dev
OS Version: 6.1 (Windows 7, Windows Server 2008 R2)
Flash Version: Shockwave Flash 21.0 r0
,
Apr 18 2016
Looks like the timeouts are caused by OS error 10060 (WSAETIMEDOUT). This could be caused either by us connecting and not sending any request, or by the server hanging. +rdsmith@ for socket pool debugging.
,
Apr 20 2016
I believe the timeout is in the OS connect call; specifically, I think what's happening is that we do an async dispatch of a connect event in TCPSocketWin::DoConnect(), and when that eventually completes, it completes in error (looking at the Begin/EndEvents for TCP_CONNECT_ATTEMPT).

I don't think this could be caused by us connecting and not sending a request (we don't complete the OS-level connection), and I don't think it could be caused by the server hanging (unless it's hanging in the kernel around completing the initial TCP connection handshake). It almost has to be an inability of the local kernel to make a connection to the remote kernel, but without being sure that it couldn't do so (otherwise it would be a "host unreachable" error).

Any chance you could get a tcpdump or Wireshark trace and see what the actual network traffic looks like? That would help narrow the problem further, though at this point I don't really see how it could be in Chrome.

Ccing mmenke@ anyway, for socket pool expertise (though I don't think that's needed) and Windows expertise.

Attaching uncompressed net-internals files (since I had a problem uncompressing it, others might too). The specific example of this problem in the net-internals file I was looking at was the connect job at event 17649 and the matching socket at 17705.
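For readers who want to reproduce the distinction drawn above outside the browser, here is a minimal Python sketch that attempts a plain TCP connect and sorts the failure into "timed out", "refused", or "unreachable". It is not Chrome's code; the function names `classify_connect_error` and `try_connect` and the 5-second timeout are assumptions made for illustration. Only the numeric value 10060 (WSAETIMEDOUT) comes from the thread.

```python
import errno
import socket

# Winsock code quoted in the thread; errno.ETIMEDOUT is the POSIX counterpart.
WSAETIMEDOUT = 10060


def classify_connect_error(exc):
    """Map an exception raised by a TCP connect to a coarse category."""
    if isinstance(exc, socket.timeout):
        # The local timeout expired before the handshake completed.
        return "timed_out"
    code = getattr(exc, "errno", None)
    winerror = getattr(exc, "winerror", None)
    if code in (errno.ETIMEDOUT, WSAETIMEDOUT) or winerror == WSAETIMEDOUT:
        # The OS gave up retransmitting SYNs without hearing anything back.
        return "timed_out"
    if code == errno.ECONNREFUSED:
        # The remote kernel answered with a reset: host is up, port is closed.
        return "refused"
    if code in (errno.EHOSTUNREACH, errno.ENETUNREACH):
        # The network reported that no route to the host exists.
        return "unreachable"
    return "other"


def try_connect(host, port, timeout=5.0):
    """Attempt one TCP connection and report how it ended (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as exc:
        return classify_connect_error(exc)
```

Running `try_connect` repeatedly against the affected host and port would show whether connect timeouts also occur from a trivial client, which is what the comment above suspects.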
,
Apr 21 2016
On 20 Apr 2016 at 23:59, "rdsmith@chromium.org via Monorail" <monorail@chromium.org> wrote:
,
Apr 21 2016
>windump -w c:\capture2.cap -i 1 host 10.76.164.245
,
Apr 21 2016
Thank you for providing more feedback. Adding requester "rdsmith@chromium.org" for another review and adding "Needs-Review" label for tracking. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Apr 21 2016
Thanks for the packet capture. What is the IP address of pre-test-online.sbis.ru, just so I can scan for TCP connection attempts to it in the capture? (Fixing sheriffbot incorrect auto-assigns.)
,
Apr 22 2016
rdsmith: You can find the IP in the net-internals log. I do see a bunch of retransmitted SYNs to which we never get a response. Though in that time, we are still getting other responses from the server, which is odd. I'm not sure what to make of it. What's the server running? I wonder if it can only service so many sockets at a time or something weird like that.
,
Apr 22 2016
> What's the server running? Yes, I am sure.
,
Apr 29 2016
I believe #8 meant, what type of web server are you running?
,
May 12 2016
Nginx
,
May 12 2016
I am attaching a packet capture taken while loading this site in Firefox, so you can compare its behaviour with Chrome's.
,
May 12 2016
Thank you for providing more feedback. Adding requester "jkarlin@chromium.org" for another review and adding "Needs-Review" label for tracking. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jun 2 2016
Both the Chrome and Firefox logs show the same thing: they're often unable to establish connections. The browser sends 3 SYNs and never hears back (see, e.g., "tcp.stream eq 19" in the Chrome capture, and "tcp.stream eq 230" in the Firefox one). Firefox may be using the fact that some connection requests succeed to avoid failing requests, instead having them wait behind other requests that got sockets. I'm not convinced this is a common enough case for us to care about. Something is clearly borked at another layer there.
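The pattern described above (several SYNs on a stream and no reply) can be checked mechanically once packets are exported from a capture. The sketch below assumes a hypothetical record format of `(stream_id, flags)` pairs, where `flags` is a collection of TCP flag names; the function name `unanswered_syn_streams` and the sample records are illustrative and are not taken from the attached captures.

```python
from collections import defaultdict


def unanswered_syn_streams(packets, min_syns=3):
    """Return stream ids that sent at least `min_syns` bare SYNs and got no reply.

    `packets` is an iterable of (stream_id, flags) pairs; `flags` is any
    collection of TCP flag names such as {"SYN"} or {"SYN", "ACK"}.
    """
    syn_counts = defaultdict(int)
    answered = set()
    for stream_id, flags in packets:
        flags = set(flags)
        if flags == {"SYN"}:
            # Initial SYN or one of its retransmissions.
            syn_counts[stream_id] += 1
        elif {"SYN", "ACK"} <= flags or "RST" in flags:
            # Either reply proves the remote kernel saw the SYN.
            answered.add(stream_id)
    return sorted(
        stream_id
        for stream_id, count in syn_counts.items()
        if count >= min_syns and stream_id not in answered
    )


# Synthetic records: stream 1 never gets a reply, stream 2 completes the handshake.
sample = [(1, {"SYN"}), (1, {"SYN"}), (1, {"SYN"}),
          (2, {"SYN"}), (2, {"SYN", "ACK"}), (2, {"ACK"})]
stalled = unanswered_syn_streams(sample)
```

With the synthetic sample, only stream 1 is reported, since stream 2 received a SYN-ACK.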
,
Jun 17 2016
Decreasing priority based on identified packet loss. It's WAI for more timeouts to result from more packet loss.
,
Jun 17 2016
I'm going to go ahead and WontFix this, though if we notice similar situations occur with some frequency (say, in emerging markets, on oversaturated cell connections), we may want to investigate further. If we wanted to somewhat improve the case, we could just not fail socket requests if we have live connections to a server, and/or wait for all socket requests for a group to fail before failing requests. That would have a cost in terms of making failures slower, in some cases. We'd have to think carefully about tradeoffs there.
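To make the two alternatives floated above concrete, here is a small Python sketch of the decision they imply. It is a toy model, not Chrome's socket pool: `GroupState`, `should_fail_request`, and the policy names `fail_fast`, `keep_if_live`, and `wait_for_all` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Hypothetical snapshot of one socket group (all sockets to one server)."""
    live_connections: int   # sockets currently connected to the server
    pending_connects: int   # connect attempts still in flight


def should_fail_request(group, policy="fail_fast"):
    """Decide whether a request whose connect attempt just timed out fails now."""
    if policy == "fail_fast":
        # Fail as soon as the request's own connect attempt fails.
        return True
    if policy == "keep_if_live":
        # Keep waiting while any connection to the server is alive,
        # so the request can reuse a socket once one frees up.
        return group.live_connections == 0
    if policy == "wait_for_all":
        # Fail only once every connect attempt in the group has failed.
        return group.pending_connects == 0
    raise ValueError(f"unknown policy: {policy}")
```

Under either alternative policy a request that would have failed immediately can instead wait, which is the slower-failure cost mentioned above.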
Comment 1 by mikhail....@gmail.com
, Apr 18 2016