Issue 717083

Starred by 2 users

Issue metadata

Status: WontFix
Owner: ----
Closed: May 2017
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug-Regression



Chromium prematurely kills http2 connection

Reported by kkoopora...@gmail.com, May 1 2017

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/603.2.4 (KHTML, like Gecko) Version/10.1.1 Safari/603.2.4

Example URL:
https://psychonautwiki.org/wiki/Main_Page

Steps to reproduce the problem:
1. Open URL
2. Failure

What is the expected behavior?
Should load the page

What went wrong?
We have detected more than 841 premature http2 disconnects (starting 36 hours ago) from clients whose user-agent is newer than current Canary. In each case the request had already been sent successfully.

Please see the network dump -- the connection seems to close after the request is sent by the browser.

Did this work before? Yes 60.0.3083.0 (Canary), also all other current Chrome deployments (dev, beta, et al.)

Chrome version: 60.0.3086.0 (Chromium)  Channel: n/a
OS Version: OS X 10.12.5
Flash Version: 

I am aware this is an issue about Chromium, but our users are highly paranoid, and even though Chromium users make up only a small part of our userbase, I would like to investigate this issue. Furthermore, I observed this issue before on a Chrome Canary version early last month, but it was fixed after a few releases. I am writing to check whether this is a regression introduced by accident or the result of stricter enforcement of http2 (or similar).
 
Components: -Internals>Network Internals>Network>HTTP2
Components: Internals>Network>SSL
Looks like we're receiving an SSL3_RT_ALERT and then the server hangs up on us:

t=135 [st=131]        SSL_ALERT_RECEIVED
                      --> hex_encoded_bytes =
                        01 00                                              . 
t=135 [st=131]        SSL_SOCKET_BYTES_RECEIVED
                      --> byte_count = 0

Tentatively adding SSL label.
Oops, the server doesn't hang up on us separately - the second event is what we decode from the alert, so I assume the 0 bytes means we interpret the alert as a 0-byte read (i.e., a socket close event).
Correct. The server is sending the alert - 01 = warning level, 00 = close_notify.

The server is hanging up on us. Sounds like a server issue.
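The two bytes in the SSL_ALERT_RECEIVED entry above can be decoded by hand. The following is a minimal Python sketch, not Chromium's code: the names `decode_alert` and `is_clean_close` are hypothetical, and the table lists only a few of the alert descriptions defined in the TLS RFCs. The first byte of an alert is its level and the second is its description.

```python
from typing import Tuple

# Alert levels and a small subset of alert descriptions from the TLS RFCs.
ALERT_LEVELS = {1: "warning", 2: "fatal"}
ALERT_DESCRIPTIONS = {
    0: "close_notify",
    10: "unexpected_message",
    20: "bad_record_mac",
    40: "handshake_failure",
    50: "decode_error",
    51: "decrypt_error",
    70: "protocol_version",
    80: "internal_error",
}


def decode_alert(payload: bytes) -> Tuple[str, str]:
    """Return (level, description) names for a two-byte TLS alert payload."""
    if len(payload) != 2:
        raise ValueError("a TLS alert payload is exactly two bytes")
    level, description = payload  # iterating bytes yields integers
    return (
        ALERT_LEVELS.get(level, f"unknown({level})"),
        ALERT_DESCRIPTIONS.get(description, f"unknown({description})"),
    )


def is_clean_close(payload: bytes) -> bool:
    """True if the alert is close_notify, i.e. an orderly shutdown by the peer."""
    return decode_alert(payload)[1] == "close_notify"


if __name__ == "__main__":
    # The bytes shown in the net-internals log for this issue.
    print(decode_alert(bytes.fromhex("01 00")))
```

An alert such as bad_record_mac (20) would decode differently and would point at a record-level failure rather than an orderly close.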
Sending that alert is hanging up on us. That's how you cleanly close a TLS connection. :-P

That Chromium build looks like it is enabling all experiments, which is a developer configuration, not an end-user one. Looks like the server is speaking TLS 1.3. On the client side, that's being experimented with, so that would explain why you observed it on and off.

However, the net-internals log shows it is a server problem. If you have TLS 1.3 support, either your server software vendor, or your host, is deploying some new stuff. I'd suggest talking to them and seeing if they have a bug in their TLS 1.3 code.
Status: WontFix (was: Unconfirmed)
I understand and will raise this issue with the developer. Our deployment uses a dockerized H2O server for http2 termination. Interestingly, the same (identical) software is used for https://r1.apx.pub and https://sly.mn (which work) - also, the page works in Canary, Dev, Beta and Stable, which is the reason I raised this issue at all.

Whatever made this specific server response a hard failure -- is that a new revision or why does it fail now? The only thing I'd like to know is whether this is a change that makes Chrome more strict or if this is an intentional change in any way.

-- Kenan
We're just getting a close_notify alert from the server. That means they're shutting the connection off on us, and it has meant that since before Chrome existed. There's no recent change there.

If you run once with --ssl-version-max=tls1.2 and once with --ssl-version-max=tls1.3, does that change whether you can reproduce the issue? That might pinpoint where on the server end (http2 or TLS bits) to look.
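The same comparison can be approximated outside the browser. This is a rough Python sketch using the standard `ssl` module, not a reproduction of Chrome's behaviour: `make_context` and `probe` are hypothetical helper names, the probe speaks plain HTTP/1.1 (no ALPN, so no http2), and a current Python/OpenSSL negotiates final TLS 1.3 rather than the draft versions that were being experimented with at the time of this report.

```python
import socket
import ssl


def make_context(max_version: ssl.TLSVersion) -> ssl.SSLContext:
    """Build a client context capped at the given TLS version."""
    ctx = ssl.create_default_context()
    ctx.maximum_version = max_version
    return ctx


def probe(host: str, max_version: ssl.TLSVersion,
          port: int = 443, timeout: float = 10.0):
    """Connect, send one request, and report what came back.

    Returns (negotiated_version, bytes_received). Zero bytes received
    means the peer closed the connection before answering.
    """
    ctx = make_context(max_version)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            negotiated = tls.version()
            request = (
                f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            )
            tls.sendall(request.encode("ascii"))
            first_bytes = tls.recv(1024)
    return negotiated, len(first_bytes)


# Example (requires network access), comparing both caps against one host:
#   probe("psychonautwiki.org", ssl.TLSVersion.TLSv1_2)
#   probe("psychonautwiki.org", ssl.TLSVersion.TLSv1_3)
```

If the connection only closes early under the TLS 1.3 cap, that points at the server's TLS 1.3 path rather than its http2 handling.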
Great, pinning the maximum version to 1.2 allows the page to load. However, enabling 1.3 in Canary also makes the page load - and the network traffic indeed indicates that TLS 1.3 is being used. Was there a change in the TLS 1.3 code that handles the handshake and related steps?
Again, this is a problem with your server. The net-internals log is very clear. We are getting a clean shutdown from your server after the handshake. TLS 1.3 is a brand-new experimental revision of TLS. h2o appears to have an implementation of it that shuts the connection down on us. You should talk to the h2o developers.

But, yes, it appears to be a problem with h2o's experimental TLS 1.3 implementation.
After an update to Chromium 60.0.3102.0 this issue has vanished without any change to the h2o codebase. It seems that this was indeed an issue with Chromium, and "WontFix" magically turned into "Actually there was an issue, let's fix it". Although I appreciate your efforts, I strongly dislike that h2o was immediately held responsible for the failure. We spent several days auditing h2o and could not identify any disparity between the specification and its TLS 1.3 implementation.

Acknowledging this issue would have saved us a lot of time.

Nevertheless, we have observed a drop in Chrome users with connection failures from over 800 per day to around 150 per day. (This number is decreasing slowly as users update to newer versions of Chromium.)

-- Kenan
TLS 1.3 is still experimental and being deployed on and off at various percentages as we try to deal with the ecosystem issues. We did not make any recent changes to our TLS 1.3 implementation and, again, the net-internals log is quite conclusive.

There is no situation where your server would correctly send a close_notify alert in response to a problem in the client TLS stack (and we had already established it was TLS-related and not HTTP2-related). It would have sent another alert had we, say, failed to encrypt a record properly. close_notify means clean and successful connection close.
