New issue
Advanced search Search tips

Issue 700618 link

Starred by 4 users

Issue metadata

Status: Archived
Owner: ----
Closed: Mar 2018
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug



Sign in to add a comment

Cookie parsing terminates at the first semicolon, ignoring quotes

Reported by mschwe...@salesforce.com, Mar 11 2017

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36

Steps to reproduce the problem:
(using backticks to quote since this is a problem with quote characters.)

1. Web server sends `Set-Cookie: foo="bar;baz";Version=1;Path="/secur"`
2. Browser stores that cookie (visible in the interface via Cookies and Site Data) as `"bar`
3. Note the lack of an end-quote.
4. Browser sends Cookie: foo="bar; othercookies; more; etc;
5. Server recognizes a quoted string and creates a big cookie `foo` whose value is `"bar; othercookies; more; etc`
6. Session cookies, and more, are lost if they are later in that Cookie: line.

(Note: This might be #notallservers but it does appear that Jetty, for certain, will treat that Cookie: header as a big quoted string.  I have not tested other web servers, but I believe the core problem is in the browser, because it's losing data.)

What is the expected behavior?
The browser should store the entire quoted string as a unit, including the semicolon.  It should be visible in Cookies and Site Data as
`"bar;baz"` and that is what should be sent to the web server.

What went wrong?
Only part of the cookie was sent to the server, and that part did not include the close-quotes.

Did this work before? N/A 

Chrome version: 56.0.2924.87  Channel: n/a
OS Version: OS X 10.12.2
Flash Version: Shockwave Flash 24.0 r0

There is a related bug on WebKit from 2008 (https://bugs.webkit.org/show_bug.cgi?id=16699) but my understanding is that Chromium is no longer WebKit-based, so...whether this is a 'regression' or not is unclear.  It may never have worked with Blink. ¯\_(ツ)_/¯
 

Comment 1 by woxxom@gmail.com, Mar 11 2017

This is working correctly per specification. The doublequotes are simply optional decorations that don't modify the restriction on allowed characters. The server in question is violating the spec and must switch to using URL-encoding to encode invalid characters.

Full specification:  https://tools.ietf.org/html/rfc6265#section-4.1

	cookie-name    = token
	cookie-value   = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )

	cookie-octet   = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
	                    ; US-ASCII characters excluding CTLs,
	                    ; whitespace DQUOTE, comma, semicolon,
	                    ; and backslash
	                       
	token          = 1*<any CHAR except CTLs or separators>
	separators     = "(" | ")" | "<" | ">" | "@"
	              | "," | ";" | ":" | "\" | <">
	              | "/" | "[" | "]" | "?" | "="
	              | "{" | "}" | SP | HT

Simplified spec: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie

Comment 2 by woxxom@gmail.com, Mar 11 2017

On the other hand, depending on the amount of similarly broken servers in the wild, it's worth considering if Chrome should be less strict in following the spec and instead try to handle such malformed cookie headers by treating the quoted string as a whole.

Comment 3 by rsesek@chromium.org, Mar 13 2017

Components: Internals>Network>Cookies
Labels: TE-NeedsTriageHelp
This issue seems to be out of TE-scope. Hence, marking label TE-NeedsTriageHelp for further investigation.

Thanks...!!

Comment 5 by mmenke@chromium.org, Mar 14 2017

I don't think we should make our cookie parsing more permissive - history has shown this is much more likely to add to cross-browser compatibility issues than stricter adherence to spec, and the current code doesn't seem to be running into issues with most sites.
Some history ...

RFC2109 (now obsolete) introduced the logic for name="quoted value" in Version=1 cookies
RFC2965 (also obsolete) kept the logic for name="quoted;value" allowing the quotes to enclose any US-ASCII (non-ctrl) values.
RFC6265 is the first time the spec specifically removes support for this quoted string logic.

Java 8u121 and java.net.HttpCookie only supports up to RFC2965
Servlet 3.1 also only supports up to RFC2965

This is *really* wide spread.  

Note: this Version=1 Set-Cookie behavior also true on Google App Engine.

Comment 7 by mmenke@chromium.org, Mar 14 2017

Worth noting that the "simplified" Mozilla spec woxxom linked to allows semi-colons.  Anyhow, if this isn't a recent regression (Which I don't think it is?) seems best to keep current behavior, to align with the newer spec.

Comment 8 by woxxom@gmail.com, Mar 14 2017

#7 >Mozilla spec [...] allows semi-colons

Does it? It's awkwardly phrased but I read it the same as RFC6265:

>A <cookie-value> can optionally be set in double quotes and any US-ASCII characters excluding CTLs, whitespace, double quotes, comma, semicolon, and backslash are allowed. Encoding: Many implementations perform URL encoding on cookie values, however it is not required per the RFC specification. It does help satisfying the requirements about which characters are allowed for <cookie-value> though.

In short, regardless of being doublequoted any US-ASCII characters excluding [...] semicolons [...] are allowed.
Also, the part "It does help satisfying the requirements" refers to URL-encoding clause, not to doublequotes.

Comment 9 by gr...@webtide.com, Mar 14 2017

It is rather a significant deficiency in RFC6265 that it provides no standard mechanism for encoding excluded characters and effectively turns the previously accepted quoting mechanism into a noop.  Thus servers are faces with a choice of breaking existing applications or adhering to RFC2965 rather than adopting that latest version.

However, the jetty project does accept that strict adherence to specifications (even deficient ones) in the output generated is a good policy.  But it can also be good policy to be more flexible in what is parsed, so the jetty project will continue to treat a received quoted ; as per the RFC2965 definition.   See https://github.com/eclipse/jetty.project/issues/1396


Note that for Chrome, it could also be more flexible in parsing something like:

  Set-Cookie: foo="bar;baz";Version=1;Path="/secur"

As terminating the value at the first ; is also not compliant with 

  cookie-value      = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
  cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                       ; US-ASCII characters excluding CTLs,
                       ; whitespace DQUOTE, comma, semicolon,
                       ; and backslash

because the DQUOTE in the value provided is not balanced in the way that the RFC requires and it also cannot be part of the value.

So given a non compliant value string, the browser must choose it's behaviour:  It could just reject the entire header or response;   It could decide on a non-compliant value of "bar   ; or it could consider RFC2965 and be a bit flexible and use bar;baz as the value!

I don't see why the "bar interpretation is more valid that the bar;baz, as both are non compliant.  However the former breaks RFC2965 compliant servers.


So I think both client and server should be strict in what we generate and forgiving in what we parse.








Being forgiving in what we parse almost always leads to compatibility issues, and it can security issues as well.

While from a "This browser works in all cases other browsers works" standpoint, it seems like you'd want to be the most forgiving browser, but experience has shown that browser don't implement things consistently, which does a disservice to web developers.  Also, if you adhere more conservatively to the standard, things that work in your browser will work in other browsers, rather than people having to test each browser to see if their non-compliant code runs.

Accepting just a partial HTTP response with no terminating double CRLF, for instance, allowed malicious MitM proxies to remove the "; secure" from set-cookie responses sent over HTTPS, and thus they could be stolen over HTTP.  HTTP/0.9 leads to problems, etc.  Also worth noting that all browser behave differently in a lot of weird cookies cases - Safari handles "Set-Cookie: foo" differently from everyone else, for instance (Not to pick on Safari, there are plenty of other differences with non-ASCII characters and such).

In general, we should not make parsing any more lax than it already is, and should instead make it more stringently adhere to specs instead, when we reasonably can.
I'd be fine with rejecting invalid cookie header lines, but changing behavior and potentially breaking things that relied on the old behavior requires a lot more follow through in case too much breaks, which I don't have the time to spend on this.

Comment 12 by gr...@webtide.com, Mar 15 2017

I think there is a bit of a difference between being so flexible that you will accept almost anything (eg you example of a non terminated HTTP response), versus accepting something that was accepted by the prior RFC.

In this case, by accepting the value as "bar, you are not only accepting a value that does not comply with any RFC, but you are then sending it back to the server as in a way that also doesn't comply with any RFC.   So from the server point of view, we'd much rather you rejected it rather than send us an unbalanced quoted value that we have to deal with.

Totally understand that there is a lot of effort involved to flexibly except RFC2965 an that it may not be worth your while.

We'll upgrade to enforce RFC6265 in our generation and see how many applications notice/complain.



Project Member

Comment 13 by sheriffbot@chromium.org, Mar 15 2018

Status: Archived (was: Unconfirmed)
Issue has not been modified or commented on in the last 365 days, please re-open or file a new bug if this is still an issue.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Sign in to add a comment