Chrome allows Content-Encoding: gzip to apply to subset of response body
Issue description (Apologies if dupe; I couldn't find another instance)

Version: Up to latest (56.0.2887.0)

Repro: Visit https://bayden.com/test/partialgzip.aspx, a page which specifies Content-Encoding: gzip but applies compression only to the first half of the body.

Observe: Uncompressed data after the compressed content is appended to the document body.
Observe: Other browsers terminate the body after the compressed content ends.

Thought: This could be used to circumvent content-filtering proxies and the like.

I suspect that this is due to the following code: https://cs.chromium.org/chromium/src/net/filter/gzip_filter.cc?q=extra+file:%5Esrc/net/filter/&sq=package:chromium&l=77&dr=C

  if (decoding_status_ == DECODING_DONE) {
    if (GZIP_GET_INVALID_HEADER != gzip_header_status_)
      SkipGZipFooter();
    // Some server might send extra data after the gzip footer. We just copy
    // them out. Mozilla does this too.
    return CopyOut(dest_buffer, dest_len);
  }

Despite the comment, Firefox does not appear to use the data -- it just reads the extra bytes and ignores them.
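For illustration, the boundary the filter sees can be reproduced with Python's zlib module (a sketch; the payload bytes here are made up and differ from the actual test page's content). The gzip stream ends at its footer, and anything after it shows up as unused_data -- the bug is that Chrome's filter copies those leftover bytes into the document body:

```python
import gzip
import zlib

# Hypothetical body mimicking the repro page: only the first half is gzipped.
compressed_half = gzip.compress(b"first half, gzipped")
body = compressed_half + b"second half, plain"

# Decode as a gzip response body. wbits = MAX_WBITS | 16 selects gzip framing.
d = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
out = d.decompress(body)

# The gzip stream ends at its footer; the trailing plain bytes are left over.
print(out)            # b'first half, gzipped'
print(d.eof)          # True
print(d.unused_data)  # b'second half, plain'
```

Other browsers stop the body at d.eof; the quoted gzip_filter.cc code instead copies unused_data out to the caller.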
Oct 12 2016
I am marking this as Available and leaving this as a P3. Please adjust if needed. I ported this behavior to the new Gzip filter implementation, gzip_source_stream.cc and wrote a test for this behavior in GzipSourceStreamTest.PassThroughAfterEOF. Since the comment that "Mozilla does this too" is no longer valid, we should either fix the comment or the implementation.
Oct 12 2016
Oops, misread the description, this isn't for range requests, this is just for the case where only the first part of a response is gzipped.
Oct 12 2016
Hrm... Seems like we have a couple of exciting options here, if we want to fix this:

1) Read and ignore the data (this would put the extra data in the cache, which is kinda weird).
2) Don't read the data and consider this a success (this would have the result of the cache entry being considered truncated, which is kinda weird).
3) 2), but work around the cache issue (there's a magic method to do this, for the redirect-with-body case, which I think would work here).
4) Do 2) or 3), but ensure we don't reuse the connection for HTTP/1.x, which gets tricky.
5) Return the entire body, but then return a net::Error.

1-4) would presumably be basically compatible with Firefox's behavior; 5) would not be. Downloads would fail in this case, and I'm not sure what would happen with webpages.
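The two "quick" directions (1, drop the extra bytes; 5, fail the load) can be sketched with zlib standing in for the filter -- a rough illustration only, with an invented error string rather than a real net::Error code:

```python
import gzip
import zlib

def decode_gzip_body(body: bytes, strict: bool) -> bytes:
    # strict=False ~ option 1: read any bytes after the gzip footer, ignore them.
    # strict=True  ~ option 5: surface a net::Error-style failure instead.
    d = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    out = d.decompress(body)
    if strict and d.unused_data:
        raise ValueError("content decoding failed: data after gzip footer")
    return out

body = gzip.compress(b"payload") + b"trailing junk"
print(decode_gzip_body(body, strict=False))  # b'payload' (junk dropped)
try:
    decode_gzip_body(body, strict=True)
except ValueError as e:
    print("error:", e)
```

Under option 1 the extra bytes are consumed and discarded (so they'd land in the cache entry); under option 5 the whole load fails, which is where the download/webpage compatibility question comes in.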
Oct 12 2016
Another option would be to consider it an error over some connections (QUIC, HTTP2, maybe HTTPS2?), but not others (HTTP/1.x?)
Oct 12 2016
My basic response is that I'd hate to mess up the code to handle this case, so I want to know whether we can get away with doing something quick&dirty (in either the success direction (1), or the failure direction (5)). Which probably requires some data as to how often this happens.
Oct 12 2016
Note that 1 and 2 have the same complexity (I agree that 3 and 4 aren't worth it). I'd really like to go with 5), but I have a feeling we'll get ~2-4 complaints from web developers if we do, which is why I added the less aggressive option (let's call it 6, in a fit of inspiration).
Oct 12 2016
Hrm... Thinking on this a bit more, I wonder in which cases Chrome's and Firefox's behaviors would both "work" for a broken server. For HTML/CSS/JS, I guess as long as the extra data is just whitespace, they'd be fine. If it's a binary download... I'm honestly not sure. Just thinking about this because the less compatible our behavior is with Firefox's, the less likely it is that 5) will cause problems. I think you're right that we really should start with a histogram, with maybe a special case where it's just whitespace at the end.
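That histogram special case might bucket the trailing bytes roughly like this (a sketch; the bucket names are invented, not real UMA histogram values):

```python
def classify_trailing_bytes(extra: bytes) -> str:
    # Hypothetical histogram buckets for bytes found after the gzip footer.
    if not extra:
        return "none"
    if not extra.strip(b" \t\r\n"):
        return "whitespace_only"  # likely harmless for HTML/CSS/JS
    return "other"

print(classify_trailing_bytes(b""))        # none
print(classify_trailing_bytes(b"\r\n  "))  # whitespace_only
print(classify_trailing_bytes(b"junk"))    # other
```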
Oct 12 2016
Yeah. I'm wary of getting into a can-kicking situation ("More data!" is always valuable :-}) but I bifurcate pretty strongly on how to solve this problem depending on how often it occurs.
(Not where we're going at the moment, but in terms of 1 vs. 2, I think wasting space in the cache is better than leaving the entry in a strange state, if the problem doesn't happen very often. Agreed the complexity of the code is the same.)
Oct 12 2016
Truncated entries aren't exactly a strange state, even if the entry is nominally complete - there are other expected ways to have entries in this state, so I don't think 2) is so bad. I'd really like to just be able to decide the socket is borked and terminate the connection if we go with 1) or 2) (seems like we could be seeing the start of the next HTTP response, for instance), but I believe we can't currently do that from a layer above the HttpNetworkTransaction.
Oct 13 2017
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. If you change it back, also remove the "Hotlist-Recharge-Cold" label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Feb 16 2018
Comment 1 by mmenke@chromium.org, Oct 11 2016
Components: -Internals>Network>HTTP Internals>Network>Filters Internals>Network>Cache