Issue 820862 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Mac
Pri: 1
Type: Bug-Regression




Invalid response with too-large Content-Length causes resource cache corruption

Reported by j...@redradishtech.com, Mar 12 2018

Issue description

UserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36

Example URL:
https://jira.jboss.org/static/util/urls.js

Steps to reproduce the problem:
1. Visit URL https://jira.jboss.org/static/util/urls.js
2. Reload the URL (by hitting enter):  https://jira.jboss.org/static/util/urls.js
3. Notice binary corruption at the end of the now-cached response

What is the expected behavior?
No corruption.

What went wrong?
This bug is triggered by a misconfigured Tomcat server, which serves gzipped contents with a 'Content-Length' header set to the number of uncompressed bytes (so, larger than it should be). Chrome first behaves as expected, waiting for bytes that never come, then timing out. The problem shows up on subsequent hits: Chrome goes a bit haywire, issuing 1-byte range requests and appending binary junk of unknown origin to the cached response until the Content-Length is met.
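A minimal sketch of the server-side mismatch, assuming nothing about Tomcat's real code: the helper names `misconfigured_response` and `missing_bytes` are made up for illustration. It gzips a body but advertises the uncompressed length, then shows how many bytes the client would wait for in vain.

```python
import gzip


def misconfigured_response(body):
    """Build (headers, payload) the way the buggy server is described to behave.

    Hypothetical helper for illustration only.
    """
    payload = gzip.compress(body)
    headers = {
        "Content-Encoding": "gzip",
        # The bug: this is the uncompressed size, not the size of the payload sent.
        "Content-Length": str(len(body)),
    }
    return headers, payload


def missing_bytes(headers, payload):
    """Number of bytes the client expects but never receives."""
    return int(headers["Content-Length"]) - len(payload)


if __name__ == "__main__":
    body = b"var url = 'x';\n" * 40
    headers, payload = misconfigured_response(body)
    print(headers["Content-Length"], len(payload), missing_bytes(headers, payload))
```

With the numbers from this report (Content-Length 652, about 389 bytes delivered), the shortfall is 652 - 389 = 263 bytes.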

I use https://jira.jboss.org/static/util/urls.js as my sample URL, but any static resource on any public JIRA non-Cloud-hosted instance will do, e.g.:

https://jira.jboss.org/robots.txt
https://dnsprivacy.org/jira/static/util/strings.js
https://bugs.opera.com/static/util/strings.js
https://webapp.mis.vanderbilt.edu/jira/robots.txt
https://webapp.mis.vanderbilt.edu/jira/static/util/strings.js

Here are steps illustrating what goes wrong:

On the first hit of https://jira.jboss.org/static/util/urls.js Chrome gets told the Content-Length is 652 bytes:

Accept-Ranges:bytes
Content-Encoding:gzip
Content-Length:652
Content-Type:application/javascript;charset=UTF-8
Date:Mon, 12 Mar 2018 01:37:16 GMT
ETag:W/"652-1519645607000"

but only about 389 bytes (the gzipped size) are delivered. Chrome behaves sensibly, waiting for bytes that never come, then timing out after 20s. The response is cached with a RESPONSE_INFO_TRUNCATED flag, as can be seen at chrome://view-http-cache/https://jira.jboss.org/static/util/urls.js.

On the second hit, Chrome issues a weird Range:bytes=389-389 request, asking for just the single byte at offset 389 (389 is the gzipped size):

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding:gzip, deflate, br
Accept-Language:en-AU,en-GB;q=0.9,en-US;q=0.8,en;q=0.7
Cache-Control:max-age=0
Connection:keep-alive
Cookie:atlassian.xsrf.token=AQZJ-FV3A-N91S-UDEU|4e7d69abd30a7aaf76c191001f04edaa3b1989a6|lout; JSESSIONID=A6086FDAEFDEAE7358108D1B9D71EE8B
Host:jira.jboss.org
If-Range:W/"652-1519645607000"
Range:bytes=389-389
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36

to which the server duly responds with a '206 Partial Content' response of 1 byte:

Accept-Ranges:bytes
Content-Length:1
Content-Range:bytes 389-389/652
Content-Type:application/javascript;charset=UTF-8
Date:Mon, 12 Mar 2018 01:40:30 GMT
ETag:W/"652-1519645607000"
Last-Modified:Mon, 26 Feb 2018 11:46:47 GMT
X-AREQUESTID:1300x7634499x2
X-ASEN:SEN-1095081
X-AUSERNAME:anonymous
X-Content-Type-Options:nosniff

The returned resource now has 205 bytes of binary junk at the end, as seen at chrome://view-http-cache/https://jira.jboss.org/static/util/urls.js. The RESPONSE_INFO_TRUNCATED is still there.

On the third request we again get a weird 1-byte 'Range:bytes=594-594' request:

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding:gzip, deflate, br
Accept-Language:en-AU,en-GB;q=0.9,en-US;q=0.8,en;q=0.7
Cache-Control:max-age=0
Connection:keep-alive
Cookie:atlassian.xsrf.token=AQZJ-FV3A-N91S-UDEU|4e7d69abd30a7aaf76c191001f04edaa3b1989a6|lout; JSESSIONID=A6086FDAEFDEAE7358108D1B9D71EE8B
Host:jira.jboss.org
If-Range:W/"652-1519645607000"
Range:bytes=594-594
Upgrade-Insecure-Requests:1

The byte is served up, and Chrome adds another 58 bytes of junk to its cached response, and now clears the RESPONSE_INFO_TRUNCATED flag, presumably because the total cached bytes is now equal to the Content-Length (652).
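The byte counts above add up exactly, which a small replay makes easy to check. This is only arithmetic on the numbers in this report; the function name `replay` is made up for illustration.

```python
def replay(cached, appended, content_length):
    """Replay the reloads: each one probes at the current cache size, then grows the cache.

    Returns a list of (range_header, cached_bytes_after, still_truncated).
    """
    steps = []
    for extra in appended:
        probe = "bytes=%d-%d" % (cached, cached)
        cached += extra
        steps.append((probe, cached, cached < content_length))
    return steps


if __name__ == "__main__":
    # 389 bytes cached after the first hit; 205 and 58 bytes of junk appended afterwards.
    for step in replay(389, [205, 58], 652):
        print(step)
```

The second probe lands at 389 + 205 = 594, and 594 + 58 = 652 matches the advertised Content-Length, which is consistent with the truncated flag being cleared at that point.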

This is somewhat similar to bug #423318, but simpler to trigger. A Google search for RESPONSE_INFO_TRUNCATED shows other people reporting this 1-byte request behaviour (https://stackoverflow.com/questions/47311027/response-info-truncated-file-in-chrome-cache).

Did this work before? N/A 

Chrome version: 64.0.3282.167  Channel: stable
OS Version: 4.13.0-36-generic
Flash Version:
 
chrome_corrupt_cached_response.png (87.6 KB)
chrome-net-export-log.json.gz (142 KB)
Labels: Needs-Triage-M64
Components: -Internals>Network Internals>Network>Cache
Cc: morlovich@chromium.org

Comment 4 by mmenke@chromium.org, Mar 12 2018

Hrm...  We really should not be sending range requests for a gzipped response body - we don't support range requests with compressed responses, so it seems like a recipe for disaster.
Labels: Needs-Bisect
Labels: -Type-Bug -Pri-2 -Needs-Bisect Target-67 Triaged-ET RegressedIn-64 Target-66 M-67 FoundIn-66 FoundIn-67 Target-65 FoundIn-65 hasbisect OS-Mac OS-Windows Pri-1 Type-Bug-Regression
Owner: shivanisha@chromium.org
Status: Assigned (was: Unconfirmed)
Able to reproduce the issue on Windows 10, mac 10.13.3 and Ubuntu 14.04 using chrome reported version #64.0.3282.167 and latest canary #67.0.3370.0.

Bisect Information:
=====================
Good build: 64.0.3247.0
Bad Build : 64.0.3248.0

Change Log URL: 
https://chromium.googlesource.com/chromium/src/+log/62cddd7214b2139a72b0f8782b649e16a8bf375e..6fc01b37e06767e502ad74fb108a96ea38e714fc

From the above change log, we suspect the change below:
Change-Id: Idb0c0d7c5962de228c23ee5e6c5f24cbf758c383
Reviewed-on: https://chromium-review.googlesource.com/578172

shivanisha@ - Could you please check whether this is caused by your change? If not, please help us assign it to the right owner.

Thanks...!!
Not sure if the bisect is correct since the issue is not reproducible always at the second reload but sometimes after a non-deterministic number of reloads. The truncation logic hasn't changed much (except for refactoring) in the CL mentioned by the bisect and I do not see any specific handling for gzipped responses and that they should not be marked as truncated, even before the cache lock fixes. It's also possible that the recent cache lock fixes helped in surfacing the bug that was already there. 
Because of unrelated crashes I couldn't try this on earlier builds though.

I have a fix here based on Matt's comment in comment #4:
https://chromium-review.googlesource.com/c/chromium/src/+/962346
re: comment #4: in what sense is that unsupported? HttpCacheTransaction is completely unaware of Content-Encoding (the cache stores things in encoded form, after all); HttpNetworkTransaction is unaware of Range (and only verifies content encoding, but doesn't actually decode it); and I see no reason that wouldn't work, except being confusing as heck, and therefore likely to be buggy in many implementations :p

Though, a more pressing question is, what should we be doing in this case? The current behavior is clearly wrong (and may show up in other instances, so likely worth tracking down), but if we sent a range request for the missing portion, we would probably get a 416 or something, which doesn't seem like much of an improvement in practice, even if it would give me fluffy correctness feelings?

As such, Shivani's proposal may be the best choice for dealing with this immediate bug as a workaround, but there is also something to be said about fixing the server...


> Though, a more pressing question is, what should we be doing in this case?

FYI Firefox tosses out the cached response.

> but if we sent a range request for the missing portion [...]

Is that even possible?

After the first request, Chrome has cached a partial gzipped response (say, 389 of the alleged 652 total). Does HTTP allow for Chrome to say "give me bytes 390-652 BUT THEY MUST BE GZIPPED EXACTLY LIKE BEFORE"?

During the design of http/2 this came up in a fascinating, head-hurting discussion:

https://lists.w3.org/Archives/Public/ietf-http-wg/2014JanMar/1179.html

So it seems to me discarding the cache is the only sane thing to do, at least with HTTP/1.1.
A last thought: on the second/third request, Chrome appends binary junk to the cached resource. I wonder where that comes from, and if there is anything sensitive in it from the client that the server can now see.
Labels: Restrict-View-Google
re: comment #10: good point; restricting view just in case (probably not the exact right label for that); will also try to focus on this part while doing my "what in the world is this code even doing?" fiddling around, to see if we need some sort of security-related process.

That H2 thread seems to be about Transfer-Encoding:gzip (does anyone even do that?), not Content-Encoding: gzip. Anyway, there seemed to be some discussion on the httpbis mailing list suggesting that the intent of HTTP/1.1 is to support range requests for content-encoding gzip --- roughly agreeing with the message you linked to that it's a thing that's envisioned by the spec, and given it's a deterministic algorithm, it doesn't sound impossible to implement, just impractical, especially given that Last-Modified can be considered a strong validator; and well, what kind of server is going to change last-modified if their admin fiddles with their compression rate settings [1]?


[1] One that explicitly keeps track of gzip'd representations, I imagine.
So with respect to the 1-byte range requests... we actually send those even during the basic resume unit tests, e.g. TEST(HttpCache, GET_IncompleteResource); and I think it's intentional; it seems like the idea is to send a 1-byte request with If-Range: to see if the server is OK with range stuff, then to do a full-range request --- see PartialData::SetRangeToStartDownload, and https://codereview.chromium.org/6588105/ .... I am not certain[1] why that's better than just If-Range for the entire tail, though, but it's not incorrect on its own. In fact, I think we do send a second request to the server (and re-check our cache) in your first test URL --- from some of my debug output (note that the headers output doesn't have line breaks, so there is a chance I "reintroduced" them incorrectly):


PrepareCacheValidation --- current_range_start_:389 current_range_end_:389 range_present_:0 final_range_:0 header to send:bytes=389-389
(the one-byte probe, range_present_:0 means it's not in cache)
ResponseHeadersOK():
HTTP/1.1 206
X-AREQUESTID: 647x10931762x2
X-ASEN: SEN-1095081
Set-Cookie: atlassian.xsrf.token=AQZJ-FV3A-N91S-UDEU|c6de2a68b9b09796425b3cd1755fd87bc1e6fce3|lout;path=/
X-AUSERNAME: anonymous
X-Content-Type-Options: nosniff
Accept-Ranges: bytes
ETag: W/"652-1519645607000"
Last-Modified: Mon, 26 Feb 2018 11:46:47 GMT
Content-Range: bytes 389-389/652
Content-Type: application/javascript;charset=UTF-8
Content-Length: 1
Date: Thu, 15 Mar 2018 14:47:30 GMT

PartialData::SetRangeToStartDownload()
PrepareCacheValidation --- current_range_start_:0 current_range_end_:388 range_present_:1 final_range_:0 header to send:bytes=0-388
(reading the previously present "portion")
PartialData::OnCacheReadCompleted:389
PartialData::OnCacheReadCompleted:0

PrepareCacheValidation --- current_range_start_:389 current_range_end_:651 range_present_:0 final_range_:1 header to send:bytes=389-651
(requesting the rest from the network)
ResponseHeadersOK()
HTTP/1.1 206
X-AREQUESTID: 647x10931764x2
X-ASEN: SEN-1095081
Set-Cookie: atlassian.xsrf.token=AQZJ-FV3A-N91S-UDEU|88ded31ea2f5d5adc90388697cf245baafc135a2|lout;path=/
X-AUSERNAME: anonymous
X-Content-Type-Options: nosniff
Accept-Ranges: bytes
ETag: W/"652-1519645607000"
Last-Modified: Mon, 26 Feb 2018 11:46:47 GMT
Content-Range: bytes 389-651/652
Content-Encoding: gzip
Vary: User-Agent
Content-Type: application/javascript;charset=UTF-8
Content-Length: 263
Date: Thu, 15 Mar 2018 14:47:30 GMT

(notice the different cookie and X-AREQUESTID here)

PartialData::OnNetworkReadCompleted:205

This sure looks like the junk is just us sticking whatever the server gave us into gunzip, not some random bits. Reporter, I don't suppose you know of an http:// test URL? 
Double-checking with a packet sniffer would be reassuring. 

[1] I /suspect/ it may have been easier to implement this way since we stream out responses, so otherwise one would have to hold the cache read for the prefix until 
we can see if the response to the If-Range produced a 206 or a 200, which is kinda tricky to do the way the code is structured.
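The sequence in the debug output above can be modelled in a few lines. This is a simplified sketch of the 206 path only, not the actual PartialData code; `range_header` and `resume_steps` are hypothetical names.

```python
def range_header(start, end):
    """Format an inclusive byte range the way it appears in a Range header."""
    return "bytes=%d-%d" % (start, end)


def resume_steps(cached_len, total_len):
    """Steps taken to resume a truncated entry when the server answers the probe with 206.

    Simplified model of the debug output; the case where the server answers
    200 instead is deliberately left out.
    """
    return [
        # One-byte probe with If-Range, to see whether the server honours ranges.
        ("network probe", range_header(cached_len, cached_len)),
        # Serve the prefix that is already in the cache.
        ("cache read", range_header(0, cached_len - 1)),
        # Fetch the remainder from the network.
        ("network tail", range_header(cached_len, total_len - 1)),
    ]


if __name__ == "__main__":
    for source, header in resume_steps(389, 652):
        print(source, header)
```

For 389 cached bytes out of 652 this yields bytes=389-389, bytes=0-388 and bytes=389-651, the same three ranges as in the PrepareCacheValidation lines.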


Not sure if the reporter will be able to view the issue anymore, given RVG
Labels: -Restrict-View-Google
Well, I am reasonably confident it's not a security issue now, so may as well remove it. 
Re: comment #12: I think you're right. When I save the corrupted file (urls.js), isolate the binary at the end and feed it to gunzip, I get portions of the original urls.js file. Same with strings.js.
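The check described above can be sketched as follows. It runs on synthetic data rather than the real cache entry, and it assumes the appended tail is itself a gzip stream of part of the original file; `recover_tail` is a made-up helper name. A decompression object is used rather than `gzip.decompress` so that trailing bytes after the first gzip member do not raise an error.

```python
import gzip
import zlib


def recover_tail(cached_body, truncated_len):
    """Decompress whatever was appended after the first `truncated_len` bytes.

    Assumes the appended bytes start with a gzip member.
    """
    tail = cached_body[truncated_len:]
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    return decompressor.decompress(tail) + decompressor.flush()


if __name__ == "__main__":
    # Synthetic stand-in for urls.js.
    original = b"".join(b"line %d\n" % i for i in range(100))
    prefix = gzip.compress(original)               # what the first hit cached
    junk = gzip.compress(original[len(prefix):])   # what a later range response appended
    corrupted = prefix + junk
    print(recover_tail(corrupted, len(prefix))[:20])
```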

Here is how to set up your very own buggy server, serving up http://localhost:8080/static/util/urls.js

cd /tmp
curl -LO 'https://downloads.atlassian.com/software/jira/downloads/atlassian-jira-software-7.8.0.tar.gz'
tar xf atlassian-jira-software-7.8.0.tar.gz
mkdir /tmp/jirahome
JIRA_HOME=/tmp/jirahome /tmp/atlassian-jira-software-7.8.0-standalone/bin/catalina.sh run

Unfortunately this requires Java.

I've attached a .pcapng file of two requests to http://localhost:8080/static/util/urls.js. It corresponds to your debug output. After 'Range: bytes=389-389', we get a 'Range: bytes=389-651' request returning gzipped bytes, and this is presumably the source of the binary corruption. Interestingly this bytes=389-651 request does not show in the devtools Network tab.
chrome_820862_urls.js_tworequests.pcapng (6.2 KB)
chrome://net-internals event view may be more enlightening for some of those things, since it's many layers closer to where things are happening. Looking at it, I see something else that looks suspect, though:
Range: bytes=389-389
If-Range: W/"652-1519645607000"

... why are we using a /weak/ validator here? Probably worth a separate bug report?
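For context, RFC 7233 says a client must not generate an If-Range header containing an entity-tag that is marked as weak. A minimal sketch of that check, with hypothetical helper names (this is not Chromium code):

```python
def is_weak_etag(etag):
    """True if the entity-tag carries the W/ weakness prefix."""
    return etag.strip().startswith("W/")


def usable_for_if_range(etag):
    """Only strong entity-tags may be sent in If-Range."""
    return not is_weak_etag(etag)


if __name__ == "__main__":
    print(usable_for_if_range('W/"652-1519645607000"'))
```

The ETag served in this report, W/"652-1519645607000", is weak by that definition.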

Comment 17 by bugdroid1@chromium.org, Mar 15 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1720d2a67b945ba58b008c593f358298b32ff0bb

commit 1720d2a67b945ba58b008c593f358298b32ff0bb
Author: Shivani Sharma <shivanisha@chromium.org>
Date: Thu Mar 15 20:27:35 2018

[Http Cache] Do not preserve incomplete compressed responses.

If a response is not complete and has a content-encoding header, it
should not be marked as truncated since we do not want it to lead to
range requests for future requests.

TEST=net_unittests --gtest_filter=
WritersTest.ContentEncodingShouldNotTruncate

Also manually verified that the test case reported in the bug is fixed.

Bug:  820862 
Change-Id: Ie3f4098e769e257632058af2215054809ca8c89f
Reviewed-on: https://chromium-review.googlesource.com/962346
Commit-Queue: Shivani Sharma <shivanisha@chromium.org>
Reviewed-by: Maks Orlovich <morlovich@chromium.org>
Cr-Commit-Position: refs/heads/master@{#543481}
[modify] https://crrev.com/1720d2a67b945ba58b008c593f358298b32ff0bb/net/http/http_cache_writers.cc
[modify] https://crrev.com/1720d2a67b945ba58b008c593f358298b32ff0bb/net/http/http_cache_writers_unittest.cc
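The rule stated in the commit message can be sketched as a small predicate. This only illustrates the described behaviour and is not the code in http_cache_writers.cc, which may check further conditions; `should_mark_truncated` is a hypothetical name.

```python
def should_mark_truncated(headers, is_complete):
    """Decide whether an entry may be kept as a resumable, truncated response.

    headers: mapping of response header names to values.
    is_complete: True if the whole body was received.
    """
    if is_complete:
        return False
    # Per the commit message: an incomplete response with a Content-Encoding
    # header is not marked truncated, so it never leads to range requests later.
    has_encoding = any(name.lower() == "content-encoding" for name in headers)
    return not has_encoding


if __name__ == "__main__":
    gzipped = {"Content-Encoding": "gzip", "Content-Length": "652"}
    print(should_mark_truncated(gzipped, is_complete=False))
```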

Status: Fixed (was: Assigned)

Comment 19 by bugdroid1@chromium.org, Mar 17 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/332bb11a6291d6b97b0b10b996bba921f4aafa4a

commit 332bb11a6291d6b97b0b10b996bba921f4aafa4a
Author: Maks Orlovich <morlovich@chromium.org>
Date: Sat Mar 17 01:43:32 2018

PartialData: comment the weird setup for truncated responses.

This may hopefully make the next person seeing reports of strange 1-byte
range requests for previously truncated files less flabbergasted.

(And also documented some fields at high-level).

Bug:  820862 
Change-Id: I3bbb8fc46c2c5a015bf8669165add631df289486
Reviewed-on: https://chromium-review.googlesource.com/964981
Reviewed-by: Helen Li <xunjieli@chromium.org>
Commit-Queue: Maks Orlovich <morlovich@chromium.org>
Cr-Commit-Position: refs/heads/master@{#543907}
[modify] https://crrev.com/332bb11a6291d6b97b0b10b996bba921f4aafa4a/net/http/partial_data.cc
[modify] https://crrev.com/332bb11a6291d6b97b0b10b996bba921f4aafa4a/net/http/partial_data.h

Cc: krajshree@chromium.org
Labels: Needs-Feedback
Able to reproduce the issue on win-10, mac 10.13.3 and ubuntu 14.04 using chrome reported version #64.0.3282.167.
Verified the fix on Win-10 and Ubuntu 14.04 using Chrome version #67.0.3375.0 as per the comment #0.
Note: On OS-mac, both on chrome reported version #64.0.3282.167 and chrome version #67.0.3375.0, resource cache corruption is seen on first load of url: https://jira.jboss.org/static/util/urls.js.
Attaching screen cast of OS-mac and OS-win on Chrome version #67.0.3375.0 for reference.
Observed that no resource cache corruption is seen on OS-win and OS-linux. Whereas on OS-mac, chrome behavior is same in both chrome reported version #64.0.3282.167 and chrome version #67.0.3375.0.

shivanisha@ - Could you please check the OS-mac screen cast, let us know the expected behaviour, and confirm the fix?

Thanks...!!

820862@mac.mp4 (401 KB)
820862@win.mp4 (750 KB)
Re comment 20: On Mac can you please check the test case after cleaning the browser cache using the following steps:

More Tools -> Clear browsing data -> Clear Data (Note that Cached images and files is checked)
Labels: TE-Verified-M67 TE-Verified-67.0.3375.0
As per comment #20, verified the fix on mac 10.13.3 using Chrome version #67.0.3375.0
Attaching screen cast for reference.
Observed that after cleaning the browser cache using the following steps:
More Tools -> Clear browsing data -> Clear Data (Note that Cached images and files is checked), no resource cache corruption is seen on mac 10.13.3.
Hence, the fix is working as expected. 
Adding the verified labels.

Thanks...!!
820862@M67_mac.mp4 (1.0 MB)
