
Issue 809891

Starred by 3 users

Issue metadata

Status: WontFix
Owner: ----
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug




Resources loaded from disk cache lose original headers

Reported by jon.ronn...@gmail.com, Feb 7 2018

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3341.0 Safari/537.36

Steps to reproduce the problem:
1. Go to https://jsfiddle.net/dotnetCarpenter/u129xnjd/
2. Open devtools and open the network tab
3. With "Disable cache" checked, click on "Run" in jsfiddle
4. Uncheck "Disable cache" and click "Run" in jsfiddle
5. Observe that the resource refuses to load because of a missing 'Access-Control-Allow-Origin' header.

What is the expected behavior?
Since the original resource has 'Access-Control-Allow-Origin' and a few others like 'Access-Control-Expose-Headers', the browser's cached version should have exactly the same headers, so that the resource can be used in exactly the same way as if it had been freshly loaded from the server. In other words, the cache SHOULD be transparent.

What went wrong?
The cache changes the resource headers, which in turn changes how the resource can be used, leading to errors.

Did this work before? N/A 

Does this work in other browsers? Yes

Chrome version: 66.0.3341.0  Channel: canary
OS Version: 10.0
Flash Version:
 
chrome_cache_header_bug.js
449 bytes
Labels: Needs-Triage-M66
Components: Internals>Network>Cache Blink>SecurityFeature>CORS
Labels: Needs-Feedback
I'm not able to reproduce this. The HTTP cache certainly serializes headers and doesn't treat CORS headers as special. I don't know if any of the layers on the Blink side do anything odd here.

Could you attach a NetLog of it happening per these instructions? That'll hopefully tell us if the net stack proper is doing anything weird. Thanks!
https://dev.chromium.org/for-testers/providing-network-details

Comment 5 by mkwst@chromium.org, Feb 8 2018

Cc: tyoshino@chromium.org toyoshim@chromium.org yhirano@chromium.org
+tyoshino@, +yhirano@, +toyoshim@
I also tried to reproduce it on ToT (r535350) by following the instructions provided, but couldn't.
This bug was hindering my work so I deleted the cache and haven't seen the issue today. I first saw this a few days ago and contacted our server administrator for AWS and he assured me that the AWS configuration hadn't been changed. We uploaded the files a month ago and I haven't cleared the cache in Canary since. So the cached images are around a month old.

Either this is a symptom of a freak coincidence with an update to Canary that caused data corruption, or this bug only reveals itself after a certain number of days. Wait... I take that back. I'm still seeing this in Chrome 64.0.3282.140.


chrome-net-export-log.json
149 KB
Project Member

Comment 8 by sheriffbot@chromium.org, Feb 8 2018

Cc: davidben@chromium.org
Labels: -Needs-Feedback
Thank you for providing more feedback. Adding requester "davidben@chromium.org" to the cc list and removing "Needs-Feedback" label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: Needs-Feedback
I made a request to preview-0003.jpg from the same origin (i.e., https://napp-siesta-test.s3-eu-west-1.amazonaws.com/) and it returned a response without access-control-allow-origin, access-control-expose-headers, etc.

Is it possible that a response to such a same-origin request is stored in the cache?
Oh, nice catch! That's the bug. If the resource does different things based on, say, the Origin header, you need to set Vary: Origin on the response.

In general, if your resource is cachable, all headers that go into deciding what you return need to be put in the Vary header. It appears you do send the correct Vary header in your response with CORS headers (Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method), but you also need to send it in the other one, so the client knows the absence of those headers matters too.

(Alternatively, do you actually need to vary on those headers? Seems you probably could just always send the CORS response and call it a day.)
Are you saying that
https://napp-siesta-test.s3-eu-west-1.amazonaws.com/siesta-demo-mwqxbrqx/uploads/1506688653629/preview-0003.jpg
sometimes returns an image without any access control headers? In that case
it's a bug at Amazon.

If possible can you post a copy of the response, so that I can show it to
Amazon support?

Project Member

Comment 12 by sheriffbot@chromium.org, Feb 8 2018

Labels: -Needs-Feedback
Thank you for providing more feedback. Adding requester "yhirano@chromium.org" to the cc list and removing "Needs-Feedback" label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
$ curl -s -D - https://napp-siesta-test.s3-eu-west-1.amazonaws.com/siesta-demo-mwqxbrqx/uploads/1506688653629/preview-0003.jpg -o /dev/null 
HTTP/1.1 200 OK
x-amz-id-2: 12dHqlgbdiWri5z/vzGCKEZcmOt65tLT6M8KSFbllJadB66my7ANwoWFcK7nY7BRIDhRjQ18xio=
x-amz-request-id: 5BEAF7E1B1EA068B
Date: Thu, 08 Feb 2018 23:57:05 GMT
Last-Modified: Wed, 03 Jan 2018 15:15:22 GMT
ETag: "069510115b335d44a79012f2fc7ca0a9"
Cache-Control: cache-control: public,max-age=31536000,immutable
Accept-Ranges: bytes
Content-Type: image/jpeg
Content-Length: 89969
Server: AmazonS3

$ curl -s -D - https://napp-siesta-test.s3-eu-west-1.amazonaws.com/siesta-demo-mwqxbrqx/uploads/1506688653629/preview-0003.jpg -o /dev/null -H 'Origin: http://example.com'
HTTP/1.1 200 OK
x-amz-id-2: Gb172c9Tnm6OK9dwhIuyctniYuQG9fake7dlbX7TwjsALJgsuvcXhwp5b+RhhodV1JODOjzOEKE=
x-amz-request-id: DAD2AC61EA441F22
Date: Thu, 08 Feb 2018 23:57:21 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST
Access-Control-Expose-Headers: Accept-Ranges, Content-Encoding, Content-Length, Content-Range
Access-Control-Max-Age: 3000
Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
Last-Modified: Wed, 03 Jan 2018 15:15:22 GMT
ETag: "069510115b335d44a79012f2fc7ca0a9"
Cache-Control: cache-control: public,max-age=31536000,immutable
Accept-Ranges: bytes
Content-Type: image/jpeg
Content-Length: 89969
Server: AmazonS3


If Chrome sees the former, it has no reason to believe that the latter request will produce a different response, since it wasn't advertised in the Vary header. That's why you're getting the troubles you're getting. Indeed if I just load the URL normally and then try your jsfiddle, I can reproduce it.
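
The comparison between the two curl responses above can also be checked mechanically. The following is a minimal Python sketch, not part of the original report: the function names `vary_names` and `responses_missing_vary` are hypothetical, and the header dictionaries are abbreviated copies of the two responses shown above.

```python
def vary_names(response_headers):
    """Return the lower-cased request header names listed in a Vary header."""
    value = response_headers.get("vary", "")
    return {name.strip().lower() for name in value.split(",") if name.strip()}


def responses_missing_vary(responses, request_header):
    """Return labels of responses whose Vary does not cover request_header.

    `responses` maps a label to a dict of response headers with lower-case keys.
    """
    wanted = request_header.lower()
    missing = []
    for label, headers in responses.items():
        names = vary_names(headers)
        if "*" not in names and wanted not in names:
            missing.append(label)
    return missing


# Abbreviated copies of the two responses above (header names lower-cased).
responses = {
    "request without Origin": {
        "etag": '"069510115b335d44a79012f2fc7ca0a9"',
        "content-type": "image/jpeg",
    },
    "request with Origin": {
        "access-control-allow-origin": "*",
        "vary": "Origin, Access-Control-Request-Headers, Access-Control-Request-Method",
        "etag": '"069510115b335d44a79012f2fc7ca0a9"',
        "content-type": "image/jpeg",
    },
}

print(responses_missing_vary(responses, "Origin"))
```

Only the response to the request without an Origin header is reported, which is the variant a cache cannot tell apart from the CORS one.
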

To clarify, once we upload something to AWS, it never expires, and there are
no exceptions to this rule. All our resources are static forever and will
never change. They can only be deleted. Updates will have a different
URL. So caching and setting headers should be very straightforward. We
simply want a resource to be cached forever on any proxy and on any device.
Right, the issue is not that your resource is cached (although it's actually not cached very well for cross-origin requests because the cross-origin ones have Vary: Origin attached). It's that it's cached incorrectly.

AWS is serving two different versions of your resources, one with CORS headers and one without. It appears to be doing:

if request.HasHeader('Origin') {
   SendResponseWithCORSAndVaryHeaders();
} else {
   SendResponseWithoutCORSOrVaryHeaders();
}

This is incorrect. There are two options for fixing this:

1. Always serve the CORS headers, so there is only one version of your resource. Then you don't need any Vary headers at all, and will get somewhat better cache performance.

- OR -

2. If you must serve CORS headers conditional on whether Origin is present (I can't think of any reason to do this...), the CORS-less version must *also* include Vary: Origin.
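
The two options above can be sketched as response-header builders. This is a hedged illustration in Python, not S3's actual implementation: `option_1_headers`, `option_2_headers`, `has_origin`, and the `CORS_HEADERS` constant are hypothetical names, and the CORS values are taken from the responses shown earlier in this thread.

```python
# Assumed CORS headers, copied from the CORS response earlier in the thread.
CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, POST",
}


def has_origin(request_headers):
    """Return True if the request carries an Origin header (case-insensitive)."""
    return any(name.lower() == "origin" for name in request_headers)


def option_1_headers(request_headers):
    """Option 1: always send the CORS headers, so there is one variant and no Vary."""
    return dict(CORS_HEADERS)


def option_2_headers(request_headers):
    """Option 2: CORS headers only when Origin is present, but Vary: Origin always."""
    headers = {"Vary": "Origin"}
    if has_origin(request_headers):
        headers.update(CORS_HEADERS)
    return headers
```

With option 1 the response is identical for every request, so a cache can reuse it freely. With option 2 the CORS-less response still advertises that it depends on Origin.
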
Thank you so much yhirano and David! I will take this to our AWS guy and
find out if this is our configuration or something in the default setup on
AWS. It's probably the former but now I have proof! Thanks again for your
detective work and awesome explanation.
FYI, it turns out that nothing in our AWS configuration asks for a Vary header or for CORS headers to be left off any response.

I have just written to AWS support and am awaiting their response.

Our AWS configuration:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <ExposeHeader>Accept-Ranges</ExposeHeader>
    <ExposeHeader>Content-Encoding</ExposeHeader>
    <ExposeHeader>Content-Length</ExposeHeader>
    <ExposeHeader>Content-Range</ExposeHeader>
    <AllowedHeader>*</AllowedHeader>
</CORSRule>
</CORSConfiguration>
If you need a pointer to the spec for AWS folks, it explicitly describes sending a Vary on a header that wasn't in the request:
https://tools.ietf.org/html/rfc7231#section-7.1.4

   For example, a response that contains

     Vary: accept-encoding, accept-language

   indicates that the origin server might have used the request's
   Accept-Encoding and Accept-Language fields (or lack thereof) as
   determining factors while choosing the content for this response.

In particular, note the "(or lack thereof)". AWS is using the lack of an Origin header to choose a response and should advertise this, so our HTTP cache knows what to do.
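
To make the "(or lack thereof)" point concrete, here is a minimal Python sketch of how a cache might decide whether a stored response can be reused. It is a simplified model, not Chrome's HTTP cache: the names `vary_names` and `can_reuse` are hypothetical, all header dictionaries use lower-case keys, and the example Origin value is the one from the curl command earlier in the thread.

```python
def vary_names(response_headers):
    """Return the lower-cased request header names listed in a Vary header."""
    value = response_headers.get("vary", "")
    return {name.strip().lower() for name in value.split(",") if name.strip()}


def can_reuse(stored_request, stored_response, new_request):
    """Decide whether a stored response may be reused for new_request.

    A header that is absent from a request reads as None, so an absent
    header only matches another absent header.
    """
    for name in vary_names(stored_response):
        if name == "*":
            return False  # Vary: * never matches a later request
        if stored_request.get(name) != new_request.get(name):
            return False
    return True


plain_request = {}                               # direct load, no Origin header
cors_request = {"origin": "http://example.com"}  # cross-origin request

# What S3 sent at the time: no Vary on the CORS-less response.
print(can_reuse(plain_request, {}, cors_request))                  # True
# What RFC 7231 describes: Vary: Origin even when Origin was absent.
print(can_reuse(plain_request, {"vary": "Origin"}, cors_request))  # False
```

In the first case the cache has nothing telling it that Origin matters, so the CORS-less response is reused for the cross-origin request. In the second case the mismatch is detected and the request goes back to the server.
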
Labels: Triaged-ET Needs-Feedback
@Reporter: Could you please check comment#18 and report back to us whether this is reproducible on your end?

Thanks!
@Sindhu... well, I don't have good news. We have contacted AWS support and got the following answer (below I will explain why it's not a viable solution for us at all):

> Hi,

    Thank you for contacting AWS Premium Support. My name is [removed] and I will be assisting you today.

    I understand your issue is regarding CORS response headers from S3. Let me briefly describe your issue, so that I can confirm we are on the same page. Please feel free to correct me if my understanding is not correct.

    - You have CORS configuration on S3 bucket 'napp-siesta-test'
    - Cache control headers are set for the objects to be cached.
    - When a request is made without the Origin header, the CORS headers are not returned by S3, and the object is cached. The Vary header is also not returned.
    - When a request is made by setting the Origin header, the CORS headers returned by S3. It also returns a Vary header in the response.

    Ideally, S3 should return the Vary header once CORS is configured on the bucket. Currently, that isn't the case, and S3 team is already aware of this. If the Origin header is not set in the request, S3 will not consider it a CORS request (even though CORS configuration is set), and does not send CORS response headers back in the response.

    https://docs.aws.amazon.com/AmazonS3/latest/dev/cors-troubleshooting.html

    I can think of two workarounds to serve the content with the CORS headers. Both the workarounds require CloudFront distribution in front of S3. You should restrict direct access to the S3 objects by configuring Origin Access Identity on the distribution [1].

    1) First workaround is to drop the Vary header, and only serve the CORS response every time.

    - You should set the 'Origin Custom Headers' while configuring the distribution. What this will do is, every time a request is made (with or without Origin header), CloudFront will add/override the origin header and make the request to S3. The response will have the CORS headers.
    - CloudFront, by default will drop the Vary header if the value of the Vary header is Vary:Origin, Access-Control-Request-Headers, Access-Control-Request-Method

    2) The second workaround is to make CloudFront add the Vary header every time, by using Lambda@Edge. Please refer the following forum post which explains this (the last post on the thread):

    https://forums.aws.amazon.com/thread.jspa?messageID=796312&#796312

    I will +1 the internal ticket from my side for sending Vary header for CORS enabled bucket every time. However, I do not have an ETA to share with you. You can track the latest releases and news about AWS from:

    https://aws.amazon.com/new/

    I hope this helps. If you have any further queries, please do not hesitate to reach out to us again.

    [1] Origin Access Identity CloudFront
        https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-restricting-access-to-s3.html

    Best regards,
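
For reference, the second workaround in the support reply (adding the Vary header at the edge) might look roughly like the following. This is a hedged sketch, not the code from the linked forum post, which is not reproduced here. It assumes a Lambda@Edge function on the Python runtime attached to the origin-response trigger, and it relies on the CloudFront event layout in which response headers are keyed by lower-case name and hold lists of `{"key": ..., "value": ...}` entries.

```python
def lambda_handler(event, context):
    """Origin-response trigger: make sure every response carries Vary: Origin."""
    response = event["Records"][0]["cf"]["response"]
    headers = response["headers"]

    # Collect any Vary values the origin (S3) already sent.
    values = [entry["value"] for entry in headers.get("vary", [])]
    listed = {name.strip().lower() for value in values for name in value.split(",")}

    # Add Origin only if it is not already advertised.
    if "origin" not in listed:
        values.append("Origin")

    headers["vary"] = [{"key": "Vary", "value": ", ".join(values)}]
    return response
```

A response that S3 returned without any Vary header leaves this function with `Vary: Origin`, while a response that already lists Origin keeps its existing value.
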

We are a very small software company, and the proposed solution of buying CloudFront is not an option for two reasons. First, we have a backend that adds data to the bucket, and CloudFront wraps an entire bucket, so we would have to rewrite our backend to work with CloudFront. Another option would be to have two different buckets: the current S3 bucket for our backend and a new one for our web-facing client-side code. But that would also require us to rewrite our infrastructure. The last issue is pricing. We're not making enough money on our project to turn a profit if we also have to buy CloudFront.

Since AWS recognises the issue (but can't give us an ETA on a bugfix), we can wait it out and see if they fix it within a few weeks. I don't know if we can publicly shame Amazon into fixing an obvious violation of RFC 7231: they should set Vary: Origin on all responses, since they do vary the response on Origin. As David has mentioned, this is not the best solution, but at least it wouldn't break caching globally. I expected something better from one of the world's biggest server infrastructure providers, but the fact that we even had to buy support just to make them aware of their issue, and that the proposed solution is to buy another product from them... I'm a little pissed off at how they are handling this.

Unless someone has another idea, I think we have to wait and see if they fix the S3 configuration with regard to CORS and caching. For our product, as long as a user never accesses any resource directly (image or SVG), everything will work. The response with CORS headers will be cached. The issue is that it's pretty easy to break. Just visit a resource directly and all intermediate caching servers will cache the resource without CORS headers. Since I could resolve the issue locally just by deleting the cache in Chromium, it appears that there is no caching between me and AWS. I'm not sure why that is or whether I can traceroute the packets. I think most shared caching servers are transparent, so I wouldn't get anything meaningful. But I'll give it a shot.

With regard to the Chromium network component, I think this is a configuration error on Amazon's part and not a Chromium issue. But you guys are the experts here. I'm not sure what you need in order to close this.

Regards, Jon
Project Member

Comment 21 by sheriffbot@chromium.org, Feb 13 2018

Cc: sindhu.chelamcherla@chromium.org
Labels: -Needs-Feedback
Thank you for providing more feedback. Adding requester "sindhu.chelamcherla@chromium.org" to the cc list and removing "Needs-Feedback" label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Status: WontFix (was: Unconfirmed)
As per comment#20, closing this issue as WontFix, as this is a configuration error on Amazon's part and not a Chromium issue.

@Reporter: Please feel free to raise a new issue if the problem is still seen.

Thanks!
We got an answer from the S3 engineering team. Basically it's another WontFix, blaming Chromium for being incorrect.

> S3 does not fully support the Vary header which is fine in most cases because S3 does not Vary objects based on incoming request parameters. In this case in question Chrome treats cached objects different compared to all other browsers causing CORS requests to break because the cached object (non-CORS request) does not have the Vary header set. If the object in the browser's cache was made with a CORS pre-flight request to S3 then things would work as expected.

    There is a bug report which is linked from the Stack Overflow if you dig a bit deeper

    https://bugs.chromium.org/p/chromium/issues/detail?id=409090

    The only solution here would be to change our response semantics to respond with the Vary: Origin header. Likely this will break just as many (or more) customers than it fixes. We could try to limit the impact by only responding with Vary: Origin if the object has CORS support enabled in S3 but this is still a change to our API behavior and not something we believe we would do (or should do). Chrome has said this isn't a bug but all other browsers 'work' here so given the confusion and temperamental nature around CORS in general we wouldn't change anything else in S3 CORS support.
>


Personally, I now see this as a known issue that we cannot fix in our product. Should it happen again, we will ask our users to delete their browser cache and hope that the issue will be resolved.
Their response is incorrect. CORS sits at a layer above the HTTP cache in all browsers. The silliness around "confusion and temperamental nature" is nonsense. The HTTP spec is very clear on how these requests behave. If you change your response based on a header, you stick it in the Vary header.

Other browsers may be sharding their caches by certain contexts or implicitly sticking a Vary Origin in here to work around buggy sites.

Adding a Vary header is quite unlikely to break other customers as clients must tolerate headers getting added. After all, they already send other random headers such as x-amz-request-id. Yes, there is risk involved in any change, but that's how deploying software on the Internet works. It's truly disappointing to see that S3 is unwilling to fix bugs in their service.
@david, to add to your argument: I see the same behavior in Safari as in Chromium. Given that Chromium and Safari share the same code base (or at least did), that makes sense.

I think it would carry more weight if someone from the network team wrote directly to AWS, although that seems tricky without buying support.
I prodded some contacts there earlier on, though they don't work on S3 directly and said they couldn't promise a fix. I'll prod them again.

(Thanks for the data point about Safari! That's very useful.)
We talked about the behavior with the spec author some time ago. https://github.com/whatwg/fetch/issues/402
The Vary header's semantics are lower level than Fetch. It's part of HTTP itself. If you change your response based on a header, you should include that header in Vary to allow caches to separate them.

The note in the spec you cite there is about one particular situation where this naturally occurs. It is not exhaustive. In the scenario that you send ACAO:* conditionally on the Origin header, you should indeed send Vary:Origin. Of course, that scenario itself is rather nonsensical as you may as well unconditionally send ACAO:* and get better caching effects. Then you do not need Vary:Origin. This was described up in comment #15.
Cc: viswa.karala@chromium.org
Issue 848397 has been merged into this issue.
