Fetch: data: URL spaces in the image data are removed
Reported by
jongheey...@gmail.com,
Sep 16 2017
|
||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36 Steps to reproduce the problem: 1. load http://w3c-test.org/XMLHttpRequest/data-uri.htm 2. check "XHR method GET with MIME type image/png" TC What is the expected behavior? "XHR method GET with MIME type image/png" TC should be passed. What went wrong? Spaces in the image data are removed. Did this work before? N/A Does this work in other browsers? N/A Chrome version: 60.0.3112.113 Channel: n/a OS Version: OS X 10.12.6 Flash Version: Shockwave Flash 27.0 r0
,
Sep 18 2017
Able to reproduce this issue on Ubuntu 14.04, Windows-10 and Mac OS 10.12.6 using chrome latest canary #63.0.3218.0. Tested the same on older version of chrome M50-50.0.2624.0 and observed most of the tests got failed (1 pass and 9 Fail). Considering the above issue is non-regression and marking it as untriaged for further updates. Thanks!
,
Sep 18 2017
Filed https://github.com/w3c/web-platform-tests/issues/7374 to track the W3C test in question. It should use compliant data: URLs when testing unrelated behavior.
,
Sep 18 2017
,
Oct 3 2017
Reopening, as this is a valid test (any input is valid) that Chrome does not pass. If you feel it's out of Internals>Network's scope then please identify another team responsible for implementing the spec for fetching data URLs (including any arbitrary string with a data: prefix). Regardless, we need an open issue to track all WPT failures like this; we cannot close it as WontFix.
,
Oct 16 2017
Seems to me like the spec wants us to keep spaces in all cases, and just "validation error" if they show up unescaped [1]. This makes me feel like the code in data_url.cc shouldn't ever strip spaces, except for leading/trailing (which we should be doing already in //url). +mkwst for thoughts. [1]: https://url.spec.whatwg.org/#concept-basic-url-parser
,
Oct 16 2017
As csharrison@ notes, this isn't `fetch()`-specific behavior. Typing `data:not-text/whatever,This has spaces.` into the omnibox will show the same space-stripping behavior. I'm fine with changing that; as you say, the URL spec does not strip spaces from URLs in general, and Firefox's behavior seems sane. That said, "obtaining a resource" from `data:` URLs is somewhat underspecified[1]. It seems pretty reasonable to me that we wouldn't strip spaces, but comments like those at [2] make me worried that there may be deeper incompatibilities for base64 encoding that this might uncover. +asanka@ and mmenke@ who commented on the patch in comment 1, as I think they disagree. :) +Anne, who likely also has opinions. [1]: https://simonsapin.github.io/data-urls/ [2]: https://cs.chromium.org/chromium/src/net/base/data_url.cc?l=98
,
Oct 16 2017
Looking at the data URL spec (https://tools.ietf.org/html/rfc2397), there's a reference to the data containing "urlchars", and a claim that urlchar is in another rfc (https://tools.ietf.org/html/rfc2396). That other RFC does not, in fact, have a urlchar rule. RFC 2244 does have a url-char, which includes a couple symbols plus a reference to rfc 1738...Which defines uchar to not include spaces. So, in summary: I think data URLs with spaces are not valid URLs.
,
Oct 16 2017
Could also look at "pchar" in RFC 2397 (Which looks like it may match RFC 2244's definition of "url-char"), and it also excludes spaces.
,
Oct 16 2017
domenic: Just because the input of a test is valid does not mean the expected output is correct. I claim the result should be an error.
,
Oct 16 2017
#5: Keep WPT issues on the WPT bug tracker. I'm all for having some reasonable behavior, but I disagree on keeping bugs open in Chromium's tracker to fill in gaps in the spec. While any input is valid, this issue is regarding how the browser should behave when it encounters an invalid input. Like I said on the WPT tracker and here, the test is invalid due to there not being any spec that supports the behavior that's being asserted there. Please point at the spec which asserts whitespace should be accepted in URLs, or which asserts the specific behavior being tested in the relevant test. A WPT test can then assert that behavior.
,
Oct 16 2017
-> Unconfirmed. Also, don't use Untriaged until we've confirmed this is an issue that needs to be addressed in the Chromium code base.
,
Oct 16 2017
,
Oct 16 2017
First, it looks like there's a little bit of a conflict between the way the web platform team uses the bug tracker and the way the network team uses it. I have more than a few "Huh. We should specify this somewhere." bugs open against various things in Blink, and that's generally accepted. If you'd prefer to keep things out of the Internals>Network queue, I'm sure we can work something out. Second, we're already violating both the letter of RFC 2397 by accepting space characters in `data:` URLs that have a MIME type starting with `text/` or containing the substring `xml` anywhere. It sounds like y'all would like us to go the other way by rejecting the URL entirely. I'd suggest that the RFC doesn't actually give guidance on the topic of how a client should handle a string that doesn't match the grammar. The URL spec, on the other hand, does, and I'd suggest that it should be considered controlling for the narrow question of whether or not we remove the space characters from the URL entirely while parsing. Ideally, we'd then have a spec that better defined what to do with the URL once we're finished parsing it and want to "obtain a resource" from it. Neither the RFC nor the musings linked from Fetch answer that question. Third, there's pretty substantial variance in what browsers do on the GET test in question. Chrome and Firefox fail, Safari passes, and I haven't tested Edge yet. Anne, thoughts about the behavior you think Fetch would expect?
,
Oct 16 2017
Edge fails all 10 of the tests. The base64 tests is (probably?) the only test that relies on specified behavior. To the extent of my knowledge, the net team hasn't historically been engaged in the fetch spec. i.e., with or without a network label, this bug will fall into a black hole.
,
Oct 16 2017
Re: Issue tracker use: There should be no surprise when a Chromium bug gets files in the Chromium tracker and gets triaged by the relevant owners. If something needs to be changed in a spec or updated in WPT, why are those bugs here? That the specs don't address this case is something that I've brought up multiple times both here and in the WPT tracker. Writing a WPT that asserts off-spec behavior, and then filing bugs against browsers demanding compliance is not how one should address this. The WPT should not be where behavior is specified, or else we'd end up in a state where implementers have to reverse engineer specs from WPTs. How about this instead: File an issue against the relevant spec (I don't even know which one it is at this point), and hash out how these edge cases should be handled. I understand completely that we can't require all inputs be valid, but we should be consistent with how we choose to handle those invalid inputs. I also strongly believe that this is not the right forum to have this discussion. Once the spec includes recommendations for dealing with these cases we can update Chromium's behavior, and introduce a comprehensive set of tests to WPT WRT data: URLs.
,
Oct 17 2017
[Punting this into a blackhole]
,
Oct 22 2017
https://github.com/whatwg/fetch/pull/579 has an updated data: URL processing model. I haven't been able to land it since in part getting a good story there also depends on getting a good story around MIME types (hence the other issues some of you have been seeing). It does however require not stripping spaces. If you instead want to make Chrome reject such data: URLs and get other browsers to align with that kind of behavior I guess you could argue for that, but closing this without addressing the interoperability issue seems irresponsible and anti-competitive.
,
Oct 22 2017
domenic, annevankesteren: Asking you to follow standardization procedure is neither irresponsible nor anti-competitive. Also note that in #16 I clearly state that changes to Chrome's behavior as stated in this issue will follow spec changes. The WontFix in #4 was because we moved the discussion over to WPT around the fact that the test wasn't backed by the spec. The lack of spec support has been cited enough times in this thread. I consider #18 to be completely out of line. Since it involves a serious accusation I don't believe we should engage here any further.
,
Oct 23 2017
There's a lot of variability across teams and debate on how exactly interop issues like this should be handled within the chromium project, and it's my fault that we don't have a consistent policy documented somewhere. I'm sorry about that. Can we put this debate on hold for a week or two to see if I can get consensus from the leads on a policy and get it published? Marking blocked on issue 777449 to track that.
,
Mar 28 2018
|
||||||||||||
►
Sign in to add a comment |
||||||||||||
Comment 1 by jongheey...@gmail.com
, Sep 16 2017