Non-special URLs do not encode any characters above U+001F [Spec compat]
Issue description
Chrome Version: 64.0.3282.140
OS: Linux (but presumably all)
What steps will reproduce the problem?
(1) Open console.
(2) new URL('web+custom: !"$%&\'()*+,-./0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f').pathname
What is the expected result?
"%20!%22$%&'()*+,-./0123456789:;%3C=%3E@ABCDEFGHIJKLMNOPQRSTUVWXYZ[/]%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~%7F"
What happens instead?
" !"$%&'()*+,-./0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f"
The characters SPACE, "<>^`{|} and U+007F should be escaped, as per the URL Standard [1] and the way that Chrome normally encodes the path segment. None of these characters is escaped when the URL is "non-special" (i.e., its scheme is not http or another common web URL scheme). Note that C0 controls (U+0000 to U+001F) are escaped correctly. It's concerning that U+007F is not.
While the URL Standard does make a number of distinctions between special and non-special URLs, this is not one of them. Firefox correctly escapes SPACE and U+007F, but not the others.
[1] https://url.spec.whatwg.org/#path-percent-encode-set
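For reference, the escaping expected above can be sketched in plain JavaScript, independent of any browser's URL parser. This is only an illustration under stated assumptions: the helper name encodeReportedPathSet and the constant REPORTED_ESCAPES are hypothetical, the character set is the one listed in this report (not copied from the URL Standard), and the sketch handles ASCII input only and does not model any backslash handling.

```javascript
// Characters this report expects to be escaped in a path, in addition to
// C0 controls (U+0000 to U+001F) and U+007F. Assumption: the list comes
// from the report text, not from the URL Standard itself.
const REPORTED_ESCAPES = new Set([' ', '"', '<', '>', '^', '`', '{', '|', '}']);

// Hypothetical helper: percent-encode an ASCII string using the set above.
// Non-ASCII characters are passed through unchanged (not modelled here).
function encodeReportedPathSet(input) {
  let out = '';
  for (const ch of input) {
    const code = ch.codePointAt(0);
    if (code <= 0x1f || code === 0x7f || REPORTED_ESCAPES.has(ch)) {
      // Two uppercase hex digits, e.g. U+0020 becomes "%20".
      out += '%' + code.toString(16).toUpperCase().padStart(2, '0');
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(encodeReportedPathSet(' !"<>^`{|}~\x7f'));
// prints: %20!%22%3C%3E%5E%60%7B%7C%7D~%7F
```

Comparing this output with new URL('web+custom:' + input).pathname in a given browser shows which of the listed characters that browser leaves unescaped.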
Feb 8 2018
Note: This also affects the GURL constructor (which, I assume, is independent of whatever implementation drives the JavaScript URL constructor).
Feb 20 2018
#2: The JavaScript URL class (https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/url/DOMURL.h) is backed by KURL (https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/weborigin/KURL.h), which uses the same URL parser, url::Parsed (https://cs.chromium.org/chromium/src/url/third_party/mozilla/url_parse.h?l=77&gsn=Parsed), as GURL.
Comment 1 by mgiuca@chromium.org, Feb 7 2018