Address bar decodes some URL escaped characters
Reported by abama...@gmail.com, Apr 11 2018
Issue description

UserAgent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36

Steps to reproduce the problem:
1. Go to any base page (google.com works fine)
2. Open the console (Ctrl+Shift+J)
3. Input this JavaScript command: window.history.pushState(null, null, '?x%25%3Cx%25');

What is the expected behavior?
The URL should display the URL-encoded query string, like: ?x%25%3Cx%25

What went wrong?
Instead, it displays as: ?x%25<x%25

Did this work before? N/A

Chrome version: 65.0.3325.181
Channel: stable
OS Version: 6.1 (Windows 7, Windows Server 2008 R2)
Flash Version:

In the console, you can check the value with window.location.search, and it *does* output the correct value. So the problem is probably with something that updates the address bar itself.

Clicking the address bar and copying the value does give the unescaped version of the text, so it's not just how it renders the text. This is how I noticed the bug -- I copied such a link and pasted it into another application, where the auto-detected hyperlink was terminated right before the "<", because that's not a URL-encoded character. I thought the problem was with my page failing to encode the value before pushing it to the history, but when I checked, it was encoded; it was just wrong in the address bar.

The only other browser I checked is MSIE 11, where it does work as I expected it should.
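The gap described in the report, where the stored query string stays encoded while the address bar shows a partly decoded form, can be reproduced outside the browser. This is only a rough sketch: displayForm and the DISPLAY_DECODED allowlist are hypothetical names made up for illustration, and they are not Chrome's actual display logic.

```javascript
// Hypothetical allowlist of escapes that a display layer might show decoded.
// This is an assumption for illustration, not Chrome's real rule set.
const DISPLAY_DECODED = new Set(['%3C']);

// Decode only the allowlisted escapes; everything else (e.g. %25) stays encoded.
function displayForm(search) {
  return search.replace(/%[0-9A-Fa-f]{2}/g, (esc) =>
    DISPLAY_DECODED.has(esc.toUpperCase()) ? decodeURIComponent(esc) : esc
  );
}

const stored = '?x%25%3Cx%25';      // the value window.location.search reports
const shown = displayForm(stored);  // the form the report saw in the address bar
console.log(stored);
console.log(shown);
```

Under these assumptions the stored value is unchanged and only the displayed string differs, which matches what the report sees via window.location.search.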
,
Apr 11 2018
,
Apr 11 2018
Tagging with some (additional) appropriate components. Maybe someone on those teams knows where this goes / what should be done with it.
,
Apr 11 2018
I can reproduce the copy bug only if I manually select a part of the URL in the address bar. Both Firefox and Chrome (since at least v23) display "<" but copy it as %3C when the entire URL is selected, which is the default and most common case.
,
Apr 12 2018
This is correct behaviour. U+003C (<) is a legal character in a URL according to [1]. If found anywhere in the path, query or fragment, it is considered equivalent to "%3C" (it normalizes to "%3C" during parsing). So it is safe to display it as "<" and more user-friendly to do so.

Note that the URL Spec [1] doesn't really say anything about how URLs should be rendered. In fact, it says that all percent-encoded bytes should be decoded when rendering, which is clearly broken, and I have been trying to fix this [2].

The older URL spec [3] does not allow "<", so this would have been considered an illegal URL under the previous spec. Perhaps a compatibility issue.

But note that if you copy the entire URL from the Omnibox, it gets re-encoded to "%3C" (the "normal" form of the URL). It's only if you copy a selected part of the URL that it doesn't, because then it's not considered to be copying a URL. Perhaps that's what you're doing?

[1] https://url.spec.whatwg.org
[2] https://github.com/whatwg/url/issues/369
[3] https://tools.ietf.org/html/rfc3986
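The equivalence of "<" and "%3C" after parsing can be checked with the standard URL class, which follows the WHATWG URL spec and is available in browsers and Node.js. In this small sketch, example.com is a placeholder host and the variable names are illustrative only.

```javascript
// Parse the same query once with a literal "<" and once with "%3C".
const raw = new URL('https://example.com/?x%25<x%25');
const escaped = new URL('https://example.com/?x%25%3Cx%25');

// The parser percent-encodes "<" in the query, so both serialize identically.
const sameAfterParsing = raw.href === escaped.href;

console.log(raw.search);
console.log(sameAfterParsing);
```

Here raw.search comes back with "<" already encoded as "%3C", while the existing "%25" escapes are left untouched. That fits the comment's point that a whole URL copied from the Omnibox is re-encoded to its normal form.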
Comment 1 by chrishtr@chromium.org, Apr 11 2018