Legacy website content is rendered with incorrect text encoding [due to missing override UI]
Reported by
j...@kodewerx.org,
Sep 23 2017
Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36

Example URL: http://number-none.com/product/Understanding%20Slerp,%20Then%20Not%20Using%20It/

Steps to reproduce the problem:
1. Visit the link
2. Notice quote characters are shown as black question-marks
3. User is unable to change the encoding to display the content correctly

What is the expected behavior?

What went wrong?
The embedded Content-Type is:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
The response header Content-Type is:
Content-Type: text/html; charset=UTF-8
This content is clearly invalid. But in Safari I can change the text encoding to "ISO Latin 1" to fix it. In Firefox, the text encoding "Central European (Windows)" works. Chrome removed the menu item allowing users to change the auto-detected encoding in #597488

Does it occur on multiple sites: N/A
Is it a problem with a plugin? No
Did this work before? Yes
Does this work in other browsers? No Safari 11.0 (12604.1.38.1.7) and Firefox 55.0.3

Chrome version: 60.0.3112.113 Channel: n/a
OS Version: OS X 10.12.6
Flash Version: Shockwave Flash 27.0 r0

This can be fixed by either improving the auto-detection (ignoring the encoding hints in headers and meta tags), or by allowing users to override the detected encoding.
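The black question marks are consistent with windows-1252 bytes being decoded as UTF-8. A minimal Python sketch of that effect follows. It assumes the page stores curly quotes as the windows-1252 bytes 0x93 and 0x94; the sample word is made up for illustration and is not taken from the page.

```python
# Curly quotes as a windows-1252 page would store them (sample text is hypothetical).
raw = "\u201cslerp\u201d".encode("windows-1252")  # b'\x93slerp\x94'

# Decoding per the HTTP header (UTF-8): 0x93 and 0x94 are not valid on their own
# in UTF-8, so each one becomes U+FFFD, the replacement character.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding per the meta tag (windows-1252) recovers the intended quotes.
as_cp1252 = raw.decode("windows-1252")

print(repr(as_utf8))    # '\ufffdslerp\ufffd'
print(repr(as_cp1252))  # '\u201cslerp\u201d'
```

U+FFFD is typically drawn as a question mark inside a black diamond, which matches the symptom in step 2.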
,
Sep 27 2017
,
Oct 4 2017
To reproduce this using the attached script:
0. Start the script listening on some port, e.g. 9999:
   perl encoding_test.pl 9999
1. Go to that page and observe broken characters.
   http://localhost:9999/
2. Save the page locally (by downloading directly, as opposed to using Chrome), then view the local copy of the page. Observe no broken characters.
   wget http://localhost:9999/ -O /tmp/a.html
   file:///tmp/a.html
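The attached Perl script is not reproduced in this thread. As a rough stand-in, here is a Python sketch of a server with the same mismatch: the body is encoded as windows-1252 and says so in its meta tag, while the HTTP header claims UTF-8. The page text, `MismatchHandler`, and `serve` are hypothetical names chosen for illustration, and the real script may differ.

```python
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page: the meta tag declares windows-1252, and the bytes are windows-1252.
HTML = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">'
    "</head><body><p>\u201cquoted\u201d</p></body></html>"
)
BODY = HTML.encode("windows-1252")


class MismatchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # The header contradicts the meta tag, as in the original report.
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)


def serve(port):
    """Serve the mismatched page on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), MismatchHandler).serve_forever()


# The server starts only when a port is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    serve(int(sys.argv[1]))
```

Run with a port argument (for example 9999) and follow steps 1 and 2 above. The saved copy is opened without any HTTP header, so only the meta tag is left to declare the encoding, which would explain why the local file displays correctly.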
,
Mar 26 2018
Extensions are available in the Chrome Store to add UI to change the encoding.
,
Mar 26 2018
Can you post a link to a recommended extension? I am looking for one that simply respects the windows-1252 encoding specified in the HTML pages, as opposed to having the server-side Content-Type override the page encoding.
,
Mar 26 2018
Those extensions work by modifying response headers using an onHeadersReceived listener. I found a bunch of examples on GitHub, FWIW. AFAIK, that means it's impossible to make a decision on which encoding to use based on the response body; the headers need to be finalized before the body can be processed. See https://developer.chrome.com/extensions/webRequest for more details.
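To illustrate the kind of header rewrite described here, the following is a Python model of the logic only. It is not extension code; a real extension would do this in JavaScript inside its webRequest listener. The function name `override_charset` and the list-of-pairs header shape are assumptions made for this sketch.

```python
def override_charset(headers, charset):
    """Return a copy of (name, value) response headers with the Content-Type charset replaced.

    Only the headers are available as input; the response body is not seen,
    so the charset has to be chosen without inspecting the page content.
    """
    result = []
    for name, value in headers:
        if name.lower() == "content-type":
            # Keep the media type and drop any existing parameters (simplification).
            media_type = value.split(";", 1)[0].strip()
            value = f"{media_type}; charset={charset}"
        result.append((name, value))
    return result


original = [("Content-Type", "text/html; charset=UTF-8"), ("Server", "example")]
print(override_charset(original, "windows-1252"))
```

Because the function never receives the body, it cannot read the page's meta tag; the charset would have to come from a user setting or a per-site rule.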
,
Nov 21
**Mass UI Triage** We were unable to reproduce this bug as per comment #0. If this bug still reproduces for you, please reopen or file a new issue. Thanks..!!
,
Nov 21
Apparently the page from comment #0 got fixed; see comment #3 for a reliable way to reproduce.
,
Nov 21
Still reproduces for me using my script from comment #3.
Comment 1 by ranjitkan@chromium.org, Sep 25 2017
Components: Blink>Fonts
Labels: -Type-Compat Needs-Milestone M-63 OS-Linux OS-Windows Type-Bug
Status: Untriaged (was: Unconfirmed)