Issue 638001

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2016
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 1
Type: Bug-Regression




Unlabelled US-ASCII pages with ~{ are treated as HZ-GB and turned into a single replacement character

Project Member Reported by remoun@google.com, Aug 15 2016

Issue description

UserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2816.0 Safari/537.36

Example URL:

Steps to reproduce the problem:
1. Load an HTML page (over http: or file:) with "~{", e.g. "~{{foo}}"

What is the expected behavior?
Renders the content. In the case of an Angular page, runs JS and applies the client-side templates.

What went wrong?
Renders only "�", regardless of the file's content (2 characters or 50 KB of Angular code). A screenshot and a minimal test case are attached.

Does it occur on multiple sites: N/A

Is it a problem with a plugin? No 

Did this work before? Yes Stable (M 52)

Does this work in other browsers? Yes 

Chrome version: 54.0.2816.0  Channel: beta
OS Version: 
Flash Version: Shockwave Flash 23.0 r0

I couldn't reproduce this with jsfiddle or with data URIs.

Attachments:
b.html (19 bytes)
PkhNzRc8kym.png (38.3 KB)
Cc: jchaffraix@chromium.org
Labels: -Pri-2 -Type-Compat M-54 Needs-Bisect Pri-1 Type-Bug-Regression
Per discussion with Remoun, this is a regression on M54 from M52 (thus changing priority).
Labels: allpublic

Comment 3 by remoun@google.com, Aug 15 2016

This seems to be a bug in Chrome's encoding auto-detection: with "Auto Detect" disabled under More tools > Encoding, the page loads correctly.
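
To illustrate why an auto-detector can be misled here: in HZ (RFC 1843), the two bytes "~{" switch the decoder into two-byte GB mode and "~}" switches back, so a pure 7-bit page containing "~{" is valid US-ASCII and also a plausible HZ stream. The sketch below only flags that ambiguity. The function names are hypothetical, and this is not how Chrome's detector works internally; it is a minimal check under those assumptions.

```cpp
#include <string>

// True if every byte has the high bit clear, i.e. the buffer is valid US-ASCII.
bool IsSevenBit(const std::string& bytes) {
  for (unsigned char c : bytes) {
    if (c > 0x7F) return false;
  }
  return true;
}

// True if the buffer contains "~{", the HZ escape that enters GB mode.
bool HasHzShiftIn(const std::string& bytes) {
  return bytes.find("~{") != std::string::npos;
}

// Hypothetical helper: the input is 7-bit clean but also contains the HZ
// shift-in sequence, so a heuristic detector could label it either way.
bool IsAmbiguousWithHz(const std::string& bytes) {
  return IsSevenBit(bytes) && HasHzShiftIn(bytes);
}
```

With the reported test string "~{{foo}}", IsAmbiguousWithHz returns true; an ordinary "{{foo}}" template does not trigger it.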
Cc: -jchaffraix@chromium.org
Labels: -Needs-Bisect ReleaseBlock-Stable hasbisect OS-Mac OS-Windows
Owner: jinsuk...@chromium.org
Status: Assigned (was: Unconfirmed)
Able to reproduce the issue on Mac 10.11.6, Windows 10 and Ubuntu 14.04 using the latest canary, 54.0.2830.0.

Bisect info:
============
Good: 54.0.2803.0
Bad : 54.0.2804.0/54.0.2805.0

CHANGELOG URL:
----------------
https://chromium.googlesource.com/chromium/src/+log/79f7b784a97cbb22f11064a05b621b0def87eab3..f0829bf6d80a9109b399580fe48d8c3e1c66eeed

Possible suspect: https://codereview.chromium.org/1894913002
jinsukkim@: Could you please check whether this is related to your change?

Added ReleaseBlock-Stable because this is a recent regression; please modify if not appropriate.
Cc: jchaffraix@chromium.org
Project Member

Comment 6 by sheriffbot@chromium.org, Aug 16 2016

Labels: Hotlist-Google
Status: Started (was: Assigned)
On it now.
Status: Fixed (was: Started)
The fix landed: https://crrev.com/2253803003
Project Member

Comment 9 by bugdroid1@chromium.org, Aug 17 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/92c0a07c47baac2ae7735338308ee2a5458c8b9c

commit 92c0a07c47baac2ae7735338308ee2a5458c8b9c
Author: jinsukkim <jinsukkim@chromium.org>
Date: Wed Aug 17 06:45:37 2016

Ignore 7-bit encodings

The only 7-bit encoding the WHATWG standard supports is ISO-2022-JP.
Ignore the other 7-bit encoding detection results and return
US-ASCII as the result.

BUG= 638001 

Review-Url: https://codereview.chromium.org/2253803003
Cr-Commit-Position: refs/heads/master@{#412463}

[modify] https://crrev.com/92c0a07c47baac2ae7735338308ee2a5458c8b9c/third_party/WebKit/Source/platform/blink_platform.gypi
[modify] https://crrev.com/92c0a07c47baac2ae7735338308ee2a5458c8b9c/third_party/WebKit/Source/platform/text/TextEncodingDetector.cpp
[add] https://crrev.com/92c0a07c47baac2ae7735338308ee2a5458c8b9c/third_party/WebKit/Source/platform/text/TextEncodingDetectorTest.cpp

Cc: -jchaffraix@chromium.org
For future reference: "~{" is a common prefix pattern in text encoded in HZ-GB-2312. CED, the new encoding detector, made its best guess and detected the encoding as such. However, HZ is not supported by the WHATWG Encoding Standard, and neither are other 7-bit encodings such as ISO-2022-KR/CN and UTF-7. With the fix, input text detected as any of those encodings is treated as US-ASCII.
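
The rule described above can be sketched roughly as follows. This is an illustration only: the function name and the string labels are assumptions made for the example, and the actual change in TextEncodingDetector.cpp is not shown in this thread and may be structured differently.

```cpp
#include <string>

// Hypothetical post-processing step applied to a detector's guess.
// Encoding names are plain strings here purely for illustration.
std::string NormalizeDetectedEncoding(const std::string& detected) {
  // ISO-2022-JP is the one 7-bit encoding the WHATWG standard supports,
  // so that detection result is kept as-is.
  if (detected == "ISO-2022-JP") return detected;

  // Other 7-bit encodings named in this thread are not supported;
  // fall back to US-ASCII instead of decoding with them.
  static const char* const kUnsupportedSevenBit[] = {
      "HZ-GB-2312", "ISO-2022-KR", "ISO-2022-CN", "UTF-7"};
  for (const char* name : kUnsupportedSevenBit) {
    if (detected == name) return "US-ASCII";
  }

  // Every other detection result passes through unchanged.
  return detected;
}
```

Under this sketch, a page guessed as "HZ-GB-2312" ends up decoded as US-ASCII, so "~{{foo}}" keeps its literal characters instead of collapsing into a replacement character.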
Labels: TE-Verified-M54 TE-Verified-54.0.2832.2
Thank you for the quick fix!
Verified the fix on Ubuntu 14.04, Mac 10.11.6 and Windows 10 using 54.0.2832.2; it is working fine now. Please find the screencast (Windows) attached.
638001_Aug_19.mp4 (201 KB)

Comment 12 by js...@chromium.org, Sep 16 2016

Labels: -OS-Linux -OS-Windows -OS-Mac OS-All
Summary: Unlabelled US-ASCII pages with ~{ are treated as HZ-GB and turned into a single replacement character (was: Rendering error with ~{)
https://github.com/whatwg/encoding/issues/68#issuecomment-239099089 discusses this case (not exactly the same, but essentially identical).

BTW, I filed bug 647582 on dropping ISO-2022-JP detection.
 
