Issue 689344

Issue metadata

Status: Duplicate
Merged: issue 679294
Owner: ----
Closed: Feb 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Garbled text on http://www.hozugawa.net/

Project Member Reported by tkent@chromium.org, Feb 7 2017

Issue description

Chrome Version: 58 canary
OS: All

What steps will reproduce the problem?
(1) Open http://www.hozugawa.net/winapi/struct_WIN32_FIND_DATA.html

What is the expected result?
a) You see Japanese text such as:
   [アルファベット順] [掲載順] [DLL順] [カテゴリ順] [WindowsAPI関数集]
   WIN32_FIND_DATA構造体
or
b) You see garbled text on load, but you can see Japanese text by choosing 'Shift JIS' from an encoding menu.


What happens instead?
You see garbled text, and Google Chrome has no encoding menu.

Please use labels and text to provide additional information.
Reported in https://twitter.com/alohakun/status/828822002798465024 .

The page has
  Content-Type: text/html; charset=UTF-8
in HTTP headers, and
  <META http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
in HTML header.

In this case, blink::TextResourceDecoder doesn't use TextEncodingDetector because of m_source==EncodingFromHTTPHeader.
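The effect of this precedence can be reproduced outside the browser. The following is a minimal sketch, not Blink's code: `pick_encoding` is a hypothetical helper that simply prefers the HTTP header charset over the META charset, and the sample bytes are an assumption standing in for the page body.

```python
# Bytes as a server might send them: Japanese text encoded in Shift_JIS.
body = "WIN32_FIND_DATA構造体".encode("shift_jis")

http_charset = "utf-8"       # from the Content-Type HTTP header
meta_charset = "shift_jis"   # from the <META http-equiv> tag in the HTML


def pick_encoding(http_charset, meta_charset, default="windows-1252"):
    """Hypothetical, simplified precedence: HTTP header, then META, then a default."""
    return http_charset or meta_charset or default


chosen = pick_encoding(http_charset, meta_charset)

# Decoding Shift_JIS bytes as UTF-8 yields replacement characters (garbled text).
garbled = body.decode(chosen, errors="replace")

# Decoding with the charset the META tag declares recovers the original text.
readable = body.decode(meta_charset)
```

Because the header wins, `chosen` is `utf-8` and the decoded text contains U+FFFD replacement characters, while decoding with the META charset gives the intended Japanese text.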

A possible fix would be to always use TextEncodingDetector for HTML-like resources.  CompactEncDet::DetectEncoding() has arguments for encoding-from-http-header and encoding-from-meta-charset.
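The idea of feeding both declared encodings to a detector as hints can be sketched roughly as follows. This is only an illustration under stated assumptions: `detect_with_hints` is a hypothetical function that accepts the first hinted encoding under which the bytes decode cleanly. It is a crude stand-in, not how CompactEncDet actually scores candidates.

```python
def detect_with_hints(body, http_hint=None, meta_hint=None, fallback="utf-8"):
    """Return the first hinted encoding that decodes `body` without errors.

    Hypothetical stand-in for a real detector: it only checks strict
    decodability and does no statistical analysis.
    """
    for candidate in (http_hint, meta_hint):
        if candidate is None:
            continue
        try:
            body.decode(candidate)  # strict decoding; raises on invalid bytes
        except (UnicodeDecodeError, LookupError):
            # Invalid bytes for this encoding, or an unknown encoding label.
            continue
        return candidate
    return fallback
```

For Shift_JIS bytes served with `charset=UTF-8` in the header and `Shift_JIS` in the META tag, this sketch rejects the header hint and falls through to the META hint. For bytes that are valid UTF-8 it keeps the header value.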


 

Comment 1 by tkent@chromium.org, Feb 7 2017

> A possible fix would be to always use TextEncodingDetector for HTML-like resources.  CompactEncDet::DetectEncoding() has arguments for encoding-from-http-header and encoding-from-meta-charset.

It might cause security issues.

I've seen sites like this before (a mismatch between the encoding declared in the HTTP header and the actual encoding). This can be worked around with the encoding extension mentioned in https://bugs.chromium.org/p/chromium/issues/detail?id=597488#c70, though Chrome for Android cannot benefit from it.

Maybe we should just go by the principle of https://en.wikipedia.org/wiki/Garbage_in,_garbage_out ?

Comment 3 by jsb...@chromium.org, Feb 15 2017

Is there some volume of feedback that would cause us to rethink issue 597488 (removing the encoding menu)?

Maybe we should just dupe these to that issue, or (better?) create a new meta issue to dupe these against.

> Is there some volume of feedback that would cause us to rethink issue 597488 (removing the encoding menu)?

That would be something to consider if there are enough issues that cannot be addressed by the extension. The only issue I'm aware of so far is Issue 679294. AFAIK the extension overrides the encoding by replacing the HTTP header. I guess that's why it won't work for local files.
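As an illustration of what replacing the HTTP header amounts to, here is a minimal sketch. `override_charset` is a hypothetical helper, not the extension's code. It also shows why the approach has nothing to act on for local files, which arrive without a Content-Type header.

```python
import re


def override_charset(content_type, charset):
    """Return a Content-Type value with its charset parameter forced to `charset`."""
    # Drop any existing charset parameter, then append the forced one.
    base = re.sub(r";\s*charset=[^;]*", "", content_type, flags=re.IGNORECASE).strip()
    return f"{base}; charset={charset}"


forced = override_charset("text/html; charset=UTF-8", "Shift_JIS")
```

With the header rewritten this way, the header-first precedence works in the user's favor, because the decoder then sees the encoding the user selected.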

tkent@: Could you give your thoughts on this? I wonder whether there would also be security implications if we always ran the encoding detector for local files only.


Comment 5 by tkent@chromium.org, Feb 15 2017

Mergedinto: 679294
Status: Duplicate (was: Untriaged)
I filed this because I thought it might be possible to fix this by passing more arguments to CED, like Issue 682978. However, my conclusion is that we should not ignore charset= in an HTTP header, and the extension can help in such cases.

I'm closing this.  Also, I guess the CL for Issue 682978 improved local file detection very much.
