
Issue 703006

Starred by 3 users

Issue metadata

Status: Duplicate
Merged: issue 704800
Owner: ----
Closed: Mar 2017
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Compat




No way to see UTF-8 file

Reported by jidanni@gmail.com, Mar 20 2017

Issue description

UserAgent: Mozilla/5.0 (X11; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0

Example URL:
file:///tmp/r

Steps to reproduce the problem:
1. Save the attachment to /tmp/r
2. Browse to file:///tmp/r

What is the expected behavior?
See the file in UTF-8 charset

What went wrong?
The charset was guessed wrong, and there is no way to choose another charset.

Does it occur on multiple sites: N/A

Is it a problem with a plugin? No 

Did this work before? Yes. At least there used to be ways to override it.

Does this work in other browsers? Yes

Chrome version: <Copy from: 'about:version'>  Channel: n/a
OS Version: 
Flash Version: Shockwave Flash 24.0 r0

Other browsers have ways to override incorrect charset guesses.
 

Comment 1 by jidanni@gmail.com, Mar 20 2017

Version 57.0.2987.98 built on Debian 9.0, running on Debian 9.0 (64-bit)

Comment 2 by jidanni@gmail.com, Mar 20 2017

LANG=en_US.UTF-8
LANGUAGE=en_US:en
Labels: Needs-Triage-M57
Mergedinto: 597488
Status: Duplicate (was: Unconfirmed)

Comment 5 by jleedev@gmail.com, Mar 20 2017

Is the extension supposed to handle file URLs as well?

Comment 6 by jidanni@gmail.com, Mar 20 2017

All I know is that most browsers have been able to handle UTF-8 since the 90s.

Also, we are working offline with file:/// URLs. We do not / cannot go online to download additional components.

Comment 7 by jidanni@gmail.com, Mar 20 2017

If the browser cannot support Unicode / UTF-8 out of the box, then it will lose its Unicode Certification, if any.
Can you upload the screenshot of what you're seeing (i.e. the result of wrong charset)? 

Attached file is what I got. Is this not the expected output?
utf8.png (15.4 KB)

Comment 9 by jidanni@gmail.com, Mar 21 2017

Yes you got Unicode. I got a "western" looking charset which I will upload here when I get home.
I have the same issue.
It also reproduces with Cyrillic and other characters.

Windows 10 Pro x86_64, Russian locale, Chrome 57.0.2987.110
chrome-wrong-encoding.png (51.1 KB)

Comment 11 by myfonj@gmail.com, Mar 23 2017

Same issue, garbled differently, supposedly due to locale; screenshot attached.

Windows 10 Pro x86_64, cs-CZ locale, Chrome 57.0.2987.110 (Official Build) (64-bit)

It seems that any random text editor I have installed has better charset detection than today's Google Chrome.
chrome-57-win10-cs-CZ-wrong-encoding.png (14.7 KB)
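
For readers wondering why the garbling differs between reports: the same UTF-8 bytes look different depending on which legacy charset is wrongly assumed. The sketch below only illustrates that effect. It is not Chrome's detection logic, and the helper name `decode_as`, the sample text, and the charset list are assumptions chosen for the example.

```python
# Illustration only: decode one UTF-8 byte sequence under several charsets.
# The sample string and the chosen code pages are hypothetical, not taken
# from the attachment in this issue.

SAMPLE_TEXT = "Привет"
SAMPLE_BYTES = SAMPLE_TEXT.encode("utf-8")

# Legacy Windows code pages commonly tied to Central European, Cyrillic,
# and Western locales respectively.
LEGACY_CHARSETS = ["cp1250", "cp1251", "cp1252"]


def decode_as(data: bytes, charset: str) -> str:
    """Decode bytes with the given charset, replacing undecodable bytes."""
    return data.decode(charset, errors="replace")


if __name__ == "__main__":
    print("utf-8  :", decode_as(SAMPLE_BYTES, "utf-8"))
    for name in LEGACY_CHARSETS:
        # Each wrong guess yields a different kind of mojibake.
        print(f"{name} :", decode_as(SAMPLE_BYTES, name))
```

Decoding as UTF-8 recovers the original text, while each legacy code page produces its own distinct garbage, which would be consistent with the differing screenshots above.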

Comment 12 by jleedev@gmail.com, Mar 23 2017

There's a similar discussion in Firefox from a couple years ago: https://bugzilla.mozilla.org/show_bug.cgi?id=1071816

Comment 13 by jidanni@gmail.com, Mar 25 2017

I get the same as https://bugs.chromium.org/p/chromium/issues/attachment?aid=276585 with pristine chromium,
LANG=en_US.UTF-8
USER=nobody
LANGUAGE=en_US:en

Comment 14 by jidanni@gmail.com, Mar 25 2017

https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use.
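
Since a UTF-8 file may or may not start with a BOM, a quick local check can show which case a given file falls into. This is a minimal sketch: the function name `inspect_utf8` is hypothetical, and a strict decode is only a reasonable test for UTF-8 validity, not how any browser detects encodings.

```python
import codecs
from typing import Tuple


def inspect_utf8(data: bytes) -> Tuple[bool, bool]:
    """Return (has_bom, is_valid_utf8) for a byte string.

    has_bom: the data starts with the UTF-8 byte order mark (EF BB BF).
    is_valid_utf8: the data decodes under strict UTF-8 rules.
    """
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode("utf-8", errors="strict")
        is_valid = True
    except UnicodeDecodeError:
        is_valid = False
    return has_bom, is_valid


if __name__ == "__main__":
    # /tmp/r is the path used in the steps to reproduce above.
    with open("/tmp/r", "rb") as fh:
        print(inspect_utf8(fh.read()))
```

A file that is valid UTF-8 without a BOM is still legitimate UTF-8 under the passage quoted above, so the absence of a BOM alone does not make the file something other than UTF-8.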
Mergedinto: -597488 704800
Sorry, I didn't get notified of the updates because I wasn't cc'ed. This is a bug, and a fix is in review. Please see Issue 704800.
