
Issue 770270

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




Charset of file-based LayoutTests defaults to UTF-8 while that of HTTP tests does not

Project Member Reported by robertma@chromium.org, Sep 29 2017

Issue description

By default, content_shell treats all file-based tests as UTF-8, but it does not default to UTF-8 when it runs HTTP tests. That is, when a URL is given to `content_shell --run-layout-tests`, the default charset is not UTF-8.

This also includes all WPT tests, because they are served by WPTServe and content_shell receives URLs instead of file paths.

Here is the likely piece of code that makes UTF-8 the default charset for file-based tests:
https://cs.chromium.org/chromium/src/content/shell/browser/shell.cc?l=232&rcl=84e6c2d657fc4b56d6736eaacae55342058729c1

I'm not sure what the right behavior is (according to the spec), but I think the inconsistency between file-based and HTTP tests is counter-intuitive and error-prone.
 
Summary: Charset of file-based LayoutTests default to UTF-8 while HTTP tests do not (was: Charset of HTTP/WPT LayoutTests does not default to UTF-8)
Cc: mkwst@chromium.org
Mike, is the `<meta http-equiv="set-cookie" ...>` deprecation logging using the correct charset, or should it be changed to avoid outputting raw UTF-8 to the console when Chrome is run in a non-UTF-8 locale?
> Here is the likely piece of code that makes UTF-8 the default charset for file-based tests:
> https://cs.chromium.org/chromium/src/content/shell/browser/shell.cc?l=232&rcl=84e6c2d657fc4b56d6736eaacae55342058729c1

Sorry, this is not relevant.

> Mike, is the `<meta http-equiv="set-cookie" ...>` deprecation logging using the correct charset, or should it be changed to avoid outputting raw UTF-8 to the console when Chrome is run in a non-UTF-8 locale?

I ran the test in a UTF-8 locale, but the error message was still garbled (identical to what you see on try bots). The bytes are not the expected UTF-8 encoding of the characters.

For example, the cookie character 🍪 (U+1F36A)
ought to be F0 9F 8D AA in UTF-8 (four bytes),
but it showed up as C3 B0 C2 9F C2 8D C2 AA (eight bytes).
(This is a symptom of the UTF-8 bytes being treated as Latin-1 and then encoded to UTF-8 again.)
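
The byte sequences above can be reproduced with a minimal Python sketch. It only illustrates the double-encoding symptom. It is not the code path inside Chrome, and the helper name `hex_bytes` is made up for this example.

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


cookie = "\U0001F36A"  # the cookie character, U+1F36A

# Correct UTF-8 encoding: four bytes.
correct = cookie.encode("utf-8")

# Misread the UTF-8 bytes as Latin-1, then encode the result to UTF-8 again.
double_encoded = correct.decode("latin-1").encode("utf-8")

# Reversing the two steps recovers the original bytes.
recovered = double_encoded.decode("utf-8").encode("latin-1")

print(hex_bytes(correct))         # F0 9F 8D AA
print(hex_bytes(double_encoded))  # C3 B0 C2 9F C2 8D C2 AA
print(recovered == correct)       # True
```

Each original byte at or above 0x80 becomes a two-byte sequence, which is why four bytes turn into eight.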

As reported by bsittler@ in the email, the charset of the page had been explicitly declared as UTF-8 when the garbled output was observed. FWIW, if I use JS Console APIs to print these characters (charset also set to UTF-8), they show up correctly in the result txt. Hence, at least the test runner and the run_webkit_tests scripts seem to handle the strings correctly. Some double encoding seems to happen internally where the error is generated. If that's the case, the encoding issue in the `<meta http-equiv="set-cookie" ...>` test might be independent of this bug (which is more about the inconsistency between file-based and HTTP tests).
robertma@, do you have any sense of how many tests are affected by this? Is it enough of a problem to be P2? (Leaving untriaged for now.)
foolip@: I think the major concern is this: if someone starts with a file-based (which implies non-WPT) layout test and uses non-ASCII UTF-8 characters in the test without a <meta charset> tag, the test still works. Yet when the test is later ported to an HTTP test, including WPT, the encoding breaks.

I did a quick grep and it seems we are all good at the moment. The only files with non-ASCII chars and no charset declaration have the special chars in <link author> only, which should not be a problem.
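
A check along these lines could look like the following Python sketch. It is not the grep that was actually run. The function name `missing_charset_declaration` and the regular expression are assumptions for illustration. It also ignores charsets declared in HTTP headers and does not special-case files whose only non-ASCII characters are in <link author>.

```python
import re

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="...; charset=..."> form.
META_CHARSET = re.compile(rb"<meta[^>]*charset\s*=", re.IGNORECASE)
UTF8_BOM = b"\xef\xbb\xbf"


def missing_charset_declaration(data: bytes) -> bool:
    """Return True if the file has non-ASCII bytes but declares no charset.

    Hypothetical helper: a UTF-8 BOM or a <meta> tag with a charset
    counts as a declaration.
    """
    has_non_ascii = any(byte > 0x7F for byte in data)
    declares_charset = (
        data.startswith(UTF8_BOM) or META_CHARSET.search(data) is not None
    )
    return has_non_ascii and not declares_charset


# Non-ASCII content without a declaration is flagged.
print(missing_charset_declaration("<p>\U0001F36A</p>".encode("utf-8")))
# The same content with <meta charset> is not.
print(missing_charset_declaration(
    '<meta charset="utf-8"><p>\U0001F36A</p>'.encode("utf-8")))
```

To scan a tree, the function could be applied to the bytes of each test file, for example via `pathlib.Path.read_bytes()`.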

Not a big concern. Diligent code review can spot the missing <meta charset>, and hopefully the error will also surface during the porting process. P3 is appropriate.
Status: Available (was: Untriaged)
Great. If this does come up as a common pain point, I guess we could do a lint of some sort, which might be quicker than a real fix.
Project Member

Comment 7 by sheriffbot@chromium.org, Oct 5

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: nednguyen@chromium.org
Status: Available (was: Untriaged)
nednguyen@, here's an example of a small but annoying difference between LayoutTests and WPT.
