TextCodecUTF8 and TextCodecUTF16 register nonstandard labels
Issue description

The Encoding Standard specifies what labels should be supported: https://encoding.spec.whatwg.org/#names-and-labels

We have extras for a handful of codecs:

TextCodecUTF16::RegisterEncodingNames:
* csunicode
* ucs-2
* unicode
* iso-10646-ucs-2
* unicodefeff
* unicodefffe

TextCodecUTF8::RegisterEncodingNames:
* unicode11utf8
* unicode20utf8
* x-unicode20utf8

These are web-exposed, e.g. navigate to:

data:text/html;charset=unicode11utf8,<script>document.write(document.characterSet)</script>

Expected: windows-1252
Actual: UTF-8

Or run this on the console:

new TextDecoder('unicode11utf8').encoding

Expected: throws
Actual: 'utf-8'

We should remove or standardize these additional labels.
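The console check above can be repeated for every extra label in one pass. The following is a minimal sketch, not part of the original report: `probeLabel` and `probeAll` are hypothetical helper names, and the only API assumed is the standard `TextDecoder` constructor, which the Encoding Standard specifies as throwing a RangeError for an unknown label.

```javascript
// Hypothetical helper: returns the canonical encoding name the engine maps
// `label` to, or null if the TextDecoder constructor rejects the label.
// Any constructor failure is treated as "rejected" in this sketch.
function probeLabel(label) {
  try {
    return new TextDecoder(label).encoding;
  } catch (e) {
    return null;
  }
}

// The extra labels listed in the issue description.
const extraLabels = [
  'csunicode', 'ucs-2', 'unicode', 'iso-10646-ucs-2',
  'unicodefeff', 'unicodefffe',
  'unicode11utf8', 'unicode20utf8', 'x-unicode20utf8',
];

// Hypothetical helper: probe a list of labels and collect the results.
function probeAll(labels) {
  return labels.map((label) => ({ label, encoding: probeLabel(label) }));
}

for (const { label, encoding } of probeAll(extraLabels)) {
  console.log(label, '->', encoding === null ? 'rejected' : encoding);
}
```

What each label reports depends on the engine and on the revision of the Encoding Standard it implements, so the output is deliberately not shown here.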
,
Jul 27 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c998cf3857d76d0773b330adb2d9098453c53050

commit c998cf3857d76d0773b330adb2d9098453c53050
Author: Joshua Bell <jsbell@chromium.org>
Date: Thu Jul 27 23:40:53 2017

Text Encodings: Add test comparing supported vs. specified encodings

Adds a new window.internals API to get the list of supported encoding labels, since that is not web-exposed. A test compares that against the set of encoding labels from the Encoding Standard [1] using a resource file from web-platform-tests.

We support all of the standardized labels, but have some extras:
* Deviations from the standard for GBK/GB18030 (crbug.com/339862)
* Extra UTF-8 aliases from TextCodecUTF8 (crbug.com/747562)
* Extra UTF-16 aliases from TextCodecUTF16 (crbug.com/747562)
* '-html' suffix aliases for standard encodings (crbug.com/747558)

[1] https://encoding.spec.whatwg.org/

Bug: 339862, 747558, 747562
Change-Id: I165b6c2aed2595cb9a87bd148a322b488087ff85
Reviewed-on: https://chromium-review.googlesource.com/581936
Commit-Queue: Joshua Bell <jsbell@chromium.org>
Reviewed-by: Jungshik Shin <jshin@chromium.org>
Reviewed-by: Kent Tamura <tkent@chromium.org>
Cr-Commit-Position: refs/heads/master@{#490119}

[add] https://crrev.com/c998cf3857d76d0773b330adb2d9098453c53050/third_party/WebKit/LayoutTests/fast/encoding/supported-encodings-expected.txt
[add] https://crrev.com/c998cf3857d76d0773b330adb2d9098453c53050/third_party/WebKit/LayoutTests/fast/encoding/supported-encodings.html
[modify] https://crrev.com/c998cf3857d76d0773b330adb2d9098453c53050/third_party/WebKit/Source/core/testing/Internals.cpp
[modify] https://crrev.com/c998cf3857d76d0773b330adb2d9098453c53050/third_party/WebKit/Source/core/testing/Internals.h
[modify] https://crrev.com/c998cf3857d76d0773b330adb2d9098453c53050/third_party/WebKit/Source/core/testing/Internals.idl
[modify] https://crrev.com/c998cf3857d76d0773b330adb2d9098453c53050/third_party/WebKit/Source/platform/wtf/text/TextEncodingRegistry.cpp
[modify] https://crrev.com/c998cf3857d76d0773b330adb2d9098453c53050/third_party/WebKit/Source/platform/wtf/text/TextEncodingRegistry.h
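The comparison that commit describes (supported labels vs. specified labels) comes down to a set difference in both directions. The sketch below only illustrates that idea and is not the code from the actual layout test: `diffLabels` is a hypothetical name, and the two input arrays are toy data standing in for the internals API result and the web-platform-tests resource file.

```javascript
// Hypothetical helper: given the labels an engine supports and the labels a
// standard specifies, report which are extra and which are missing.
function diffLabels(supported, specified) {
  const sup = new Set(supported);
  const spec = new Set(specified);
  return {
    extra: [...sup].filter((label) => !spec.has(label)).sort(),
    missing: [...spec].filter((label) => !sup.has(label)).sort(),
  };
}

// Toy data only (assumption): a few UTF-8 labels, with one of the extras
// named in this bug added on the "supported" side.
const specifiedSample = ['unicode-1-1-utf-8', 'utf-8', 'utf8'];
const supportedSample = ['unicode-1-1-utf-8', 'unicode11utf8', 'utf-8', 'utf8'];

const report = diffLabels(supportedSample, specifiedSample);
console.log('extra:', report.extra);     // extra: [ 'unicode11utf8' ]
console.log('missing:', report.missing); // missing: []
```

An empty `missing` list with a non-empty `extra` list matches the situation the commit message describes: all standardized labels supported, plus some extras.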
,
Aug 25 2017
Note that WebKit still has the UTF-8 and UTF-16 aliases:
https://github.com/WebKit/webkit/blob/5277f6fb92b0c03958265d24a7692142f7bdeaf8/Source/WebCore/platform/text/TextCodecUTF8.cpp
https://github.com/WebKit/webkit/blob/5277f6fb92b0c03958265d24a7692142f7bdeaf8/Source/WebCore/platform/text/TextCodecUTF16.cpp

("Perhaps we can prove some are not used on the web and remove them." is noted with the UTF-8 ones.)

I didn't have Edge handy, but IE11 does not support the UTF-8 aliases (the UTF-16 aliases are harder to test; you also can't use the data: trick with IE/Edge).
,
Aug 29 2017
If IE 11 does not support any of the UTF-8 aliases, I'd not worry about Edge (the chance of Edge supporting them is almost zero).

As for the UTF-16 aliases, the share (and even the number) of UTF-16 pages is extremely small to start with. Multiplying that by the chance of them using the 'interesting' labels listed here, we'd be talking about a negligible (if non-zero) number of web pages. We can get the stats, but I don't think it's worth our time.

Anyway, whether we keep them or not is not terribly important. Either way is fine (standardizing them, or removing them to be compliant with the current standard).
,
Aug 30
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Sep 6
,
Sep 7
,
Dec 13
Comment 1 by js...@chromium.org
, Jul 25 2017