Support for ISO-2022-JP encoding
Reported by
addisoni...@gmail.com,
Jul 7 2016
Issue description

UserAgent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0

Steps to reproduce the problem:
The W3C I18N WG, in concert with WHATWG, is testing Encoding specification support. For the encoding ISO-2022-JP, our tests produce 903 errors in Chrome. In addition, two of the decode tests (out of 8 total) failed. See: https://github.com/whatwg/encoding/issues/60

What is the expected behavior?
No errors detected

What went wrong?
See encoding bug above. Please respond to that issue in whatwg's github.

Did this work before? N/A

Chrome version: <Copy from: 'about:version'>
Channel: n/a
OS Version:
Flash Version: Shockwave Flash 22.0 r0

Note: this may be an error in the specification or an error in the tests, in which case we'd very much like to know!

[Filed on behalf of W3C I18N WG]
,
Jul 25 2016
,
Sep 16 2016
form/href-encoding-misc: Out of 93, 30 characters (Cf, default ignorable) share the same cause as the same failure in Shift_JIS, EUC-KR, etc. (see bug 647568). The rest seem to be half-width Katakana. I remember raising an issue about this somewhere (and taking action), but I couldn't find it.

form/href-encoding: 373 characters, mostly CJK Ideographs. Chromium treats them as not covered by ISO-2022-JP. Need to investigate. ISO-2022-JP in ICU (used by Chrome) shares its table with Shift_JIS (and Shift_JIS in Chromium passes the tests).
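For anyone who wants to see what "not covered" means in practice, here is a minimal sketch of the numeric character reference fallback that form and href encoding apply to unencodable characters. It uses Python's built-in `iso2022_jp` codec, which is neither ICU nor the WHATWG encoder, so its coverage will not match Chrome's or the spec's; the helper names are made up for illustration.

```python
def is_encodable(ch: str, codec: str = "iso2022_jp") -> bool:
    """Return True if this codec can represent the character directly."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def encode_with_ncr_fallback(text: str, codec: str = "iso2022_jp") -> bytes:
    """Encode text, replacing unencodable characters with decimal NCRs.

    This only approximates what a browser does on form submission; the
    real behavior is defined by the Encoding and HTML specifications.
    """
    return text.encode(codec, errors="xmlcharrefreplace")


if __name__ == "__main__":
    # Hiragana A, half-width Katakana A, broken bar, an emoji.
    for ch in ["\u3042", "\uff71", "\u00a6", "\U0001F600"]:
        print(f"U+{ord(ch):04X}", is_encodable(ch), encode_with_ncr_fallback(ch))
```

Running the same probe against each codec under test (with that codec's own table) is roughly what the form/href-encoding tests do character by character.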
,
Sep 18 2017
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. If you change it back, also remove the "Hotlist-Recharge-Cold" label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Sep 18 2017
,
Sep 19 2017
Recent results show that there's still a problem here. See https://www.w3.org/International/tests/repo/results/encoding-dbl-byte.en#iso2022jp
,
Nov 20 2017
Perhaps unsurprisingly, some of these failing cases are characters with Unicode canonical or compatibility mappings to other characters. Perhaps there's an added layer of normalization in one of the codecs that is missing in the other?

e.g. 礼 (which is suffering numeric character reference replacement) is canonically equivalent to 礼 (which works fine, but is encoded separately in ISO-2022-JP)

another example: ¦ (which is likewise suffering numeric character reference replacement) has a compatibility decomposition to ¦ (which in this case is likewise suffering numeric character reference replacement)
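The pairs above render identically here, so a short sketch with explicit escapes may help. It assumes the characters meant are U+FA18 (CJK compatibility ideograph) versus U+793C, and U+FFE4 (fullwidth broken bar) versus U+00A6; those code points are my reading of the examples, not something stated in the comment. The helper name is hypothetical.

```python
import unicodedata


def normalization_forms(ch: str) -> dict:
    """Return the NFC and NFKC forms of a single character as code point lists."""
    return {
        form: [f"U+{ord(c):04X}" for c in unicodedata.normalize(form, ch)]
        for form in ("NFC", "NFKC")
    }


if __name__ == "__main__":
    # Assumed code points for the two examples in the comment above.
    for ch in ["\ufa18", "\uffe4"]:
        print(f"U+{ord(ch):04X}", unicodedata.name(ch), normalization_forms(ch))
```

If a codec normalized its input before the table lookup, U+FA18 would be looked up as U+793C, which is one way the two implementations could disagree on these characters.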
,
Nov 6
Fixed now?
Comment 1 by ajha@chromium.org
, Jul 12 2016