Apply IDNA ToASCII even when the input is ASCII
Issue description

If ToASCII is not applied to all input (including ASCII-only input), rules are not uniformly enforced. For example, Unicode labels can be at most 63 code points after conversion to Punycode, whereas ASCII labels have no limit.

Making it uniform likely requires removing some rules for non-ASCII input, as the web depends on being able to place hyphens in the 3rd and 4th positions of a label, and likely also depends on leading and trailing hyphens.

An update to Unicode's UTS #46 likely makes more of this configurable: http://www.unicode.org/reports/tr46/tr46-18.html. https://url.spec.whatwg.org/#idna already requires UseSTD3ASCIIRules and VerifyDnsLength to be set to false. I propose that the URL Standard also set CheckHyphens (needed for compatibility) and CheckJoiners (it seems silly to restrict a subset of emojis) to false, and continue to require applying ToASCII ("domain to ASCII", as the URL Standard calls it) to all input.

Tests: https://github.com/w3c/web-platform-tests/pull/5976.
,
May 23 2017
I don't concretely understand what needs to be done here. Would it be best to wait for the new tests to land?
,
May 23 2017
I think it would be best to review the tests and give feedback with respect to your thoughts. The main issue is that currently Chrome (and other browsers) have an "ASCII fast path" of sorts that leads to all kinds of inconsistencies.
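To make the "ASCII fast path" inconsistency concrete, here is a minimal sketch. It is a toy model, not Chrome's code and not a full UTS #46 implementation: the function names are hypothetical, no UTS #46 mapping or validation is performed, and only the 63-character label limit is modeled.

```python
MAX_LABEL_LENGTH = 63  # per-label limit checked when VerifyDnsLength is on


def label_to_ascii(label):
    """Punycode-encode one label if it contains non-ASCII characters.

    Simplification (assumption): no UTS #46 mapping or validation happens here.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def check_lengths(labels):
    """Raise ValueError if any converted label exceeds the limit."""
    for label in labels:
        if len(label) > MAX_LABEL_LENGTH:
            raise ValueError("label too long: %d characters" % len(label))


def to_ascii_with_fast_path(host):
    """Hypothetical model of the inconsistent behavior: ASCII skips all checks."""
    if host.isascii():
        return host  # fast path: the length check is never reached
    labels = [label_to_ascii(label) for label in host.split(".")]
    check_lengths(labels)
    return ".".join(labels)


def to_ascii_uniform(host, verify_dns_length=False):
    """Hypothetical model of the proposal: one code path for all input."""
    labels = [label_to_ascii(label) for label in host.split(".")]
    if verify_dns_length:
        check_lengths(labels)
    return ".".join(labels)
```

With the fast path, a 64-character ASCII label is accepted while a comparable non-ASCII label is rejected. With the uniform path, both are treated the same way for any given setting of verify_dns_length.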
,
Jun 1 2017
VerifyDnsLength is enforced inside ICU:
- UTS46::nameToASCII() sets UIDNA_ERROR_DOMAIN_NAME_TOO_LONG
- UTS46::process() sets UIDNA_ERROR_LABEL_TOO_LONG

It looks like we cannot turn it off. Given that "domain to ASCII" in https://url.spec.whatwg.org/#idna sets it to false, we need to change it or implement it ourselves. For all-ASCII data, DoSimpleHost() is used, which doesn't have the limit. Empty labels are disallowed for the same reason.
---
uidna_openUTS46() is not called with UIDNA_USE_STD3_RULES.
---
Turning off leading/trailing hyphen validation is not yet done on the spec side? https://github.com/jsdom/whatwg-url/pull/90
---
Not sure why setting .host to 'xn--a' is not rejected.
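One conceivable way to get the effect of VerifyDnsLength=false when the library always reports length errors is to ignore those error bits after conversion. This is only a sketch of that idea and not something the thread confirms was adopted. The flag names mirror the ICU error names mentioned above, but the numeric values are placeholders generated for this example, and is_fatal is a hypothetical helper.

```python
from enum import IntFlag, auto


class IdnaError(IntFlag):
    # Names mirror ICU's UIDNA_ERROR_* constants; the values are placeholders
    # for this sketch and are NOT taken from the ICU headers.
    EMPTY_LABEL = auto()
    LABEL_TOO_LONG = auto()
    DOMAIN_NAME_TOO_LONG = auto()
    DISALLOWED = auto()


# Errors that correspond to VerifyDnsLength checks (assumption for this sketch).
LENGTH_ERRORS = (
    IdnaError.EMPTY_LABEL
    | IdnaError.LABEL_TOO_LONG
    | IdnaError.DOMAIN_NAME_TOO_LONG
)


def is_fatal(errors, verify_dns_length=False):
    """Return True if any error remains after dropping the ignored bits."""
    if not verify_dns_length:
        errors = errors & ~LENGTH_ERRORS
    return bool(errors)
```

A caller would run the conversion, collect the reported error bits, and reject the host only when is_fatal returns True. Masking keeps every other validation error intact, which is why it is sketched here instead of skipping the conversion entirely.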
,
Jun 1 2017
,
Jun 25 2017
> Turning off leading/trailing hyphen validation is not yet done on the spec side?

That change has been made and deployed. The URL you point to is the repository of a JavaScript implementation of the standard.
,
Jun 29 2017
,
Feb 8 2018
,
Sep 10
,
Sep 10
Comment 1 by phistuck@chromium.org, May 18 2017
Status: Untriaged (was: Unconfirmed)