
Issue 769547

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , iOS , Chrome , Mac , Fuchsia
Pri: 3
Type: Bug




IDN sub-domain of top-domains is displayed as punycode

Project Member Reported by js...@chromium.org, Sep 27 2017

Issue description

How to reproduce:

Type each of these two domains in the omnibox and try to navigate to it.

   한.google.com   : '한' is displayed as punycode.
   한.notgoodle.com  : '한' is displayed as Unicode (regular character)


This inconsistency arises because we display a hostname in punycode whenever its eTLD+1 portion matches one of the top domains by 'similarity skeleton'. To save space, we store only the skeletons of the top 10k domains, not their original names. As a result, a top domain such as google.com matches its own skeleton, and any IDN label in a subdomain under it is displayed as punycode.
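
The behavior described above can be sketched roughly as follows. This is only an illustration, not the actual Chromium code: `skeleton`, `etld_plus_one`, `TOP_DOMAIN_SKELETONS`, and `should_show_punycode` are hypothetical names, the confusable table is a tiny toy stand-in for a real Unicode confusables mapping, and eTLD+1 is approximated as the last two labels instead of using the public suffix list.

```python
# Toy confusable table (assumption): Cyrillic 'о' and 'е' map to Latin 'o' and 'e'.
# A real implementation would use a full Unicode confusables mapping.
CONFUSABLES = str.maketrans({"\u043e": "o", "\u0435": "e"})


def skeleton(domain: str) -> str:
    """Return a simplified 'similarity skeleton' of a domain name."""
    return domain.translate(CONFUSABLES)


def etld_plus_one(host: str) -> str:
    """Approximate eTLD+1 as the last two labels (assumption: no public suffix list)."""
    return ".".join(host.split(".")[-2:])


# Only skeletons are stored, not the original top-domain names.
TOP_DOMAIN_SKELETONS = {skeleton("google.com")}


def should_show_punycode(host: str) -> bool:
    """Punycode is used whenever the eTLD+1 skeleton matches a top-domain skeleton."""
    return skeleton(etld_plus_one(host)) in TOP_DOMAIN_SKELETONS


# The intended case: a lookalike of google.com (Cyrillic 'о') is caught.
lookalike = should_show_punycode("g\u043e\u043egle.com")
# The unintended case: google.com matches its own skeleton, so an IDN
# subdomain of the genuine domain is also forced to punycode.
genuine_subdomain = should_show_punycode("한.google.com")
# A domain that is not similar to any top domain is left as Unicode.
unrelated = should_show_punycode("한.notgoodle.com")
```

With only skeletons stored, the check cannot distinguish the genuine google.com from a lookalike, which is why both of the first two cases come out the same.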

There's a (somewhat hackish) way to avoid this, assuming that all top domains are non-IDN (which is the current assumption anyway).




 
Cc: mea...@chromium.org
Cc: -mea...@chromium.org js...@chromium.org
Owner: mea...@chromium.org
We now store the domain that matches a given skeleton, which makes it possible to fix this: if the domain we are checking has the same eTLD+1 as the top domain we found, we can fall back to allowing Unicode.
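
A minimal sketch of this fix, under the same kind of simplifying assumptions as any toy model of the check: `TOP_DOMAINS_BY_SKELETON`, `skeleton`, `etld_plus_one`, and `should_show_punycode` are hypothetical names and not the actual Chromium identifiers, the confusable table is a small stand-in, and eTLD+1 is approximated as the last two labels.

```python
from typing import Optional

# Toy confusable table (assumption): Cyrillic 'о' and 'е' map to Latin 'o' and 'e'.
CONFUSABLES = str.maketrans({"\u043e": "o", "\u0435": "e"})


def skeleton(domain: str) -> str:
    """Return a simplified 'similarity skeleton' of a domain name."""
    return domain.translate(CONFUSABLES)


def etld_plus_one(host: str) -> str:
    """Approximate eTLD+1 as the last two labels (assumption: no public suffix list)."""
    return ".".join(host.split(".")[-2:])


# Each skeleton now maps back to the top domain it was derived from.
TOP_DOMAINS_BY_SKELETON = {skeleton(d): d for d in ("google.com",)}


def should_show_punycode(host: str) -> bool:
    """Use punycode only when the host resembles a top domain without being it."""
    registrable = etld_plus_one(host)
    matched: Optional[str] = TOP_DOMAINS_BY_SKELETON.get(skeleton(registrable))
    if matched is None:
        return False  # not similar to any top domain
    # Same eTLD+1 as the matched top domain: this is the real site, allow Unicode.
    return registrable != matched


genuine_subdomain = should_show_punycode("한.google.com")
lookalike_subdomain = should_show_punycode("한.g\u043e\u043egle.com")
unrelated = should_show_punycode("한.notgoodle.com")
```

In this sketch, the genuine subdomain is allowed to stay in Unicode, while a lookalike whose eTLD+1 only shares the skeleton with google.com is still shown as punycode.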
