New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 650760 link

Starred by 3 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Android , Windows , iOS , Chrome , Mac , Fuchsia
Pri: 3
Type: Bug

Blocked on:
issue 351639

Blocking:
issue 630667


Participants' hotlists:
Url-Audit


Sign in to add a comment

FormatUrlForSecurityDisplay should show Unicode hostnames when regarded as safe per IDN display policy

Project Member Reported by js...@chromium.org, Sep 27 2016

Issue description

FormatUrlForSecurityDisplay / FOrmatOriginForSecurityDisplay should show Unicode when it's regarded as 'safe' per our IDN display policy. 

elide_url.h even has a comment to that effect (and I thought I had done that back in April [1]). 

// Internationalized domain names (IDN) may be presented in Unicode if
// they're regarded safe. See |url_formatter::FormatUrl| for more details on
// the algorithm).

A change would be as simple as calling IDNToUnicode over 'host' in those two functions above. (similar to what's done at https://codereview.chromium.org/1861213002 . see also  bug 593177  )



[1] https://codereview.chromium.org/1841653003 
 

Comment 1 by js...@chromium.org, Sep 27 2016

Blocking: 630667

Comment 2 by js...@chromium.org, Sep 27 2016

Cc: -ellyjo...@chromium.org rsleevi@chromium.org

Comment 3 by js...@chromium.org, Sep 27 2016

Cc: -rsleevi@chromium.org ellyjo...@chromium.org
Ick. I didn't mean to drop ellyjones@ (we really need to fix 'Cc' box UI in monorail :-)). 

Comment 4 by js...@chromium.org, Sep 27 2016

Cc: rsleevi@chromium.org

Comment 5 by js...@chromium.org, Sep 27 2016

https://goo.gl/5YgzVQ : I went through callers of FormatUrlForSec.. and didn't find anything that would need punycode. 

https://goo.gl/2ByqYg : ditto for callers of FormatOriginForSec... 

They're all like a case handled in  bug 593177 . 

The only potential source of concern is BiDi urls. 

Comment 6 by js...@chromium.org, Sep 27 2016

> The only potential source of concern is BiDi urls.

One possibility is to enclose the whole url with Bidi formatting characters (specifically LRI and PDI) to for that url to be always in LTR direction regardless of the surrounding text and overall text direction (LTR or RTL). 

Comment 7 by js...@chromium.org, Sep 27 2016

https://codereview.chromium.org/2375803002 is a CL, but it does not add LRI and PDI. I need to think a bit more about them. 

http://unicode.org/reports/tr9/#Directional_Formatting_Characters

Comment 8 by js...@chromium.org, Sep 28 2016

Well, enclosing the whole url with LRI and PDI does not help with "http://ABC.DEF.com" (logical order. Uppercase for RTL and lowercase for LTR) that will be visually rendered with "http://FED.CBA.com". I want "http://CBA.FED.com". 

It'd work if individual domain label (ABC, DEF) are enclosed with {FSI, PDI}. 

How about query and path?  
Owner: js...@chromium.org
Status: Assigned (was: Untriaged)
Going to mark this as assigned to jshin@ since they're working on it :)

Comment 10 by js...@chromium.org, Sep 28 2016

re: "http://FED.CBA.com" vs http://CBA.FED.com

there's a disagreement on that. Currently, Chrome's omnibox (desktop Chrome) always display URLs in LTR (regardless of the UI language) and I think it'll have "http://FED.CBA.com" (visual) for 'http://ABC.DEF.com' (logical). 

See also http://www.macchiato.com/bidi-url

So, just enclosing the entire url with {LRI, PDI} may better match the omnibox (which is a good thing (TM)) although it can be confusing depending on how things are looked at. 

Anyway, I've updated the CL to punt BiDi URL issues by continuing to show punycode if there's any RTL characters in the domain name for Format*ForSecurityDisplay. 

The FED.CBA.com vs CBA.FED.com is something I've thought a lot about. Right now, the spec(s) are quite clear on the "correct" behaviour:

https://tools.ietf.org/html/rfc3987#section-4.1
> Bidirectional IRIs MUST be
> rendered in the same way as they would be if they were in a
> left-to-right embedding; i.e., as if they were preceded by U+202A,
> LEFT-TO-RIGHT EMBEDDING (LRE), and followed by U+202C, POP
> DIRECTIONAL FORMATTING (PDF)."

https://url.spec.whatwg.org/#url-rendering
> For the purposes of bidirectional text it should be rendered as if
> it were in a left-to-right embedding.

So I have gone through and updated all the Omnibox implementations to match the specs, so they should all render consistently now. I'm not satisfied that this behaviour is good, and I want to change it, but I think we should remain consistent for the time being.

My thoughts on this:
https://docs.google.com/presentation/d/1kZZ7ZqNCkGOU4y296izNiCvf8CivStxl7aYHvFNHZ-c/edit

> It'd work if individual domain label (ABC, DEF) are enclosed with {FSI, PDI}.

That's more or less my proposal.

Can you explain where Format*ForSecurityDisplay is used in Chrome? I could go either way on either a) keeping the punycode, or b) just having it display the same way as the Omnibox. My main concern is if it gets displayed somewhere in a security context with no black highlight to separate the domain from the path. It's sensible/conservative to keep the punycode for now.
Blockedon: 351639
btw, the existing bug for dealing with RTL URL rendering is Issue 351639.
From briefly perusing codesearch, it looks like the output of FormatUrlForSecurityDisplay is used in a lot of places where there is no black/grey highlighting (e.g., lots of confirmation dialogs). It can also display URLs without a scheme (which enables a lot more shenanigans as we've seen in the omnibox), so I agree it's probably safe for now to not allow RTL characters in there.
Thanks a lot for the collection of links/bugs as well docs with a lot of analysis and thoughts. I've been to your doc/proposal and bug 351639. [1] 

I was too lazy to look up and quote RFC 3987 as well as your docs.  BTW, RFC 3987 was written before the addition of 'bidi isolate' formatting characters to the Unicode and needs to be revised because using {LRI,PDI} (or equivalent tags in html)  would be better in virtually all cases than {LRE,PDF} ( http://unicode.org/reports/tr9/ ) as long as they're supported. 

> > It'd work if individual domain label (ABC, DEF) are enclosed with {FSI, PDI}.

> That's more or less my proposal.

Yeah. That seems "nice and clean" to "my eyes" :-). I wish Chrome could unilaterally makes a change. Obviously, it cannot. It's hard to know how users are "trained" to read Bidi urls (whether and how much they're already accustomed to the way in RFC 3987). Some UX studies and coordinated efforts by browser vendors and other parties are necessary.

[1] A digression: Omnibox being used for both URL display and query input presents a bit of challenge because the current omnibox behavior in desktop Chrome, when used as a 'search text box', is a bit unexpected for RTL users in that it's different from that of a text input box at many places including google.com as you are aware.) 
Project Member

Comment 15 by bugdroid1@chromium.org, Oct 6 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/78809c4d824d07d465be71140c7951f3af84f023

commit 78809c4d824d07d465be71140c7951f3af84f023
Author: jshin <jshin@chromium.org>
Date: Thu Oct 06 20:15:45 2016

Use Unicode in Format{Origin,URL}forSecurityDisplay

Use Unicode instead of punycode for 'host' part of URL
in Format{Origin,URL}ForSecurityDisplay. This would only change
the way IDN(internationalized domain name) is displayed.

To be very conservative, continue to use punycode if there's any RTL
characters in the domain name. See the bug for more details on this issue.

Besides, change IDNToUnicode() to accept StringPiece instead of
|const string&|.

BUG=650760
TEST=components_unittests --gtest_filter=Format*Security*

Review-Url: https://codereview.chromium.org/2375803002
Cr-Commit-Position: refs/heads/master@{#423646}

[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/BUILD.gn
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/elide_url.cc
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/elide_url.h
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/elide_url_unittest.cc
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/url_formatter.cc
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/url_formatter.h

Project Member

Comment 16 by bugdroid1@chromium.org, Oct 27 2016

Labels: merge-merged-2840
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/78809c4d824d07d465be71140c7951f3af84f023

commit 78809c4d824d07d465be71140c7951f3af84f023
Author: jshin <jshin@chromium.org>
Date: Thu Oct 06 20:15:45 2016

Use Unicode in Format{Origin,URL}forSecurityDisplay

Use Unicode instead of punycode for 'host' part of URL
in Format{Origin,URL}ForSecurityDisplay. This would only change
the way IDN(internationalized domain name) is displayed.

To be very conservative, continue to use punycode if there's any RTL
characters in the domain name. See the bug for more details on this issue.

Besides, change IDNToUnicode() to accept StringPiece instead of
|const string&|.

BUG=650760
TEST=components_unittests --gtest_filter=Format*Security*

Review-Url: https://codereview.chromium.org/2375803002
Cr-Commit-Position: refs/heads/master@{#423646}

[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/BUILD.gn
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/elide_url.cc
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/elide_url.h
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/elide_url_unittest.cc
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/url_formatter.cc
[modify] https://crrev.com/78809c4d824d07d465be71140c7951f3af84f023/components/url_formatter/url_formatter.h

Comment 17 by dimu@google.com, Nov 4 2016

Labels: -merge-merged-2840
[Automated comment] removing mislabelled merge-merged-2840
Cc: mgiuca@chromium.org
+mgiuca
Cc: lgar...@chromium.org est...@chromium.org
Dom has reminded me about this. It's a little concerning, given my above statement in #13:

> I agree it's probably safe for now to not allow RTL characters in there

It isn't great from a security standpoint that we show punycoded domains in some contexts which show in Unicode in other contexts.

For example, one context FormatUrlForSecurityDisplay is used is in permission bubbles when a site asks for a permission. e.g.: "FormatUrlForSecurityDisplay(origin) wants to use geolocation". If the site has an RTL domain label, we will show it in punycode in the permission bubble, while it's shown properly in the Omnibox. That means the user sees a totally meaningless origin in the permission bubble and is unable to know that it's the same origin as the page they are on.

So it would be good to enable Unicode display for RTL, except due to Issue 351639, RTL URLs are very confusing to parse. But since we render RTL URLs in the Omnibox (the primary security surface), then surely it is OK to also show it in other security surfaces? Security people should weigh in. +estark +lgarron
Cc: -palmer@chromium.org
Labels: -OS-All OS-Android OS-Chrome OS-Fuchsia OS-iOS OS-Linux OS-Mac OS-Windows
Cc: js...@chromium.org cthomp@chromium.org rorymcclelland@chromium.org
Owner: ----
Status: Available (was: Assigned)
+rorymclelland@ from the permissions team since this affects permissions dialogs. Also, since we have shipped Permission Delegation this may affect how we think about permission dialogs :-). (Handling RTL IDN in FormatUrlForSecurityDisplay affects other UI besides permission prompts, but it is a common/important one.)

We (Enamel) are planning to revisit this to see if we can better align with the omnibox display. RTL IDN punycode is cropping up more as we audit URL display surfaces.

Sign in to add a comment