Starred by 2 users
Status: Fixed
Closed: Jun 9
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug-Security

Near homograph URL Spoofing with Arabic
Reported by, Jun 6 (does not show in punycode)

What went wrong?
By adding this *ِ* (notice the mark under the asterisk) we can actually spoof the URL (especially for inexperienced users).
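To make the report concrete, here is a minimal sketch of what the Punycode fallback for such a label would look like. The label `example` and the helper `to_ace` are hypothetical (the real spoofed domain is not given in this thread), and the helper skips the IDNA mapping and validation steps that a browser performs.

```python
def to_ace(label: str) -> str:
    """Return the ASCII-compatible (Punycode) form of one DNS label.

    Hypothetical helper: no IDNA mapping or validation is applied.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


KASRA = "\u0650"  # the Arabic mark used in the report

# Hypothetical label: a Latin name with the mark appended to the last letter.
spoof_label = "example" + KASRA

print(to_ace("example"))    # plain ASCII labels are returned unchanged
print(to_ace(spoof_label))  # starts with "xn--", which is what the user should see
```

The bug is that the address bar kept showing the Unicode form instead of the `xn--` form.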

More info:

Components: UI>Security>UrlFormatting Blink>JavaScript>Internationalization
Summary: Near homograph URL Spoofing with Arabic (was: URL Spoofing )
This one looks reasonably compelling. It's not clear that .COM's registration policy would allow this domain to be registered though.
Components: -Blink>JavaScript>Internationalization
not js related
Try: for the registration.
Components: UI>Internationalization
Labels: Pri-1
Status: Untriaged
Indeed, this spoofing domain is live. 
Status: Assigned
jshin@, could we blacklist the character that produces the tick?
This could be a special case of  Issue 726950  (mixing different scripts).

However, because U+0650 is in the "Mark, Nonspacing" category, it raises the question why punctuation marks like this aren't being universally blocked from appearing in URLs.
Re #6: Maybe I'm missing something, but this isn't punctuation, is it?

Nonspacing mark: A combining character with the General Category of Nonspacing
Mark (Mn) or Enclosing Mark (Me).
• The position of a nonspacing mark in presentation depends on its base character. It generally does not consume space along the visual baseline in and of itself.
• Such characters may be large enough to affect the placement of their base character relative to preceding and succeeding base characters. For example, a circumflex applied to an “i” may affect spacing (“î”), as might the character U+20DD combining enclosing circle.
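The distinction in the quoted definition can be checked against the Unicode Character Database. This is a small sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect the properties discussed above for one character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),   # e.g. Mn, Me, Po
        "bidi": unicodedata.bidirectional(ch),  # e.g. NSM, L, ON
    }


for ch in ("\u0650", "\u0302", "\u20DD", "!"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

U+0650 reports category `Mn` (Mark, Nonspacing), whereas an actual punctuation character such as `!` reports `Po`.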

Perhaps this is more similar to Issue 727092 ?
elawrence@ is right. This is not punctuation.

Yes, it's similar to issue 727092, but fixing that one wouldn't block this one. 

U+0650 has ScriptExtension=Arabic and Syriac even though Script is Inherited.
bug 727092 is about ScriptExtension={Common,Inherited}. 

So, if we disallow mixing of Latin with any script other than {CJK, Common, Inherited} based on the ScriptExtension property, this one would be blocked. That is bug 726950. Currently, we only block mixing of Latin with Greek/Cyrillic (and a few more) based on ScriptExtension values.

This can be registered in Verisign-controlled domains because Verisign's script mixing rule does not use ScriptExtension, only the Script property, and it allows any character with Script=Inherited or Script=Common to be mixed with any other script. Firefox has the same issue because it does the same as Verisign.
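The difference between the two properties can be sketched as follows. Python's standard library does not expose Script or ScriptExtensions, so the tiny `PROPS` table is hand-copied from the values stated in this thread, and every function name is hypothetical. This is an illustration of the two rules, not Verisign's or Chrome's actual code.

```python
from typing import Dict, FrozenSet, Tuple

# Hand-copied property data (assumption: values as stated in this thread).
PROPS: Dict[str, Tuple[str, FrozenSet[str]]] = {
    "\u0650": ("Inherited", frozenset({"Arabic", "Syriac"})),
    "\u0E3A": ("Thai", frozenset({"Thai"})),
}
NEUTRAL = frozenset({"Common", "Inherited"})


def props(ch: str) -> Tuple[str, FrozenSet[str]]:
    """Return (Script, ScriptExtensions) for the few characters we model."""
    if ch in PROPS:
        return PROPS[ch]
    if ch.isascii() and ch.isalpha():
        return ("Latin", frozenset({"Latin"}))
    return ("Common", frozenset({"Common"}))


def mixes_by_script(label: str) -> bool:
    """Script-only rule: Common/Inherited may join any script."""
    scripts = {props(ch)[0] for ch in label} - NEUTRAL
    return len(scripts) > 1


def mixes_by_script_extensions(label: str) -> bool:
    """ScriptExtensions rule: all characters must share at least one script."""
    sets = [props(ch)[1] for ch in label if not props(ch)[1] <= NEUTRAL]
    return bool(sets) and not frozenset.intersection(*sets)


label = "abc\u0650"
print(mixes_by_script(label))             # False: the mark is only "Inherited"
print(mixes_by_script_extensions(label))  # True: Latin vs. {Arabic, Syriac}
```

Under the Script-only rule the label looks like pure Latin, which is why it can be registered; the ScriptExtensions rule sees the Arabic/Syriac association.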

This one is not blocked by the BiDi check either, because its Bidi class is NSM:

  // 5. In an LTR label, only characters with the BIDI properties L, EN,
  // ES, CS, ET, ON, BN and NSM are allowed.
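A minimal sketch of why that rule lets the label through, assuming only the single rule quoted above is applied (the full BiDi check has more rules). The function name `ltr_label_passes_rule5` is hypothetical.

```python
import unicodedata

# Bidi classes permitted in an LTR label by the rule quoted above.
LTR_ALLOWED = {"L", "EN", "ES", "CS", "ET", "ON", "BN", "NSM"}


def ltr_label_passes_rule5(label: str) -> bool:
    """Check only the quoted rule: every character has an allowed Bidi class."""
    return all(unicodedata.bidirectional(ch) in LTR_ALLOWED for ch in label)


print(ltr_label_passes_rule5("abc\u0650"))  # Latin + Arabic kasra (NSM)
print(ltr_label_passes_rule5("abc\u0628"))  # Latin + Arabic letter beh (AL)
```

The kasra passes because NSM is explicitly allowed, while an Arabic letter (class AL) in the same position is rejected.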

I'm more tempted to switch over to 'strictly restrictive' rules ( bug 726950 ).
An alternative is to just block RTL scripts (Hebrew, Arabic) from mixing with Latin.  (Syriac/Adlam are  disallowed anyway). 

Problematic Arabic NSMs that would slip through various filters:
[:Bidi_Class=Nonspacing_Mark:] &  [:Identifier_Statusβ=Allowed:] &   [:ScriptExtensionsβ=Arabic|Syriac:] 

Arabic — Tashkil from ISO 8859-6 items: 8

Arabic — Combining maddah and hamza items: 3

Arabic — Tashkil items: 1


 [:Bidi_Class=Nonspacing_Mark:] &  [:Identifier_Statusβ=Allowed:] &   [:ScriptExtensionsβ=Hebrew:] 


Labels: OS-All
Verisign's Latin script policy does allow U+0650 and the others in the above list, except for U+05B4 because its script is Hebrew.

A new similarity check in M60 (diacritic-free + confusability skeleton check) is likely to catch this case against top domains, though. Hmm... it does not.

abcฺ.com with the Thai character Phinthu (U+0E3A) after 'c': this cannot be registered at the .com TLD (both Script and ScriptExtension of U+0E3A are Thai), but we allow it (because we allow mixing of Latin and scripts other than Greek/Cyrillic). The risk is pretty low due to Verisign and Thai ccTLD policy.

Nonetheless, a case has been building up for switching back to 'strictly restrictive' script mixing from 'moderately restrictive' ( bug 726950 ) *unless* we can come up with a clever way to detect 'base + combining mark' sequences where 'base' and 'combining mark' come from two unrelated scripts (e.g. a Latin base letter + Thai/Arabic combining mark). Even better would be to come up with a way to detect 'base + combining mark' sequences that are NOT used in ANY language. That way, even Latin + U+03xx would be blocked if it's not used in any language at all.
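One rough way to approximate "not used in any language" is sketched below. It assumes that a Latin base + mark pair with no precomposed form under NFC is suspicious. This is only a proxy invented for illustration (some real orthographies use sequences that have no precomposed character), and the function name is hypothetical.

```python
import unicodedata
from typing import List, Tuple


def suspicious_latin_sequences(label: str) -> List[Tuple[str, str]]:
    """Return (base, mark) pairs that survive NFC on a Latin base.

    Heuristic only: if NFC cannot compose the pair, no precomposed
    character exists for it.
    """
    text = unicodedata.normalize("NFC", label)
    pairs = []
    for prev, ch in zip(text, text[1:]):
        is_mark = unicodedata.category(ch) == "Mn"
        latin_base = unicodedata.name(prev, "").startswith("LATIN ")
        if is_mark and latin_base:
            pairs.append((prev, ch))
    return pairs


print(suspicious_latin_sequences("cafe\u0301"))  # 'e' + acute composes: nothing left
print(suspicious_latin_sequences("abc\u0650"))   # 'c' + kasra does not compose
```

A check like this would also flag Latin + U+03xx pairs that have no precomposed form, so it would need an allowlist before being usable.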

Labels: Security_Severity-Low Security_Impact-Stable
The following set is better than what's given in comment 9 (the result is the same, but it better matches my intention).

[:Bidi_Class=Nonspacing_Mark:] &  [:Identifier_Statusβ=Allowed:] &   [:ScriptExtensionsβ=/Arabic/:] 


As for comment 12:  : an example with Thai  : 12 Thai NSM's allowed to mix with Latin by Chrome 
 [:gC=Nonspacing_Mark:] &  [:Identifier_Statusβ=Allowed:] &   [:ScriptExtensionsβ=/Thai/:] : 0 code points - Thai NSM's allowed to mix with Latin by Verisign's rules

 [[:gC=Nonspacing_Mark:] &  [:Identifier_Statusβ=Allowed:] &   [:ScriptExtensionsβ=/Thai/:]] - [:sc=Thai:] 

And, there are a lot of S/SE Asian scripts with NSMs allowed to mix with Latin by Chrome (but not by Verisign). 

> A new similarity check in M60 (diacritic-free + confusability skeleton check) is likely to catch this case against top domains, though. Hmm... it does not.

The reason it does not is that I skip 'dropping NSM' step (transliteration step) for cases in this bug to speed things up. 

  // If input has any characters outside Latin-Greek-Cyrillic and [0-9._-],
  // there is no point in getting rid of diacritics because combining marks
  // attached to non-LGC characters are already blocked.
  if (lgc_letters_n_ascii_.span(ustr_host, 0, USET_SPAN_CONTAINED) ==
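The effect of that shortcut can be sketched like this. The code is not Chromium's: the membership test approximates "Latin-Greek-Cyrillic plus [0-9._-]" by character name, the U+0300..U+036F range is an added assumption, and all function names are hypothetical.

```python
import unicodedata

ASCII_EXTRA = set("0123456789._-")


def in_lgc_set(ch: str) -> bool:
    """Approximate the Latin/Greek/Cyrillic + [0-9._-] set by name."""
    if ch in ASCII_EXTRA or "\u0300" <= ch <= "\u036f":
        return True
    return unicodedata.name(ch, "").startswith(("LATIN ", "GREEK ", "CYRILLIC "))


def strip_marks(text: str) -> str:
    """Decompose, then drop nonspacing marks (the 'diacritic-free' step)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def skeleton_input(host: str) -> str:
    """Strip marks only when the whole host is inside the LGC set."""
    if all(in_lgc_set(ch) for ch in host):
        return strip_marks(host)
    return host  # shortcut taken: marks are left in place


print(skeleton_input("caf\u00e9.com"))      # cafe.com
print(skeleton_input("abc\u0650.com") == "abc\u0650.com")  # True: kasra survives
```

Because U+0650 is outside the set, the mark is never dropped, so the skeleton comparison against top domains sees a different string.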

Status: Started
The CL for this is a narrow-range change that addresses this issue alone.

It'd have been better if comments 12, 14, 15 had been posted to  bug 726950  . 

Project Member Comment 18 by, Jun 9
The following revision refers to this bug:

commit 536f72f4eeb63af895ee489c7244ccf2437cd157
Author: Jungshik Shin <>
Date: Fri Jun 09 04:59:19 2017

Disallow Arabic/Hebrew NSMs to come after an unrelated base char.

Arabic NSM(non-spacing mark)s and Hebrew NSMs are allowed to mix with
Latin with the current 'moderately restrictive script mixing policy'.
They're not blocked by BiDi check either because both LTR and RTL labels
can have an NSM.

Block them from coming after an unrelated script (e.g. Latin + Arabic

Bug:  chromium:729979 
Test: components_unittests --gtest_filter=*IDNToUni*
Change-Id: I5b93fbcf76d17121bf1baaa480ef3624424b3317
Reviewed-by: Peter Kasting <>
Commit-Queue: Jungshik Shin <>
Cr-Commit-Position: refs/heads/master@{#478205}
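The idea in the commit message can be sketched as below. This is not the landed patch: it uses the character name prefix as a crude stand-in for the script property (so a Syriac base followed by U+0650 would be flagged incorrectly), and the function names are hypothetical.

```python
import unicodedata
from typing import Optional

RTL_MARK_SCRIPTS = ("ARABIC", "HEBREW")


def script_hint(ch: str) -> Optional[str]:
    """Crude script guess from the character name (assumption, not UCD data)."""
    name = unicodedata.name(ch, "")
    for script in RTL_MARK_SCRIPTS:
        if name.startswith(script + " "):
            return script
    return None


def has_stray_rtl_mark(label: str) -> bool:
    """True if an Arabic/Hebrew NSM follows a character of another script."""
    prev = None
    for ch in label:
        if unicodedata.category(ch) == "Mn":
            script = script_hint(ch)
            if script is not None and (prev is None or script_hint(prev) != script):
                return True
        prev = ch
    return False


print(has_stray_rtl_mark("abc\u0650"))     # True: Latin 'c' + Arabic kasra
print(has_stray_rtl_mark("\u0628\u0650"))  # False: Arabic beh + Arabic kasra
```

A label flagged this way would be shown in Punycode rather than Unicode.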

Project Member Comment 19 by, Jun 9
Labels: -Pri-1 Pri-2
Labels: M-60
Status: Fixed
I think this has to be merged to M-60. Will request a merge to 3112 after a few days of baking in canary (and dev if released).

Any bounty for this?
Project Member Comment 22 by, Jun 10
Labels: -Restrict-View-SecurityTeam Restrict-View-SecurityNotify
Labels: reward-topanel
Typically, issues at Low severity are not awarded. However, I think this issue falls right on the boundary of Low/Medium (the spoof isn't perfect, but it isn't limited to Arabic), so I'll leave it for the panel to consider.
Labels: Merge-Request-60
Requesting a merge to the M60 branch. It's a simple/safe patch.
Project Member Comment 25 by, Jun 14
Labels: -Merge-Request-60 Hotlist-Merge-Review Merge-Review-60
This bug requires manual review: M60 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), josafat@(ChromeOS), bustamante@(Desktop)

For more details visit - Your friendly Sheriffbot
Labels: -Merge-Review-60 Merge-Approved-60
security bug, with a simple and safe fix. Approving merge for M60
Project Member Comment 27 by, Jun 20
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit - Your friendly Sheriffbot
Labels: -Merge-Approved-60 Disable-Nags Merge-Merged
Hmm... it was merged to M60 yesterday (3112 branch), but somehow it's not recorded here by bugdroid.

Labels: -Merge-Merged merge-merged-3112
Labels: -reward-topanel reward-unpaid reward-1000
Congratulations rayyanh12@! The VRP panel decided to award $1,000 for this bug! Thanks for the report.
Labels: -reward-unpaid reward-inprocess
Labels: Release-0-M60
Labels: CVE-2017-5105
Project Member Comment 35 by, Sep 16
Labels: -Restrict-View-SecurityNotify allpublic
This bug has been closed for more than 14 weeks. Removing security view restrictions.

For more details visit - Your friendly Sheriffbot