Consider holding back Georgian uppercasing in text-transform:uppercase

Project Member Reported by js...@chromium.org, Jul 19

Issue description

Unicode 11 added Georgian uppercase letters. Georgian script was previously treated as unicameral, but with the newly encoded uppercase letters, the case-conversion rule was also revised so that the existing Georgian letters (now lowercase) uppercase to the new Georgian uppercase letters.

The trouble is that virtually nobody has a font covering the new uppercase letters.

Chrome 69.x ships ICU 62.1, which includes the new case-conversion rule for Georgian script.

As a result, some web pages show tofu (empty rectangles) when text-transform: uppercase is applied to Georgian text.
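To make the failure mode concrete, the revised mapping can be sketched as a fixed code-point offset. This is only an illustration, not ICU's implementation, and the function names are hypothetical. It assumes the Unicode 11 ranges: Mkhedruli U+10D0..U+10FA and U+10FD..U+10FF map to Mtavruli U+1C90..U+1CBA and U+1CBD..U+1CBF.

```python
# Sketch of the Unicode 11 Georgian uppercase mapping (not ICU's actual code).
MKHEDRULI_TO_MTAVRULI = 0x1C90 - 0x10D0  # constant offset between the two ranges


def has_mtavruli_capital(ch: str) -> bool:
    """True for Mkhedruli letters that gained an uppercase form in Unicode 11."""
    cp = ord(ch)
    return 0x10D0 <= cp <= 0x10FA or 0x10FD <= cp <= 0x10FF


def georgian_upper(text: str) -> str:
    """Uppercase only the Georgian letters; leave every other character as-is."""
    return "".join(
        chr(ord(ch) + MKHEDRULI_TO_MTAVRULI) if has_mtavruli_capital(ch) else ch
        for ch in text
    )


if __name__ == "__main__":
    word = "\u10e5\u10d0\u10e0\u10d7\u10e3\u10da\u10d8"  # a sample Georgian word
    # Every resulting code point lies in U+1C90..U+1CBF, which fonts without
    # Mtavruli coverage render as tofu.
    print([f"U+{ord(c):04X}" for c in georgian_upper(word)])
```

A font that covers only U+10D0..U+10FF has no glyph for any of the resulting code points, which is why the uppercased text turns into empty rectangles.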

I hope we don't have to hold back the new mapping, but in the worst case we have to consider it as a stopgap measure.

See also https://github.com/googlei18n/noto-fonts/issues/1261

For Chrome OS, we can quickly update Georgian fonts (Noto * Georgian) with the cmap hack described in the GitHub issue above.

We can also work with other OS vendors to do the same until they add real glyphs for Georgian uppercase letters.

Web font providers can also be contacted to update their Georgian fonts, and webmasters would have to be advised to use web fonts.

All of these may still fall short. This issue is filed to think about what Chrome can do in the worst case. 


See the two screenshots attached below with this issue. 

References:


http://www.unicode.org/versions/Unicode11.0.0/
... adds "Georgian Mtavruli capital letters, newly added to support modern casing practices"

(Some more information: https://en.wikipedia.org/wiki/Georgian_scripts)

 
Screen Shot 2018-07-19 at 5.18.41 AM.png
360 KB
Screen Shot 2018-07-19 at 5.04.16 AM.png
496 KB
Cc: drott@chromium.org kojii@chromium.org
Components: Blink>Fonts
A more drastic alternative (suggested offline) is to map the new Georgian uppercase letters to the corresponding existing/old Georgian (lowercase) letters when rendering.

text-transform: upper / titlecase is the most important case, and web page authors are unlikely to use the new Georgian uppercase letters directly given that virtually no font on major platforms supports them.

However, just mapping uppercase letters to lowercase letters right before drawing seems easier than opting Georgian letters out of CSS text-transform: upper/title case.
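The "map back right before drawing" idea could look roughly like the following. This is a minimal sketch, not Blink code; `fold_mtavruli` and the table name are hypothetical, and the table assumes the Unicode 11 Mtavruli ranges U+1C90..U+1CBA and U+1CBD..U+1CBF.

```python
# Hypothetical pre-draw fold: Mtavruli (uppercase) back to Mkhedruli (lowercase).
OFFSET = 0x1C90 - 0x10D0

# Translation table keyed by code point, covering the 46 Mtavruli letters.
MTAVRULI_TO_MKHEDRULI = {
    cp: cp - OFFSET
    for cp in list(range(0x1C90, 0x1CBB)) + list(range(0x1CBD, 0x1CC0))
}


def fold_mtavruli(text: str) -> str:
    """Replace each Mtavruli letter with its Mkhedruli counterpart."""
    return text.translate(MTAVRULI_TO_MKHEDRULI)


if __name__ == "__main__":
    folded = fold_mtavruli("\u1c90\u1c91 ABC")
    print([f"U+{ord(c):04X}" for c in folded])
```

Because the fold is applied to whatever text is about to be drawn, it would also affect pages that use the new uppercase letters on purpose, which is what makes this alternative "more drastic".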

Yet another way (suggested offline) is to patch Chrome's copy of ICU to hold back the newly added mapping (Georgian lower => upper) until the font situation is cleared up. It's easy to do, but V8's spec compliance would be sacrificed ;-). I'm reluctant to do that because we'd be changing underlying casing rules to work around a 'surface' rendering issue.


Cc: behdad@chromium.org
Another sneaky idea is to change HarfBuzz to map Georgian uppercase to lowercase.
It'll be simple and will handle both Blink and native UI (on some platforms? or all platforms?).


https://twitter.com/razmadzekoko/status/1018775401668251648 : YouTube has this issue; it was discovered by a user.


Cc: e...@chromium.org
Labels: -Type-Bug Type-Task
Status: Available (was: Untriaged)
Thank you for the info, this is very interesting.

Among your options, I mildly prefer patching ICU, because other options add some run-time cost, though we can limit it to only when 'text-transform' is applied.

If it adds too much maintenance burden compared to other options, I'm fine with other options too. I guess we probably want to do it while building the layout tree rather than during paint.

Have you contacted other vendors to learn what they would do?
It's a bit of a chicken and egg problem. When would we disable the hack? I think the best way would be again to work on / provide universal fallback to Noto in Chrome, and have real Georgian uppercase letters in Noto.

Universal fallback is only really an option on android and chromeos, on other operating systems the fonts are outside our control.
I think HarfBuzz is another way, and it can 'fix' multiple products in a single stroke (not just Chrome but also Firefox and Android).

> Universal fallback is only really an option on android and chromeos, on other operating systems the fonts are outside our control.

Yes. Actually, a font cmap workaround only works well for Chrome OS (we have 100% control over it).

For Android, updating fonts is not that easy unless we can pretend that it's a security issue and bundle the font update as a part of security fix. 


From discussion so far, it sounds like the best would be to detect that no font has glyphs for Georgian capital letters, and (only) in that case map to the regular (now lowercase) letters and render those. And it sounds like HarfBuzz is too low-level (at least as used in Firefox), so this would want to happen in a layer above HarfBuzz.
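A font-aware fallback of this kind might be sketched as below. Everything here is an assumption for illustration: `covered` stands in for the set of code points that some available font can render (a real implementation would query the font fallback machinery), and `text_to_render` is a hypothetical name.

```python
from typing import AbstractSet

OFFSET = 0x1C90 - 0x10D0  # distance between Mkhedruli and Mtavruli code points


def is_mtavruli(ch: str) -> bool:
    """True for the Georgian capital letters added in Unicode 11."""
    cp = ord(ch)
    return 0x1C90 <= cp <= 0x1CBA or 0x1CBD <= cp <= 0x1CBF


def text_to_render(text: str, covered: AbstractSet[int]) -> str:
    """Keep a capital letter only if a font covers it; otherwise fall back
    to the corresponding regular (now lowercase) letter."""
    return "".join(
        chr(ord(ch) - OFFSET) if is_mtavruli(ch) and ord(ch) not in covered else ch
        for ch in text
    )


if __name__ == "__main__":
    old_font = set(range(0x10D0, 0x1100))             # Mkhedruli only
    new_font = old_font | set(range(0x1C90, 0x1CC0))  # also covers Mtavruli
    sample = "\u1c90\u1c91"
    print(text_to_render(sample, old_font) == "\u10d0\u10d1")
    print(text_to_render(sample, new_font) == sample)
```

The advantage over an unconditional fold is that the workaround switches itself off as soon as a font with the new glyphs is installed, which also answers the "when would we disable the hack?" question.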
I see that this issue does not have an owner. Does this mean that Chrome 69 CSS uppercasing will show tofus for Georgian, and we just push the problem to every web site and web app?
Jungshik or I can be the owner, if our decision is to have custom code for this in Blink. For now it would be going out in 69 with CSS uppercasing leading to undesirable results, yes. I don't have a lot of time for putting this on my list at the moment.

Sorry I was away for a few weeks. 

Given that in almost all cases the newly encoded Georgian uppercase letters show up as a result of CSS text-transform, I believe it'd cover virtually all 'real' cases if the CSS uppercasing code in Blink is hacked to leave Georgian letters alone (NOT change them).

I have to refresh my memory as to what Blink used to do for Greek uppercasing before Greek uppercasing was fixed in ICU. Something similar (except that it has to be for any locale) might work for this case. 
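The "leave Georgian alone" variant of CSS uppercasing could be sketched as follows. This is an illustration rather than the Blink code: `css_uppercase` is a hypothetical name, and the per-character `str.upper()` call is a simplification that ignores locale- and context-sensitive casing rules.

```python
def is_georgian_mkhedruli(ch: str) -> bool:
    """True for code points in the Georgian block range U+10D0..U+10FF."""
    return 0x10D0 <= ord(ch) <= 0x10FF


def css_uppercase(text: str) -> str:
    """Uppercase text for 'text-transform: uppercase', skipping Georgian."""
    out = []
    for ch in text:
        if is_georgian_mkhedruli(ch):
            out.append(ch)           # held back: stays as a pre-Unicode 11 letter
        else:
            out.append(ch.upper())   # simplified stand-in for full case mapping
    return "".join(out)


if __name__ == "__main__":
    mixed = "chrome \u10d0\u10d1"
    print([f"U+{ord(c):04X}" for c in css_uppercase(mixed)])
```

Because the check is on the script rather than on the content language, it applies for any locale, unlike a locale-specific special case.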
Please note that the problem has surfaced on m69 beta. It was initially reported by a user, and I verified it.
https://productforums.google.com/forum/#!topic/chromebook-central/U_-2bkuXOnM



Labels: Hotlist-ConOps-CrOS
Cc: gov...@chromium.org js...@chromium.org yosin@chromium.org pbomm...@chromium.org
 Issue 880568  has been merged into this issue.
Status: Started (was: Available)
I'll make a quick temporary fix. 
Labels: Hotlist-ConOps
Labels: ReleaseBlock-Stable M-69 Target-70 M-70 Target-69
Tagging as M69 "RBS" for tracking, pls remove if needed. 
Labels: -Type-Task Type-Bug-Regression
Labels: -Pri-3 -ReleaseBlock-Stable ReleaseBlock-Beta Pri-1
Will not block M69 stable. 

Will make a CL and test it in ToT and M70 before merging back to M69. 

https://chromium-review.googlesource.com/1205712 is a WiP. It should work, but maybe it's not the best way to do it. A more localized (to CSS) CL would be better.

Changed the CL to be specific to CSS text-transform instead of touching wtf/*string*. It has been sent to reviewers.


FYI, Firefox implemented a similar change. 
Hey all,

Just following up from the consumer side to confirm we are seeing a spike in reports about this on M69 stable.

Reports Link:
https://listnr.corp.google.com/product/237/reports?searchText=Georgian&filter=0&dateRange=30&versions=69.0.3497.81

Screenshot of report samples: https://screenshot.googleplex.com/8dTimt83U8H.png

Sounds like a fix is in the works with a plan to merge - please consider this for any M69 respins.

Thanks!
Labels: ReleaseBlock-Stable
Adding "RBS" for M69 just in case we decide to take this merge in.
Owner: js...@chromium.org
Project Member

Comment 28 by bugdroid1@chromium.org, Sep 6

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e81a2ee6dac6082a69cf7e2fc2c9071b279369cd

commit e81a2ee6dac6082a69cf7e2fc2c9071b279369cd
Author: Jungshik Shin <jshin@chromium.org>
Date: Thu Sep 06 16:46:57 2018

Disable new Georgian capital letters in CSS transform

Unicode 11 added a new set of Georgian capital letters and changed
the case-conversion rule to map lowercase (pre-Unicode 11) Georgian
letters to the new Georgian capital letters. [1]

However, virtually nobody has a font to cover the new characters.
Because these are new characters, they're very unlikely to show up
unless 'text-transform: uppercase' is applied to generate contents.

This CL maps new Georgian capital letters back to the corresponding
pre-Unicode 11 lowercase letters for CSS text-transform: uppercase until
new Georgian fonts are more widely available. CrOS will get a couple
of Georgian fonts supporting new characters pretty soon.

FYI, Firefox added a similar work around:
   https://bugzilla.mozilla.org/show_bug.cgi?id=1476304

Another FYI: There's no worry about title-casing because Georgian
title-casing is not supposed to capitalize the first letter of a word
and ICU's title-casing API leaves alone Georgian lowercase letters.

[1] http://unicode.org/versions/Unicode11.0.0/ : see section 'Casing
issues'.

Bug:  865427 
Test: fast/css/case-transform-georgian-capital.html
Change-Id: If957df8407cf08f6752499522518a90d3e55251e
Reviewed-on: https://chromium-review.googlesource.com/1205712
Reviewed-by: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Eric Willigers <ericwilligers@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/heads/master@{#589193}
[add] https://crrev.com/e81a2ee6dac6082a69cf7e2fc2c9071b279369cd/third_party/WebKit/LayoutTests/fast/css/case-transform-georgian-capital-expected.html
[add] https://crrev.com/e81a2ee6dac6082a69cf7e2fc2c9071b279369cd/third_party/WebKit/LayoutTests/fast/css/case-transform-georgian-capital.html
[modify] https://crrev.com/e81a2ee6dac6082a69cf7e2fc2c9071b279369cd/third_party/blink/renderer/core/style/computed_style.cc
[modify] https://crrev.com/e81a2ee6dac6082a69cf7e2fc2c9071b279369cd/third_party/blink/renderer/platform/text/character.h

NextAction: 2018-09-07
Pls update bug with canary result tomorrow. Thank you.
NextAction: 2018-09-07
Status: Fixed (was: Started)
Labels: Merge-TBD
[Auto-generated comment by a script] We noticed that this issue is targeted for M-69; it appears the fix may have landed after branch point, meaning a merge might be required. Please confirm if a merge is required here - if so add Merge-Request-69 label, otherwise remove Merge-TBD label. Thanks.
Project Member

Comment 33 by bugdroid1@chromium.org, Sep 6

Labels: merge-merged-3544
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3a06dfa9dce73d406f2b8e8c50d256aa0ddf816f

commit 3a06dfa9dce73d406f2b8e8c50d256aa0ddf816f
Author: Jungshik Shin <jshin@chromium.org>
Date: Thu Sep 06 19:41:24 2018

[Merge to canary 3544] Disable new Georgian capital letters in CSS transform

TBR=govind@chromium.org

Bug:  865427 
Test: fast/css/case-transform-georgian-capital.html
Change-Id: If957df8407cf08f6752499522518a90d3e55251e
Reviewed-on: https://chromium-review.googlesource.com/1205712
Reviewed-by: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Eric Willigers <ericwilligers@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#589193}(cherry picked from commit e81a2ee6dac6082a69cf7e2fc2c9071b279369cd)
Reviewed-on: https://chromium-review.googlesource.com/1211427
Reviewed-by: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/branch-heads/3544@{#6}
Cr-Branched-From: 0da7dce9a83cbad841f925002a4ddfee8f2afcf7-refs/heads/master@{#589076}
[add] https://crrev.com/3a06dfa9dce73d406f2b8e8c50d256aa0ddf816f/third_party/WebKit/LayoutTests/fast/css/case-transform-georgian-capital-expected.html
[add] https://crrev.com/3a06dfa9dce73d406f2b8e8c50d256aa0ddf816f/third_party/WebKit/LayoutTests/fast/css/case-transform-georgian-capital.html
[modify] https://crrev.com/3a06dfa9dce73d406f2b8e8c50d256aa0ddf816f/third_party/blink/renderer/core/style/computed_style.cc
[modify] https://crrev.com/3a06dfa9dce73d406f2b8e8c50d256aa0ddf816f/third_party/blink/renderer/platform/text/character.h

Canary version 71.0.3544.4 currently building includes this fix merged at #33. 
The NextAction date has arrived: 2018-09-07
 Issue 881707  has been merged into this issue.
Labels: TE-Verified-M71 TE-Verified-71.0.3545.2
Able to reproduce this issue on a build without the fix, hence verifying the fix on the latest canary 71.0.3545.2 using Mac 10.13.6, Windows 10 and Debian.

Now no tofu (empty rectangles) is seen when text-transform: uppercase is applied to Georgian text. Attaching a screenshot for reference.

As the fix is working as expected, adding Verified labels.

Thanks!
Screen Shot 2018-09-07 at 5.29.34 PM.png
551 KB
Labels: -OS-iOS Merge-Request-69 Merge-Request-70
Status: Verified (was: Fixed)
Thank you for the verification. I also verified it. 

Asking for merge to M69 and M70. 

Justification: 
This issue affects any Georgian web page using text-transform: uppercase. Due to the lack of fonts supporting the newly encoded modern Georgian uppercase letters (in Unicode 11), Georgian letters uppercased by text-transform: uppercase turn into tofu (empty rectangles). The number of Georgian speakers is not huge (several million), but the impact on them is rather large.

The fix is rather simple and safe. It's also tested via layout test and verified in canary. 


Project Member

Comment 39 by sheriffbot@chromium.org, Sep 7

Labels: -Merge-Request-69 Merge-Review-69 Hotlist-Merge-Review
This bug requires manual review: Request affecting a post-stable build
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Merge-Review-69 Merge-Approved-69
Approving merge to M69 branch 3497 based on comment #38. Pls merge ASAP. Thank you.
Labels: -Merge-TBD -Merge-Request-70 Merge-Approved-70
Approving merge to branch 3538 M70. 
Project Member

Comment 42 by bugdroid1@chromium.org, Sep 7

Labels: -merge-approved-69 merge-merged-3497
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/34cba1117b9a2ed2ef0569d8f2b41999ed9d5bc2

commit 34cba1117b9a2ed2ef0569d8f2b41999ed9d5bc2
Author: Jungshik Shin <jshin@chromium.org>
Date: Fri Sep 07 18:02:10 2018

[Merge M69] Disable new Georgian capital letters in CSS transform

TBR=ericwilligers@chromium.org,drott@chromium.org

Bug:  865427 
Test: fast/css/case-transform-georgian-capital.html
Change-Id: If957df8407cf08f6752499522518a90d3e55251e
Reviewed-on: https://chromium-review.googlesource.com/1205712
Reviewed-by: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Eric Willigers <ericwilligers@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#589193}
Reviewed-on: https://chromium-review.googlesource.com/1213930
Reviewed-by: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/branch-heads/3497@{#904}
Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753}
[add] https://crrev.com/34cba1117b9a2ed2ef0569d8f2b41999ed9d5bc2/third_party/WebKit/LayoutTests/fast/css/case-transform-georgian-capital-expected.html
[add] https://crrev.com/34cba1117b9a2ed2ef0569d8f2b41999ed9d5bc2/third_party/WebKit/LayoutTests/fast/css/case-transform-georgian-capital.html
[modify] https://crrev.com/34cba1117b9a2ed2ef0569d8f2b41999ed9d5bc2/third_party/blink/renderer/core/style/computed_style.cc
[modify] https://crrev.com/34cba1117b9a2ed2ef0569d8f2b41999ed9d5bc2/third_party/blink/renderer/platform/text/character.h

I have the same issue, but on an Android phone (Pixel 2). It doesn't show some Georgian characters. Reinstalling Chrome helps for some time...
Screenshot_20180907-104905.png
105 KB
Project Member

Comment 44 by bugdroid1@chromium.org, Sep 8

Labels: -merge-approved-70 merge-merged-3538
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a1514bc99314eb10f99a36584844624ff0f91436

commit a1514bc99314eb10f99a36584844624ff0f91436
Author: Jungshik Shin <jshin@chromium.org>
Date: Sat Sep 08 07:29:12 2018

[Merge to M70] Disable new Georgian capital letters in CSS transform

TBR=ericwilligers@chromium.org,drott@chromium.org

Bug:  865427 
Test: fast/css/case-transform-georgian-capital.html
Change-Id: If957df8407cf08f6752499522518a90d3e55251e
Reviewed-on: https://chromium-review.googlesource.com/1205712
Reviewed-by: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Eric Willigers <ericwilligers@chromium.org>
Commit-Queue: Jungshik Shin <jshin@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#589193}(cherry picked from commit e81a2ee6dac6082a69cf7e2fc2c9071b279369cd)
Reviewed-on: https://chromium-review.googlesource.com/1215044
Reviewed-by: Jungshik Shin <jshin@chromium.org>
Cr-Commit-Position: refs/branch-heads/3538@{#186}
Cr-Branched-From: 79f7c91a2b2a2932cd447fa6f865cb6662fa8fa6-refs/heads/master@{#587811}
[add] https://crrev.com/a1514bc99314eb10f99a36584844624ff0f91436/third_party/WebKit/LayoutTests/fast/css/case-transform-georgian-capital-expected.html
[add] https://crrev.com/a1514bc99314eb10f99a36584844624ff0f91436/third_party/WebKit/LayoutTests/fast/css/case-transform-georgian-capital.html
[modify] https://crrev.com/a1514bc99314eb10f99a36584844624ff0f91436/third_party/blink/renderer/core/style/computed_style.cc
[modify] https://crrev.com/a1514bc99314eb10f99a36584844624ff0f91436/third_party/blink/renderer/platform/text/character.h

Labels: -Hotlist-Merge-Review
All three branches (M69, M70 and M71) have this fix/work-around. 

 Issue 882203  has been merged into this issue.
After updating to the released version 69.0.3497.92, which contains the fix for this issue, Google Search results STILL show those "empty rectangle" characters (see attached image). The website itself is fine; only search results are affected.
GoogleSearchResults.png
43.7 KB
Hey all,

Confirming I can reproduce the behavior seen in #47 on today's stable OSX.

Do we know if Search is doing something non-standard, or could there be other sites with a similar implementation that would also be affected?

Thanks!
Hi, 
This issue is still present, e.g. on the site remarkhq.com (see screenshot)
and in the Android and iOS Google Maps apps (see screenshots).
Screenshot 2018-10-04 at 4.32.41 PM.png
39.7 KB
unnamed.jpg
142 KB
43239895_503556170119205_8992354374592233472_n.jpg
53.8 KB