
Issue 595743

Starred by 2 users

Issue metadata

Status: Duplicate
Merged: issue 595960
Owner:
Closed: Mar 2016
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux, Windows, Mac
Pri: 3
Type: Bug


Some UTF-16 strings are not being rendered

Reported by slevin...@gmail.com, Mar 17 2016

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Firefox/45.0

Example URL:
http://codepen.io/Slevinski/pen/dMNVag

Steps to reproduce the problem:
1. Open a page that includes certain Unicode characters 
2. Notice that some characters are not being inserted into DOM

One particular problem is with a character on plane 16: U+100001, which is this character here: "􀀁"

Comment 1 by slevin...@gmail.com, Mar 17 2016

Happens on all operating systems, not just Mac.

Comment 2 by slevin...@gmail.com, Mar 17 2016

It looks like the characters are actually in the DOM, but not visible. Even if a font is available for the invisible character, it is still not displayed. The bug affects the console log as well. However, it is possible to select and copy the string and paste it into another application, either from the DOM or from the console log.
Components: -Blink Blink>Fonts
Summary: Some UTF-16 strings are not being rendered (was: Some UTF-16 strings are not being insertd into the DOM)

Comment 4 by slevin...@gmail.com, Mar 18 2016

It is more than just UTF-16.  UTF-32 strings are also having problems when inserting with &#xCODEPOINT; references, such as 􀀁
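
For reference, the same code point can be spelled as a UTF-16 surrogate pair, as a single UTF-32 unit, or as a numeric character reference. The following is a minimal Python sketch, not taken from the test page; the helper name `encodings_of` is made up for illustration. It shows that all three spellings name the same character, U+100001:

```python
import html


def encodings_of(code_point: int) -> dict:
    """Return several spellings of one Unicode code point (illustrative helper)."""
    ch = chr(code_point)
    return {
        # Big-endian byte sequences, shown as uppercase hex without a BOM.
        "utf16_be": ch.encode("utf-16-be").hex().upper(),
        "utf32_be": ch.encode("utf-32-be").hex().upper(),
        # Hexadecimal numeric character reference as used in HTML.
        "char_ref": f"&#x{code_point:X};",
    }


forms = encodings_of(0x100001)
print(forms)

# Decoding the character reference yields the same single character.
print(html.unescape(forms["char_ref"]) == chr(0x100001))
```

Because every spelling resolves to the same code point, a rendering problem that depends on the code point itself would be expected to show up regardless of how the character was written in the source.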

Comment 5 by e...@chromium.org, Mar 19 2016

Status: Fixed (was: Unconfirmed)
This was fixed a couple of months ago. Please upgrade to a recent version of Chrome and retest.

Comment 6 by slevin...@gmail.com, Mar 19 2016

I'm glad this was fixed.  Do you know what version fixed this issue?

I am using the current stable release of Chrome/Chromium on the various operating systems. Do I need to install a development release to verify the fix? How long until the stable releases catch up with the fix?

Comment 7 by slevin...@gmail.com, Mar 19 2016

This is NOT fixed in v50, v51, or Canary.  The problem still exists.
Status: Unconfirmed (was: Fixed)
Reopening per comment 7

Comment 9 by slevin...@gmail.com, Mar 19 2016

To open the test URL in Canary, you need to use the full view:
http://codepen.io/Slevinski/full/dMNVag/
Labels: -Pri-2 M-51 OS-Linux OS-Windows Pri-1
Status: Untriaged (was: Unconfirmed)
I am able to reproduce this issue on Latest Canary#51.0.2688.0, Dev#51.0.2687.0, Beta#50.0.2661.37 and Stable#49.0.2623.87 versions of chrome for Win7 64-bit OS, Mac OS X 10.11.4 and Linux Ubuntu 14.04.

This is not a regression; the same behavior is observed as far back as M26 (26.0.1410.46). A screenshot is attached for reference.

Thank you!
Attachment: 595743.png (126 KB)
I updated Canary to version 51.0.2688.0 and tried again. Much better. There still seems to be a problem for characters U+E0001 thru U+E0FFF, but that's it.

For my work, characters on plane 15 and plane 16 are working as expected.  Thanks for the consideration.

Comment 13 by e...@chromium.org, Mar 23 2016

Owner: drott@chromium.org
Status: Available (was: Untriaged)
Re: 12: Thanks for verifying. We'll try to add fallback fonts for the remaining codepoints.

Re 11: Oh, the soft/light rendering is a separate problem from the one we're talking about here.

Comment 14 by e...@chromium.org, Mar 23 2016

Labels: -Pri-1 Pri-3

Comment 15 by drott@chromium.org, Mar 24 2016

Mergedinto: 595960
Status: Duplicate (was: Available)
Thanks for the report. 

The underlying problem was an integer truncation in our character test functions. Marking as a duplicate of issue 595960, in which this was identified as the root cause.
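
The affected code is not shown in this issue, so the following Python sketch is only a hypothetical illustration of how this class of bug can hide characters. It assumes a test function that narrows a code point to 16 bits before doing a range check; the function names and the choice of a C0 control check are assumptions, not Chromium's actual code.

```python
def truncate_to_16_bits(code_point: int) -> int:
    """Simulate storing a code point in a 16-bit integer (hypothetical)."""
    return code_point & 0xFFFF


def is_c0_control(code_point: int) -> bool:
    """True for the C0 control characters U+0000..U+001F."""
    return 0 <= code_point <= 0x1F


def buggy_is_control(code_point: int) -> bool:
    # The bug being illustrated: the value is narrowed before the range test,
    # so only the low 16 bits of a supplementary code point are examined.
    return is_c0_control(truncate_to_16_bits(code_point))


for cp in (0x100001, 0xF0041, 0x0041):
    print(f"U+{cp:04X}: correct={is_c0_control(cp)} buggy={buggy_is_control(cp)}")
```

In this sketch U+100001 is misclassified because its low 16 bits are 0x0001, while a code point such as U+F0041 happens to pass. That would be consistent with only some supplementary characters going missing, but whether the real functions failed in exactly this way is not stated here.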

> There still seems to be a problem for characters U+E0001 thru U+E0FFF, but that's it.

U+E0001 to U+E01EF are from the Supplementary Special-purpose Plane (https://codepoints.net/supplementary_special-purpose_plane): Language Tags and Variation Selectors.

Compare http://unicode.org/faq/unsup_char.html#3 for guidelines on how to display those. We render this range as zero width spaces and in that sense follow the Unicode standard.
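
As a rough aid for checking which of the code points discussed in this thread fall into that category, here is a small Python sketch. The block ranges are the published Unicode ranges for Tags (U+E0000..U+E007F) and Variation Selectors Supplement (U+E0100..U+E01EF); the helper names `plane_of` and `describe` are made up for illustration.

```python
import unicodedata

# Unicode block ranges on plane 14 (Supplementary Special-purpose Plane).
TAGS = range(0xE0000, 0xE0080)
VARIATION_SELECTORS_SUPPLEMENT = range(0xE0100, 0xE01F0)


def plane_of(code_point: int) -> int:
    """Each plane holds 0x10000 code points, so the plane is the value above bit 16."""
    return code_point >> 16


def describe(code_point: int) -> tuple:
    """Return (plane, block label, general category) for a code point."""
    if code_point in TAGS:
        block = "Tags"
    elif code_point in VARIATION_SELECTORS_SUPPLEMENT:
        block = "Variation Selectors Supplement"
    else:
        block = "other"
    return plane_of(code_point), block, unicodedata.category(chr(code_point))


for cp in (0xE0001, 0xE0100, 0x100001):
    print(f"U+{cp:04X}", describe(cp))
```

Code points on planes 15 and 16 are private use, so they need a font that defines them, whereas the plane 14 characters above are format characters and combining marks that are not expected to show a visible glyph of their own.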

HarfBuzz shaping output for testing U+E0001:
../test/shaping/hb-unicode-encode "DB40 DC01" | ./hb-shape /Library/Fonts/Times\ New\ Roman.ttf 
[space=0+0]

This means that U+E0001 is rendered using the space glyph with a zero advance width. So I believe the Canary rendering is correct, and for characters in the Supplementary Special-purpose Plane it is in fact more correct than Firefox.
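
The two hex values passed to the command above look like UTF-16 code units. Assuming that reading, this short Python sketch applies the standard surrogate-pair arithmetic to confirm that "DB40 DC01" corresponds to U+E0001. The function names are illustrative and are not part of HarfBuzz.

```python
def decode_surrogate_pair(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def encode_surrogate_pair(code_point: int) -> tuple:
    """Split a supplementary code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


code_point = decode_surrogate_pair(0xDB40, 0xDC01)
print(f"U+{code_point:04X}")

high, low = encode_surrogate_pair(0x100001)
print(f"{high:04X} {low:04X}")
```

The second call gives the pair that would be needed to run the same kind of shaping test for the plane 16 character from the original report.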



Thanks for the explanation about the supplementary special-purpose plane.  Makes perfect sense.
