Incorrect Emoji rendering on UI when Chinese character is present
Issue description
Version: 54
OS: Win
What steps will reproduce the problem?
data:text/html;charset=utf8,<title>%E3%80%8C%F0%9F%98%8A %F0%9F%98%8A</title>%E3%80%8C%F0%9F%98%8A %F0%9F%98%8A
What is the expected output?
Same rendering in content and UI; both should show two [colored] emoji.
What do you see instead?
The emoji that is not separated from the Chinese character is rendered as a crossed box instead of an emoji, in both the Omnibox and the tab.
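To see which characters the repro URL actually contains, the percent-encoded payload can be decoded. This is only an illustrative sketch in Python (the bug itself is in Chrome's C++ UI code); the variable names are made up for the example.

```python
from urllib.parse import unquote

# Percent-encoded body copied from the repro URL in the issue description.
encoded = "%E3%80%8C%F0%9F%98%8A %F0%9F%98%8A"

# unquote() decodes the escapes as UTF-8 by default.
text = unquote(encoded)

# List each code point in U+XXXX notation.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)
```

The first emoji directly follows U+300C with nothing in between, while the second one is preceded by a space. Only the first one hits the problem described here.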
,
Jul 16 2016
Not on Mac
,
Jul 16 2016
,
Jul 19 2016
This might be due to HarfBuzz treating the Chinese character and the emoji as a single run. When I set a breakpoint in GetFallbackFont, I see it being asked for a fallback for a single run of length 3 for text U+300c;U+d83d;U+de0a;, when it should actually be treated as two separate runs (first the Chinese character, then the emoji surrogate pair), or at least allow for multiple fallbacks per run. Font fallback might be helpful here, as the fallback code (at least on Win) is able to figure out that the font can only render the first character of the run, but it has no way to get that data back to HarfBuzz. I also observe a similar error if the emoji comes first: in that case, the emoji is rendered correctly but the Chinese character displays as a box. Even weirder results happen if I vary the number of consecutive brackets or emoji.
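The "run of length 3" above is counted in UTF-16 code units, not characters. A minimal sketch of surrogate-pair decoding shows that the three units are really two code points; decode_utf16_units is a hypothetical helper written for this illustration, not a Chromium function.

```python
def decode_utf16_units(units):
    """Combine UTF-16 surrogate pairs into full code points."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard UTF-16 formula for a supplementary-plane code point.
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            out.append(u)
            i += 1
    return out


# The run seen in the debugger: U+300c;U+d83d;U+de0a;
run = [0x300C, 0xD83D, 0xDE0A]
print([hex(cp) for cp in decode_utf16_units(run)])
```

So the run holds U+300C followed by a single emoji, U+1F60A, which is why one fallback font for the whole run cannot cover both.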
,
Jul 20 2016
HarfBuzz does not do the segmentation into runs. Blink does.
,
Aug 18 2016
,
Aug 18 2016
Also broken on ChromeOS https://bugs.chromium.org/p/chromium/issues/detail?id=638709
,
Aug 18 2016
(behdad: This is in Chrome, not Blink.) In the medium term it would be great if we could use the Blink text segmenter in Chrome, as it handles all these cases and is maintained and continuously updated. It shouldn't be too hard (famous last words) to rework it so that it doesn't depend on the rest of the Blink text stack. For now, tweaking it as per comment 1 might do the trick.
,
Nov 29 2016
,
Nov 29 2016
,
Aug 22 2017
,
Aug 22 2017
,
Aug 22 2017
I'll take this in the M63 timeframe if nobody else does. I guess we just have to move ScriptRunIterator.h to third_party/WebKit/public/platform[/fonts?], then (maybe?) add a thin wrapper around it in ui/gfx/blink (a new folder). That would be similar to the approach used in ui/events/blink. Phase 2: ?? Phase 3: delete a bunch of redundant code from render_text_harfbuzz.
,
Aug 23 2017
I can help with questions around RunSegmenter. ScriptRunIterator does only the script segmentation; RunSegmenter does emoji, script and orientation segmentation. What would probably help here is segmenting the run into a Han run and a separate emoji run. You could start by experimenting with https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/fonts/shaping/RunSegmenterTest.cpp and what run segmentation returns for '「😊 😊'.
,
Aug 30 2017
,
Oct 3
Issue 881279 has been merged into this issue.
,
Oct 4
I think we should still use blink's text segmenter, but I'm booked for other things until at least next year. A simple fix for this issue and Issue 867196 is probably just to special-case emoji into its own runs (which I'm pretty sure is what the blink segmenter does anyway -- I see lots of special cases for emoji).
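As a rough illustration of what "special-case emoji into its own runs" could mean, here is a minimal Python sketch. It is not the Blink segmenter or the render_text_harfbuzz code; is_emoji and split_emoji_runs are hypothetical names, and the emoji test is a deliberately crude range check (an assumption for this example; real segmentation would use the Unicode emoji properties).

```python
def is_emoji(cp):
    # Crude assumption: treat a few common symbol/emoji ranges as emoji.
    # These ranges also contain some non-emoji characters.
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def split_emoji_runs(text):
    """Split text into (substring, is_emoji) runs, breaking at emoji boundaries."""
    runs = []
    for ch in text:
        emoji = is_emoji(ord(ch))
        if runs and runs[-1][1] == emoji:
            # Same kind as the previous character: extend the current run.
            runs[-1] = (runs[-1][0] + ch, emoji)
        else:
            runs.append((ch, emoji))
    return runs


print(split_emoji_runs("\u300c\U0001F60A \U0001F60A"))
```

With the repro string, the bracket and the first emoji land in separate runs, so each run can be given its own fallback font.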
,
Oct 4
,
Oct 4
tapted@ would adding the correct block code(s) here do it? https://cs.chromium.org/chromium/src/ui/gfx/render_text_harfbuzz.cc?q=f:render_text_harfbuzz&sq=package:chromium&dr&l=63
,
Oct 5
I'm not sure. Probably not - https://codereview.chromium.org/1070223004/diff/710001/ui/gfx/render_text_harfbuzz.cc removed some things from IsUnusualBlockCode() to fix Issue 448909. https://bugs.chromium.org/p/chromium/issues/detail?id=448909#c2 says "emoji characters are 'COMMON' scripts in ICU".
,
Oct 5
Seems like the meat of Issue 448909 leads to here :( (via Issue 396415). The consensus there was to assign a font per cluster as Blink does (modulo c#27, which brings up the "ransom note effect", which IMO seems like an OK tradeoff). So, long story short: c#17.
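The "font per cluster" idea can be sketched as follows. Everything here is an assumption for illustration: the font names and coverage ranges are made up, each code point is treated as its own cluster (real clusters are grapheme clusters), and this is not how Blink or ui/gfx is actually implemented.

```python
# Hypothetical fonts in priority order, each with a made-up coverage test.
FONTS = [
    ("UIFont", lambda cp: cp < 0x3000),
    ("CJKFont", lambda cp: 0x3000 <= cp <= 0x9FFF),
    ("EmojiFont", lambda cp: 0x1F300 <= cp <= 0x1FAFF),
]


def font_for(cp):
    """Return the first font that covers the code point, or None."""
    for name, covers in FONTS:
        if covers(cp):
            return name
    return None


def assign_fonts(text):
    """Pick a font per cluster, then merge neighbours that share a font."""
    runs = []
    for ch in text:
        font = font_for(ord(ch))
        if runs and runs[-1][1] == font:
            runs[-1] = (runs[-1][0] + ch, font)
        else:
            runs.append((ch, font))
    return runs


print(assign_fonts("\u300c\U0001F60A \U0001F60A"))
```

Because the font is chosen cluster by cluster, neighbouring characters can end up in different fonts, which is the "ransom note effect" tradeoff mentioned above.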
,
Oct 8
See also mid-term task issue 892589, where we move the emoji segmentation to a Ragel grammar that would be easier to reuse in the UI code. The grammar is ready and tested. What's left are the rather tedious steps of making Ragel available as part of the build process (issue 892601).
Comment 1 by ebra...@gnu.org
, Jul 16 2016