Invalid unicode characters produce incorrect character positions |
|||||
Issue descriptionVersion: 51.0.2679.0/dev OS: OSX10.11.6 What steps will reproduce the problem? (1) Open the attached test file (2) Try to select just 'c' The character positions are incorrect when there are invalid unicode characters. For example, the string "򐀒n" has a length of 3 but the shape results buffer will only contain a single RunInfo of length 2. Normally the RunInfo lengths add up to the total string length (even with non-bmp characters like 😂) but with invalid unicode characters they do not. This results in broken selection (see the attached testcase) and is the root cause of https://crbug.com/595393. I'm a little out of my depth in how to fix this though. The bug here is not in HarfBuzz itself, but in how we process the results. @jshin or @drott, is this something you could look into?
,
Mar 18 2016
Thanks for the quality report - I can look into it. How did you find this? Did this occur on a real world page?
,
Mar 18 2016
Not a real page, just our friend Clusterfuzz :) I recently changed SVG's text layout code to directly use the shaped output so it only needs to shape a run of text once. Previously, we were shaping the complete run for each character which was O(n^2). Clusterfuzz then found https://crbug.com/595393 which ended up being this issue. The way we extract out the glyph advances closely follows the selection code which is how both svg and selection share this bug. I have a workaround for the svg issue in https://codereview.chromium.org/1816453002.
,
Mar 22 2016
Just to keep track of this result, HarfBuzz shapes this rather regularly: ../test/shaping/hb-unicode-encode "U+90012 U+90012 U+0061 U+90012 U+90012 U+0062 U+90012 U+90012 U+0063 U+90012 U+90012 U+0064"|./hb-shape /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf [.notdef=0+1229|.notdef=1+1229|a=2+1255|.notdef=3+1229|.notdef=4+1229|b=5+1300|.notdef=6+1229|.notdef=7+1229|c=8+1126|.notdef=9+1229|.notdef=10+1229|d=11+1300]
,
Mar 22 2016
FF and Edge show .notdefs for the invalid codepoints.
,
Mar 22 2016
Fix proposal up: https://codereview.chromium.org/1825023002
,
Mar 22 2016
This exposed a larger issue of lack of warnings for integer truncations, filed issue 596846 for that.
,
Mar 22 2016
,
Mar 22 2016
,
Mar 22 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3f51d6256864b298b15feb93a57c6d986474245a commit 3f51d6256864b298b15feb93a57c6d986474245a Author: drott <drott@chromium.org> Date: Tue Mar 22 14:32:39 2016 Fix integer truncation in Character test functions The test functions checking for treatAs*Space and isNormalizedCanvasSpaceCharacter were using UChar parameter types, while they were used with UChar32 arguments in HarfBuzzShaper. This leads to characters being truncated, equivalent to being masekd with 0xFFFF, before they get passed to the space-tester functions. If the truncated value matches in one of those, they get blanked out with a space in the normalization step in HarfBuzzShaper. BUG= 595960 R=pdr,eae,kojii Review URL: https://codereview.chromium.org/1825023002 Cr-Commit-Position: refs/heads/master@{#382569} [modify] https://crrev.com/3f51d6256864b298b15feb93a57c6d986474245a/third_party/WebKit/Source/platform/fonts/Character.h [modify] https://crrev.com/3f51d6256864b298b15feb93a57c6d986474245a/third_party/WebKit/Source/platform/fonts/CharacterTest.cpp
,
Mar 22 2016
,
Mar 24 2016
Issue 595743 has been merged into this issue.
,
Jun 6 2016
Issue 597331 has been merged into this issue. |
|||||
►
Sign in to add a comment |
|||||
Comment 1 by pdr@chromium.org
, Mar 18 2016