New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 870007 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Closed: Aug 1
Cc:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 1
Type: Bug



Sign in to add a comment

RenderTextHarfBuzz: run breaking breaks at spaces in ASCII text

Project Member Reported by ellyjo...@chromium.org, Aug 1

Issue description

If the input text is a single line containing "abc def", RenderTextHarfBuzz::GetSubstringBounds() will return a vector of length 3, even though all the text is one contiguous "chunk". This is because of this logic in ScriptInterval():

  while (char_iterator.Advance()) {
    ScriptSetIntersect(char_iterator.get(), scripts, &scripts_size);
    if (scripts_size == 0U)
      return char_iterator.array_pos();
    *script = scripts[0];
  }

However, the script set for 'a', 'b', 'c', 'd', 'e', and 'f' is { USCRIPT_LATIN } and the script set for ' ' is { USCRIPT_COMMON }, which don't intersect, so any space is treated as a script change for HarfBuzz's purposes.

This probably causes a related issue where we separately shape each space-separated component of text.
 
Cc: ckocagil@chromium.org
Hmm, I could be wrong, but it seems like we should include USCRIPT_COMMON chars in any range containing chars of other scripts.
It seems like making that change would yield fewer, longer runs. I wonder what effect that might have on performance.
Thanks for tracking this down, any further investigation/fix is very much appreciated.
On the plus side, by breaking at spaces, we do end up hitting the cache more.
Good point. That said, ideally, we'd be caching by character (or similar, like Blink) instead of run.
I suppose we should only include USCRIPT_COMMON in surrounding ranges if it's a performance win.
The breaking for Latin scripts was added in r480350 to:
 - fix  Issue 731563 
 - reach towards parity with Blink behaviour, which is to break runs at whitespace
 - improve hits in a hypothetical text-run cache (which ccameron since added \o/)

The problem with USCRIPT_COMMON is that not all typefaces provide glyphs for all characters in it, resulting in badness (See the screenshot in  https://crbug.com/731563  ). E.g. the Hebrew typeface on mac doesn't have an ASCII space, so if we don't break the ASCII space into its own run we render garbage.

Long-term, we should completely merge Blink's and src/ui's run-breaking algorithm (and font-fallback mechanisms, and caching, etc.).


Status: WontFix (was: Assigned)
Okay, sounds like this is a won't fix (working as intended) then.

Sign in to add a comment