RenderTextHarfBuzz is slow on macOS |
|||||||||||||
Issue descriptionRenderTextHarfBuzz is causing significant UI-thread jank in MacViews. Examine the following trace -- this shows a 150 msec delay between pressing a key in the omnibox, and anything actually showing up. Calls to GetFallbackFont and GetFallbackFonts account for a huge delay, more than once. We also make several hundred calls to RenderTextHarfBuzz::ShapeRunWithFontInternal. We added a cache for the results of this call in issue 826265 , but the calls are still far too slow. After this massive 100-msec computation, we display the omnibox contents that are attached.
,
Jul 12
+drott/behdad (harfbuzz experts) may have some ideas/comments too. (Note we've added a text run cache, but it's not great for omnibox results, which need to be low-latency and often need to typeset URLs.) ccameron: Note in r480350 I tweaked the text run logic to always break at whitespace. I guess the URL example (with ".") wouldn't have been affected. But, generally, this is necessary for font fallbacks in i18n. E.g. the font that provides Hebrew glyphs on Mac (e.g. סיבית ) doesn't also provide a glyph for an ASCII space. Maybe we can introduce some special-case breaking rules for ASCII urls so we don't break at ".", and limit cases we break at " " as well. But this would (in theory) work against the text run cache. ( Issue 797215 captures some performance degradation around this too). That is, breaking at "words" should provide better cache hits, but that may fail since we're taking context into account. (Does Blink? Some of the comments about matching "words" in its cache suggest to me it does not). Maybe there are some simple rules we can use to limit the amount of unicode context we send to harfbuzz to get more cache hits.
,
Jul 12
,
Jul 12
,
Jul 12
,
Jul 12
,
Jul 13
We only use the adjacent context in the function arabic_joining, so the following patch disables context for non-RTL languages (super-hacky, but should help some): https://chromium-review.googlesource.com/c/chromium/src/+/1135723
,
Jul 13
I think we can do the suggested comment in #1 without any changes to the HB interface. In particular, we can find the RunLists that share all parameters except for range in ShapeRunLists at [1], and then we can parse all of the results together at [2]. [1] https://cs.chromium.org/chromium/src/ui/gfx/render_text_harfbuzz.cc?rcl=356993f45e105380f1ac1db6894e6038830d8569&l=1690 [2] https://cs.chromium.org/chromium/src/ui/gfx/render_text_harfbuzz.cc?rcl=356993f45e105380f1ac1db6894e6038830d8569&l=1063
,
Jul 13
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/791f1c19acbf159aab8fb9652e95b77222075239 commit 791f1c19acbf159aab8fb9652e95b77222075239 Author: Christopher Cameron <ccameron@chromium.org> Date: Fri Jul 13 03:58:48 2018 MacViews: Only use context for text shape when using RTL languages The call to hb_buffer_add_utf16 will read up to 5 characters adjacent to the specified range. This context is only used in the function arabic_joining. We can get much better caching if we don't include this context in our keys. Arabic is an RTL language, so we can omit this context in the key for all non-RTL languages. Bug: 860612 , 862773 Change-Id: Icc11059886ab835032141ad25b966d358f7177b3 Reviewed-on: https://chromium-review.googlesource.com/1135723 Commit-Queue: ccameron <ccameron@chromium.org> Reviewed-by: Sidney San Martín <sdy@chromium.org> Cr-Commit-Position: refs/heads/master@{#574834} [modify] https://crrev.com/791f1c19acbf159aab8fb9652e95b77222075239/ui/gfx/render_text_harfbuzz.cc
,
Jul 13
I have a prototype of the idea in #8 at https://chromium-review.googlesource.com/c/chromium/src/+/1136042 Doing a release build to see if it actually helps.
,
Jul 13
I gathered data for the time it takes for the calls to ShapeRunList that result from me pressing the "t" key in the omnibox. This is the average over 50 runs, with caching disabled (with caching enabled, all but the first run are instantaneous).
This gets us a pretty decent speedup, ranging from none (or slightly negative) for small strings to 5-10x for the longer runs.
text | num | original | new
length | runs | time (msec) | time (msec)
-------+------+-------------+------------
1 | 1 | 0.0969 | 0.107
13 | 3 | 0.1928 | 0.151
20 | 5 | 0.3478 | 0.206
38 | 12 | 0.6326 | 0.270
59 | 17 | 0.8375 | 0.281
60 | 17 | 0.8399 | 0.281
95 | 38 | 1.9517 | 0.339
96 | 38 | 1.8519 | 0.328
97 | 38 | 1.8781 | 0.337
98 | 39 | 2.1015 | 0.392
179 | 54 | 3.3648 | 0.566
197 | 67 | 10.5980 | 1.160
The patch here is pretty messy (and doesn't handle all cases, but those cases aren't hit by this data). I'll try to create a cleaner version of it. The patch will probably split TextRunHarfBuzz into a "common parameters to multiple runs" and "run-specific data" section, and add hashing there.
,
Jul 13
FWIW, I investigated RenderTextMac a little this morning, but I'm stopping since you have an (easier) fix. Just a couple of data points of the omnibox's calls to RenderText::Elide:
text length | HarfBuzz (ms) | Mac (ms)
------------+---------------+---------
33 | 0.14885 | 0.18692
430 | 2.39401 | 0.22898
So the speedup is similar, which is good.
,
Jul 16
ccameron@, Please let us know clear steps to verify this issue from TE end. Thanks..!
,
Jul 16
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/102c5ce5dbc8e4823a1719b82bc790d328d0c77c commit 102c5ce5dbc8e4823a1719b82bc790d328d0c77c Author: Christopher Cameron <ccameron@chromium.org> Date: Mon Jul 16 16:52:29 2018 HarfBuzz: Separate common parameters and shape output Optimization part 1 of ~3 Separate TextRunHarfBuzz into three structures - the parameters that are common to multiple text runs - the output from shaping - everything else (the range and the post-shaping output) Replace the ShapeRunWithFontOutput structure with the TextRunHarfBuzz ShapeOutput structure. This is a step towards merging the HarfBuzz calls runs that share the same CommonParams. Bug: 862773 Change-Id: I940af96e0c105b079baef24c97034f24f05ef56a Reviewed-on: https://chromium-review.googlesource.com/1137852 Reviewed-by: Leonard Grey <lgrey@chromium.org> Commit-Queue: ccameron <ccameron@chromium.org> Cr-Commit-Position: refs/heads/master@{#575311} [modify] https://crrev.com/102c5ce5dbc8e4823a1719b82bc790d328d0c77c/ui/gfx/render_text_harfbuzz.cc [modify] https://crrev.com/102c5ce5dbc8e4823a1719b82bc790d328d0c77c/ui/gfx/render_text_harfbuzz.h [modify] https://crrev.com/102c5ce5dbc8e4823a1719b82bc790d328d0c77c/ui/gfx/render_text_unittest.cc
,
Jul 16
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/52d3b85518c057c8ff55ab2f1c21ff17afc68576 commit 52d3b85518c057c8ff55ab2f1c21ff17afc68576 Author: Christopher Cameron <ccameron@chromium.org> Date: Mon Jul 16 19:55:10 2018 HarfBuzz: Adjust parameterization of shape functions Optimization part 2 of ~3 This is in preparation for a follow-on change that will change RenderTextHarfBuzz::ShapeRun to take a single CommonParams, and a list of all runs that share these parameters. To that end, pull out explicit use of RenderTextHarfBuzz::common, and pass the CommonParams explicitly, wherever possible. Bug: 862773 Change-Id: Ia349c23ad58ab590c8ec6f05c6a9f1dcc253ab57 Reviewed-on: https://chromium-review.googlesource.com/1137891 Commit-Queue: ccameron <ccameron@chromium.org> Reviewed-by: Leonard Grey <lgrey@chromium.org> Cr-Commit-Position: refs/heads/master@{#575392} [modify] https://crrev.com/52d3b85518c057c8ff55ab2f1c21ff17afc68576/ui/gfx/render_text_harfbuzz.cc [modify] https://crrev.com/52d3b85518c057c8ff55ab2f1c21ff17afc68576/ui/gfx/render_text_harfbuzz.h [modify] https://crrev.com/52d3b85518c057c8ff55ab2f1c21ff17afc68576/ui/gfx/render_text_unittest.cc
,
Jul 17
Ugh, this is turning out to be more complicated than I'd thought. I have a patch that enables merging calls into CoreText but ... there are hacks that I think go too far. For the curious, it's at https://chromium-review.googlesource.com/c/chromium/src/+/1139891 Beyond this, I don't see any reasonable-scope fixes to our use of HarfBuzz, or to the HarfBuzz API, which would fix this. Which is lower risk: That patch, or switching to RenderTextMac?
,
Jul 18
RenderTextMac is not feature-complete. It has no text-editing, selection ( Issue 722217 ), nor multiline support. It was never able to be cached efficiently by views::Label (see e.g., Issue 665280 , which led to r522617 and a saga in Issue 798927 ). RenderTextMac is a non-starter, and I don't think it's worth comparing performance of its ShapeRunList() implementation, since it lacks so much functionality we need, and has worse "real" performance when RenderText objects are cached inside toolkit-views.
,
Jul 19
MacViews triage: I'm stripping RBS from this and moving the target to M70. This is still important to fix - RTHB is one of our major perf bottlenecks - but it is no longer impacting enough to prevent ship and I don't think we'll have a solid answer by M69.
,
Jul 23
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9801165f33842fc79b27b2cf0f0f30681cc5f628 commit 9801165f33842fc79b27b2cf0f0f30681cc5f628 Author: Christopher Cameron <ccameron@chromium.org> Date: Mon Jul 23 23:42:01 2018 HarfBuzz: Identify runs with common font parameters Optimization part 2.5 of ~3 In ShapeRunsWithFont, identify runs that the same initial CommonParams and insert them into a RunVector. Change the rest of the shaping pipeline to operate on these RunVectors, and move most of the processing of CommonParams to be done once per RunVector instead of once per run. This change was intended to be part of the change to allow coalescing of similar calls into HarfBuzz (which may or may not land), but it was found to improve the performance in the cache hit case by a factor of almost 2. Bug: 862773 Change-Id: I8df02c2a8622eb9be74a65191fad79613ff1da2d Reviewed-on: https://chromium-review.googlesource.com/1145904 Commit-Queue: ccameron <ccameron@chromium.org> Reviewed-by: Leonard Grey <lgrey@chromium.org> Cr-Commit-Position: refs/heads/master@{#577330} [modify] https://crrev.com/9801165f33842fc79b27b2cf0f0f30681cc5f628/ui/gfx/render_text_harfbuzz.cc [modify] https://crrev.com/9801165f33842fc79b27b2cf0f0f30681cc5f628/ui/gfx/render_text_harfbuzz.h [modify] https://crrev.com/9801165f33842fc79b27b2cf0f0f30681cc5f628/ui/gfx/render_text_unittest.cc
,
Jul 24
+eae, +behdad I took a look at this with eae@. The font that is slow is .AppleSystemUIFont, and the function that is slow is _hb_coretext_shape. This appears not to be an AAT font (because _hb_coretext_aat_shape isn't on the stack in #1), but the shape function is shared between _hb_coretext_aat_shape and _hb_coretext_shape (see [1]), and takes about 0.5 msec per instantiation (so, 0.5 msec per word in Chrome's UI). [1] https://cs.chromium.org/chromium/src/third_party/harfbuzz-ng/src/src/hb-coretext.cc?rcl=2b76767bf572364d3d647cdd139f2044a7ad06b2&l=1410
,
Jul 25
My guess is that we spend time recreating CTFont over and over again whereas it should be cached. Are you reusing the same hb_font_t?
,
Jul 25
It was *much* worse when we were recreating the CTFont each time :). Full beachball territory -- Issue 785522 . https://chromium-review.googlesource.com/851293 switched to using `hb_coretext_font_create(ct_font);` so that the CTFont is not recreated, just wrapped in the hb_font_t. I think we _could_ (and should) cache at a higher level than what gfx::CreateHarfBuzzFont [1] currently does, but that doesn't appear to be the main bottleneck. [1] https://cs.chromium.org/chromium/src/ui/gfx/harfbuzz_font_skia.cc?type=cs&q=gfx::CreateHarfBuzzFont&sq=package:chromium&g=0&l=284
,
Jul 25
Can you do some debugging, to see if we are recreating hb_face_t / hb_font_t for each word? That will help guide where I we should look next. Thanks.
,
Jul 25
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4e2ba897d5fb48e2c5a08dcde8a353432c9a2558 commit 4e2ba897d5fb48e2c5a08dcde8a353432c9a2558 Author: Christopher Cameron <ccameron@chromium.org> Date: Wed Jul 25 20:41:42 2018 HarfBuzz: Rename CommonParams to FontParams Bug: 862773 Change-Id: Idca82f1c4da83a5cff57f77a66bd2706e842f619 Reviewed-on: https://chromium-review.googlesource.com/1149277 Commit-Queue: ccameron <ccameron@chromium.org> Reviewed-by: Leonard Grey <lgrey@chromium.org> Cr-Commit-Position: refs/heads/master@{#578043} [modify] https://crrev.com/4e2ba897d5fb48e2c5a08dcde8a353432c9a2558/ui/gfx/render_text_harfbuzz.cc [modify] https://crrev.com/4e2ba897d5fb48e2c5a08dcde8a353432c9a2558/ui/gfx/render_text_harfbuzz.h [modify] https://crrev.com/4e2ba897d5fb48e2c5a08dcde8a353432c9a2558/ui/gfx/render_text_unittest.cc
,
Jul 26
Requesting merge of the patches so far, to keep M69 in sync. There may be one more patch to improve performance, but that will probably be left until M70.
,
Jul 27
Your change meets the bar and is auto-approved for M69. Please go ahead and merge the CL to branch 3497 manually. Please contact milestone owner if you have questions. Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 27
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e4059f4faa614f3c5da7ca8bf55626bd2e058a97 commit e4059f4faa614f3c5da7ca8bf55626bd2e058a97 Author: Christopher Cameron <ccameron@chromium.org> Date: Fri Jul 27 21:14:31 2018 HarfBuzz: Identify runs with common font parameters Optimization part 2.5 of ~3 In ShapeRunsWithFont, identify runs that the same initial CommonParams and insert them into a RunVector. Change the rest of the shaping pipeline to operate on these RunVectors, and move most of the processing of CommonParams to be done once per RunVector instead of once per run. This change was intended to be part of the change to allow coalescing of similar calls into HarfBuzz (which may or may not land), but it was found to improve the performance in the cache hit case by a factor of almost 2. TBR=ccameron@chromium.org (cherry picked from commit 9801165f33842fc79b27b2cf0f0f30681cc5f628) Bug: 862773 Change-Id: I8df02c2a8622eb9be74a65191fad79613ff1da2d Reviewed-on: https://chromium-review.googlesource.com/1145904 Commit-Queue: ccameron <ccameron@chromium.org> Reviewed-by: Leonard Grey <lgrey@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#577330} Reviewed-on: https://chromium-review.googlesource.com/1153655 Reviewed-by: ccameron <ccameron@chromium.org> Cr-Commit-Position: refs/branch-heads/3497@{#174} Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753} [modify] https://crrev.com/e4059f4faa614f3c5da7ca8bf55626bd2e058a97/ui/gfx/render_text_harfbuzz.cc [modify] https://crrev.com/e4059f4faa614f3c5da7ca8bf55626bd2e058a97/ui/gfx/render_text_harfbuzz.h [modify] https://crrev.com/e4059f4faa614f3c5da7ca8bf55626bd2e058a97/ui/gfx/render_text_unittest.cc
,
Jul 27
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a93590b5b30cb74c8d92222dfc637986053f9ee4 commit a93590b5b30cb74c8d92222dfc637986053f9ee4 Author: Christopher Cameron <ccameron@chromium.org> Date: Fri Jul 27 21:50:25 2018 HarfBuzz: Rename CommonParams to FontParams TBR=ccameron@chromium.org (cherry picked from commit 4e2ba897d5fb48e2c5a08dcde8a353432c9a2558) Bug: 862773 Change-Id: Idca82f1c4da83a5cff57f77a66bd2706e842f619 Reviewed-on: https://chromium-review.googlesource.com/1149277 Commit-Queue: ccameron <ccameron@chromium.org> Reviewed-by: Leonard Grey <lgrey@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#578043} Reviewed-on: https://chromium-review.googlesource.com/1153636 Reviewed-by: ccameron <ccameron@chromium.org> Cr-Commit-Position: refs/branch-heads/3497@{#179} Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753} [modify] https://crrev.com/a93590b5b30cb74c8d92222dfc637986053f9ee4/ui/gfx/render_text_harfbuzz.cc [modify] https://crrev.com/a93590b5b30cb74c8d92222dfc637986053f9ee4/ui/gfx/render_text_harfbuzz.h [modify] https://crrev.com/a93590b5b30cb74c8d92222dfc637986053f9ee4/ui/gfx/render_text_unittest.cc
,
Jul 31
Re #23: Yes, we are indeed re-creating the hb_font_t for every word at [1]. It would be very easy for us not to do that, but, like tapted@ mentioned in #22, that's not where the slowness is. The root of the slowness is the call to CTTypesetterCreateWithAttributedStringAndOptions at [2]. When I discussed this with eae@, it sounded as though it could be possible to shape these fonts without going through CoreText in this way -- did I understand that correctly? (this is a path that is shared with AAT fonts) [1] https://cs.chromium.org/chromium/src/ui/gfx/render_text_harfbuzz.cc?rcl=ec4400b28a913290681684212c9dfccb5ea103ed&l=1091 [2] https://cs.chromium.org/chromium/src/third_party/harfbuzz-ng/src/src/hb-coretext.cc?rcl=2b76767bf572364d3d647cdd139f2044a7ad06b2&l=970
,
Jul 31
If it's really easy, please fix the hb_font_t recreation first and then we'll look at what's slow. It doesn't currently make sense to me, as Blink is going through the same code path and does not have this problem.
,
Aug 1
Re #20: > This appears not to be an AAT font [...] If you see slowness through CoreText calls, such as CTTypesetterCreateWithAttributedStringAndOptions, then you're reaching the hb-coretext code path, which should only be taken for AAT fonts. On Mac OS from 10.12. San Francisco is the system UI font, which does make use of AAT features.
,
Aug 1
Is there a plan to improve shaping of AAT fonts in HarfBuzz?
,
Aug 1
(or alternatively, is there a plan to move the text shaping from blink into ui/, since it is much more effective at minimizing calls to CT?)
,
Aug 2
> Is there a plan to improve shaping of AAT fonts in HarfBuzz? Yes. We like to implement AAT shaping natively in HarfBuzz. Here's the issue tracking that: https://github.com/harfbuzz/harfbuzz/issues/322 This is about 50% implemented already. I hope to finish it later this year. That should give up huge speedup.
,
Oct 12
This sounds related to this: https://bugs.chromium.org/p/chromium/issues/detail?id=894459
,
Oct 14
Issue 619684 was a mac-specific bug relating to font fallbacks, which seems to be the bottleneck for Issue 894459 on Windows. See comments in Issue 619684 for more info. We should call gfx::GetFallbackFonts (plural) only in very rare cases on Mac. Likely it will _also_ be drawing the unicode string incorrectly (e.g. with tofu glyphs), and a GetFallbackFonts call is a last-ditch attempt to replace some of the tofu. On Mac, the API we call from bool GetFallbackFont(const Font& font, const base::char16* text, int text_length, Font* result); (i.e. CTFontCreateForString()), is likely to always give the optimal font for a given text run. (We should probably make GetFallbackFonts a member function on gfx::Font, which atomically caches the result after generating the vector the first time).
,
Oct 15
I'm adding UMA metrics to get more information on this: https://chromium-review.googlesource.com/c/chromium/src/+/1281227
,
Oct 29
ccameron@, do you have any reproducible benchmark steps or measurements that I could try out? Working together with behdad@, we are very close to merging native AAT support in HarfBuzz to work in Chrome, so that we do not need to go through CoreText anymore. It would be great to collect some numbers: If you have any steps, I could give it a go. Thanks. https://chromium-review.googlesource.com/c/chromium/src/+/1275945
,
Nov 28
That's great news! The benchmarks that I had were hacky things that I'd thrown into unittests (perhaps I should have committed them). Probably a good test would be to try using a bookmark folder (as in the video in issue 826265 ), with can_use_cache forced to false at [0]. I suspect that with the HB roll, the performance will be okay without the cache (though we can leave the cache for now). [0] https://cs.chromium.org/chromium/src/ui/gfx/render_text_harfbuzz.cc?rcl=684a87668622e1ae8e85bb1bb514ce6a42b43565&l=1909 |
|||||||||||||
►
Sign in to add a comment |
|||||||||||||
Comment 1 by ccameron@chromium.org
, Jul 11291 KB
291 KB View Download