New issue
Advanced search Search tips

Issue 848295 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 24
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Task



Sign in to add a comment

HarfBuzzShaper::CollectFallbackHintChars can be expensive in some cases

Project Member Reported by kojii@chromium.org, May 31 2018

Issue description

I haven't done enough analysis but under some conditions, collecting hint characters in HarfBuzzShaper::CollectFallbackHintChars can be expensive.

The hot line from these analysis is:
  hint.push_back(hint_char);

* The first picture of crbug.com/645035#c8 showed 4.78% of total time.
* crbug.com/645035#c11 showed 76.27% of total time.

 

Comment 1 by kojii@chromium.org, Jun 1 2018

Ran external/wpt/selection/collapse-30.html in the debugger, we create an array of 65561 items. Vector's initial capacity is 4, and double as needed, so we're reallocating a lot in this case.

Comment 2 by e...@chromium.org, Jun 1 2018

65561, that's a lot...

Comment 3 by drott@chromium.org, Jun 1 2018

So in the first pass or so, coming in with 100k words, there are 65561 characters that were not shaped with the primary font? Could you sketch out what the call patterns are?

Comment 4 by kojii@chromium.org, Jun 4 2018

yosin@ and I read some of the code together today, it looks like the code collects all characters regardless whether fallback occurs or not. From what you said, it looks like it's not working as you intended? Or maybe we read only small portion of the code, we might have been wrong somewhere.

You can reproduce by adding "--enable-blink-features=LayoutNG" if you can look into it, or we can try to read and learn the code more if you're too busy.
Project Member

Comment 5 by bugdroid1@chromium.org, Jun 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/690b9ae8595d4c1aa21b296b16f55c0410d7457b

commit 690b9ae8595d4c1aa21b296b16f55c0410d7457b
Author: Koji Ishii <kojii@chromium.org>
Date: Mon Jun 04 17:02:46 2018

Avoid reallocations of fallback_chars_hint

Currently, CollectFallbackHintChars may collect all characters
passed to HarfBuzzShaper. In the current layout engine, this
is unlikely to be large because the layout engine and
CachingWordShaper takes care of the input strings to
HarfBuzzShaper to be short in most cases.

However, LayoutNG gives as long string as possible, and that
the reallocations is consuming CPU time in some tests.

This patch avoids reallocation by reservering enough capacity
up front. How much we really need to collect characters is
to be considered in future.

Bug:  848295 
Change-Id: Id9f751899549dad4835dc66b3a452738e1baa785
Reviewed-on: https://chromium-review.googlesource.com/1082144
Commit-Queue: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Emil A Eklund <eae@chromium.org>
Cr-Commit-Position: refs/heads/master@{#564120}
[modify] https://crrev.com/690b9ae8595d4c1aa21b296b16f55c0410d7457b/third_party/blink/renderer/platform/fonts/shaping/harf_buzz_shaper.cc

Comment 6 by yosin@chromium.org, Jun 5 2018

Total execution time is 28sec on Z840 Win10 release build.
hint.push_back(hint_char) still take 75.12% of total execution time

            bool HarfBuzzShaper::CollectFallbackHintChars(
                const Deque<ReshapeQueueItem>& reshape_queue,
                Vector<UChar32>& hint) const {
              if (!reshape_queue.size())
                return false;

              // Clear without releasing the capacity to avoid reallocations.
              hint.resize(0);

              size_t num_chars_added = 0;
              for (auto it = reshape_queue.begin(); it != reshape_queue.end(); ++it) {
                if (it->action_ == kReshapeQueueNextFont)
                  break;

                UChar32 hint_char;
                CHECK_LE((it->start_index_ + it->num_characters_), text_length_);
                UTF16TextIterator iterator(text_ + it->start_index_, it->num_characters_);
1908ms (6.77%)  while (iterator.Consume(hint_char)) {
21169ms (75.12%)  hint.push_back(hint_char);
1070ms (3.80%)    num_chars_added++;
1587ms (5.63%)    iterator.Advance();
                }
              }
              return num_chars_added > 0;
            }

Comment 7 by yosin@chromium.org, Jun 5 2018

For external/wpt/selection/collapse-30.html.
HarfBuzzShaper::CollectFallbackHintChars() is called 5137 times and produces
429,058 hint characters.

Comment 8 by kojii@chromium.org, Jun 5 2018

Another CL in queue, but this also improves how we parse and store the string, not the logic itself. I can help more efficient storage, but too unfamiliar with the logic, great if drott@ can jump in to help to improve the logic...

https://chromium-review.googlesource.com/c/chromium/src/+/1086810
Labels: -Pri-3 Pri-2
It indeed looks like we *always* collect fallback hint characters regardless of whether fallback occurs or not. Fixing that is likely a better solution than trying to optimize the collection further.

It also appears to loop over the same string multiple times.
For the latin line layout performance tests (perf_tests/layout/line-layout.html and friends) 60-80% of all shaping time is spent collecting fallback hints even though no fallback occurs.



The CL in #8 collects StringView (pointer and length) instead, and defer reading characters until it's needed. It's not really changing the fundamental logic at all, but is a bit more efficient.

drott@ said he can make it better at that moment, and yosin@ seems to have some more ideas by making larger changes. I stopped moving the CL forward but unfortunately we're not making the real progress.

If we're ok to make incremental progress, I can try to cleanup #8. Thoughts?
> It also appears to loop over the same string multiple times.
For the latin line layout performance tests (perf_tests/layout/line-layout.html and friends) 60-80% of all shaping time is spent collecting fallback hints even though no fallback occurs.

I can try to set some time aside later this week when I am back from my workshop and take a look. How do I run this test on the command line and enable NG? 
Thanks drott, I know you have your plate full at the moment, it's not super urgent and can certainly wait until you're back.

The easiest way is to open the perf test itself in a browser. If you compile with profiling enabled (enable_profiling = true) the following will capture profiling data:

./out/Release/chrome --no-sandbox --profiling-at-start=renderer --enable-blink-features=LayoutNG third_party/blink/perf_tests/layout/line-layout.html   

That'll create a chrome-profile-renderer-<pid> file in your working directory which in turn can be analyzed using pprof.

pprof -web chrome-profile-renderer-<pid>

The LayoutNG profiles spend more time in shaping (as there is no cache) so it's a bit easier to analyze the profiling data but the same patterns hold true for non-NG calls as well.

The hintlist is needed to compare it against one or multiple unicode-range definitions of a segmented font-family and trigger loading those. In short, if one of the unshaped characters in the TODO list is contained in one of the unicode ranges in the segmented font, load it.

In principle, this can be relevant even for the primary font, so hint chars are collected before shaping with the first font.

The implementation is quite generic in the sense that it always does that, even though there are no unicode-range restricted / segmented fonts in the fallback chain. If we don't have such a segmented font, or more specifically (if possible to implement) if we don't have such as the next upcoming fallback font, we do not need to collect all hint characters, but only the next best one, as described in FontFallbackList::ChooseHintIndex.

I'll see how complicated that would be to implement.






NG Win profile result:
https://pinpoint-dot-chromeperf.appspot.com/job/119b5c77a40000

Proflied one of the worst one, word-break-break-word is slower by 566%. This test consumes 90% in CollectFallbackHintChars, and 81.86% in this line:
        hint.push_back(text_[it->start_index_ + i]);
word-break-break-word on local Win x64 official:
  Current:      238ms
  NG ToT:     1,692ms
  NG + #8 CL:   100ms

NG seems to be faster for word-break-break-word if this issue was resolved. I'll try to run pinpoint with #8.
blink_perf.layout with and without #8 applied:
https://pinpoint-dot-chromeperf.appspot.com/job/14f04d34640000

What we really want to know, in order to analyze other perf issues, is to compare legacy with NG + #8. That's probably possible, haven't got to it yet.

Hopefully this can be resolved quickly enough that we can wait other perf work after this?
Experimental CL up on the CQ https://chromium-review.googlesource.com/c/chromium/src/+/1172662 

kojii@, can you try and get numbers with this one? I am not yet 100% certain it is entirely correct, but some local testing looks promising.

word-break-break-word on local Win x64 official:
  NG + #17 CL:   83ms

Looks promising.
📍 Job complete. See results below.
https://pinpoint-dot-chromeperf.appspot.com/job/14989b82640000
📍 Job complete. See results below.
https://pinpoint-dot-chromeperf.appspot.com/job/16becdb0640000
Owner: drott@chromium.org
Status: Started (was: Available)
Having given this some more thought regarding test cases, I'll write down a bit of a note to self:

The situation when collecting fallback hint chars is important especially in the following scenario:

* We're going through the font stack font family by font family.
* When selecting one family, we have to check whether it is a segmented font or not.
* If it is segmented, we go through the different @font-face declarations, with differing unicode-range descriptors in reverse order of declaration.
* Once we have determined that there is nothing more to shape with this segmented font family, we continue to the next family.

The important situation regarding hint chars is as follows, in an example:
1) The first character of the remaining text to be shaped does not need the first @font-face, but only uses the second.
2) The second character needs the first @font-face of the segmented font-face, but since we go through them linearly, we are not going to find it anymore and will shape this character with the wrong font.

If, at step 1 hint chars are collected, all @font-face's from a segmented font are selected. fast/css/font-face-unicode-range.html does cover this issue, at least for a simple sequence of a primary unsegmented font, and a secondary segmented font.

Project Member

Comment 25 by bugdroid1@chromium.org, Aug 24

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/59f30549864866a623ed7f2a0f21637be572fe62

commit 59f30549864866a623ed7f2a0f21637be572fe62
Author: Dominik Röttsches <drott@chromium.org>
Date: Fri Aug 24 16:42:39 2018

Skip building full hint list for non-segmented fonts

HarfBuzzShaper asks FontFallbackIterator whether a full hint list will
be required. A full hint list is only required when determining which
parts of a segmented font need to be downloaded. A segmented font face
is a face that is composed out of multiple @font-face rules with
different unicode-range: descriptors. The fallback characters need to be
compared against the unicode-range: values. In other cases, only the
first character that has a definite script is useful for fallback.

A segmented font is a set of @font-face declarations with the same
family name, but different unicode-range: descriptors.

The order of performing font fallback with segmented fonts in the stack
looks as follows:

* We're going through the font stack font family by font family.
* When selecting one family, we have to check whether it is a segmented
  font or not.
* If it is segmented, we go sequentially through the different
  @font-face declarations, with differing unicode-range descriptors in
  reverse order of declaration.
* Once we have determined that there is nothing more to shape with this
  segmented font family, we continue to the next family.

The important situation regarding hint chars is as follows, in an
example:

1) The first character of the remaining text to be shaped does not need
the first segment of a @font-face, but only uses the second.

2) The second character needs the first segment of the @font-face, but
since we go through them in order, we are not going to go back and find
it anymore and will shape this character with the wrong font.

If, at step 1 hint chars are collected, all @font-face's from a
segmented font are selected, as all characters are compared against all
unicode-ranges.

fast/css/font-face-unicode-range.html was transformed into a
inspector-protocol/layout style test and one case was added to check
that the above mechanism works after falling back from a primary font.

Bug:  848295 
Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_layout_ng
Change-Id: I38ddc196c02d92bb4608928772edee07203f98e7
Reviewed-on: https://chromium-review.googlesource.com/1172662
Commit-Queue: Dominik Röttsches <drott@chromium.org>
Reviewed-by: Emil A Eklund <eae@chromium.org>
Cr-Commit-Position: refs/heads/master@{#585872}
[delete] https://crrev.com/3af420d0f4d2c370a477186db5b90e0fe7db9b11/third_party/WebKit/LayoutTests/fast/css/font-face-unicode-range-expected.html
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/inspector-protocol/layout-fonts/resources/layout-font-test.js
[rename] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/inspector-protocol/layout-fonts/unicode-range-order.js
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/linux/inspector-protocol/layout-fonts/ogham-expected.txt
[add] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/linux/inspector-protocol/layout-fonts/unicode-range-order-expected.txt
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/mac-mac10.10/inspector-protocol/layout-fonts/ogham-expected.txt
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/mac-mac10.11/inspector-protocol/layout-fonts/ogham-expected.txt
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/mac/inspector-protocol/layout-fonts/ogham-expected.txt
[add] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/mac/inspector-protocol/layout-fonts/unicode-range-order-expected.txt
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/win/inspector-protocol/layout-fonts/ogham-expected.txt
[add] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/win/inspector-protocol/layout-fonts/unicode-range-order-expected.txt
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/win7/inspector-protocol/layout-fonts/ogham-expected.txt
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/fonts/font_fallback_iterator.cc
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/fonts/font_fallback_iterator.h
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/fonts/shaping/harfbuzz_shaper.cc
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/fonts/shaping/harfbuzz_shaper.h
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/text/character.cc
[modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/text/character.h

Status: Fixed (was: Started)
Thank you drott!

Sign in to add a comment