HarfBuzzShaper::CollectFallbackHintChars can be expensive in some cases |
||||
Issue descriptionI haven't done enough analysis but under some conditions, collecting hint characters in HarfBuzzShaper::CollectFallbackHintChars can be expensive. The hot line from these analysis is: hint.push_back(hint_char); * The first picture of crbug.com/645035#c8 showed 4.78% of total time. * crbug.com/645035#c11 showed 76.27% of total time.
,
Jun 1 2018
65561, that's a lot...
,
Jun 1 2018
So in the first pass or so, coming in with 100k words, there are 65561 characters that were not shaped with the primary font? Could you sketch out what the call patterns are?
,
Jun 4 2018
yosin@ and I read some of the code together today, it looks like the code collects all characters regardless whether fallback occurs or not. From what you said, it looks like it's not working as you intended? Or maybe we read only small portion of the code, we might have been wrong somewhere. You can reproduce by adding "--enable-blink-features=LayoutNG" if you can look into it, or we can try to read and learn the code more if you're too busy.
,
Jun 4 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/690b9ae8595d4c1aa21b296b16f55c0410d7457b commit 690b9ae8595d4c1aa21b296b16f55c0410d7457b Author: Koji Ishii <kojii@chromium.org> Date: Mon Jun 04 17:02:46 2018 Avoid reallocations of fallback_chars_hint Currently, CollectFallbackHintChars may collect all characters passed to HarfBuzzShaper. In the current layout engine, this is unlikely to be large because the layout engine and CachingWordShaper takes care of the input strings to HarfBuzzShaper to be short in most cases. However, LayoutNG gives as long string as possible, and that the reallocations is consuming CPU time in some tests. This patch avoids reallocation by reservering enough capacity up front. How much we really need to collect characters is to be considered in future. Bug: 848295 Change-Id: Id9f751899549dad4835dc66b3a452738e1baa785 Reviewed-on: https://chromium-review.googlesource.com/1082144 Commit-Queue: Dominik Röttsches <drott@chromium.org> Reviewed-by: Dominik Röttsches <drott@chromium.org> Reviewed-by: Emil A Eklund <eae@chromium.org> Cr-Commit-Position: refs/heads/master@{#564120} [modify] https://crrev.com/690b9ae8595d4c1aa21b296b16f55c0410d7457b/third_party/blink/renderer/platform/fonts/shaping/harf_buzz_shaper.cc
,
Jun 5 2018
Total execution time is 28sec on Z840 Win10 release build.
hint.push_back(hint_char) still take 75.12% of total execution time
bool HarfBuzzShaper::CollectFallbackHintChars(
const Deque<ReshapeQueueItem>& reshape_queue,
Vector<UChar32>& hint) const {
if (!reshape_queue.size())
return false;
// Clear without releasing the capacity to avoid reallocations.
hint.resize(0);
size_t num_chars_added = 0;
for (auto it = reshape_queue.begin(); it != reshape_queue.end(); ++it) {
if (it->action_ == kReshapeQueueNextFont)
break;
UChar32 hint_char;
CHECK_LE((it->start_index_ + it->num_characters_), text_length_);
UTF16TextIterator iterator(text_ + it->start_index_, it->num_characters_);
1908ms (6.77%) while (iterator.Consume(hint_char)) {
21169ms (75.12%) hint.push_back(hint_char);
1070ms (3.80%) num_chars_added++;
1587ms (5.63%) iterator.Advance();
}
}
return num_chars_added > 0;
}
,
Jun 5 2018
For external/wpt/selection/collapse-30.html. HarfBuzzShaper::CollectFallbackHintChars() is called 5137 times and produces 429,058 hint characters.
,
Jun 5 2018
Another CL in queue, but this also improves how we parse and store the string, not the logic itself. I can help more efficient storage, but too unfamiliar with the logic, great if drott@ can jump in to help to improve the logic... https://chromium-review.googlesource.com/c/chromium/src/+/1086810
,
Jul 15
It indeed looks like we *always* collect fallback hint characters regardless of whether fallback occurs or not. Fixing that is likely a better solution than trying to optimize the collection further. It also appears to loop over the same string multiple times. For the latin line layout performance tests (perf_tests/layout/line-layout.html and friends) 60-80% of all shaping time is spent collecting fallback hints even though no fallback occurs.
,
Jul 16
The CL in #8 collects StringView (pointer and length) instead, and defer reading characters until it's needed. It's not really changing the fundamental logic at all, but is a bit more efficient. drott@ said he can make it better at that moment, and yosin@ seems to have some more ideas by making larger changes. I stopped moving the CL forward but unfortunately we're not making the real progress. If we're ok to make incremental progress, I can try to cleanup #8. Thoughts?
,
Jul 16
> It also appears to loop over the same string multiple times. For the latin line layout performance tests (perf_tests/layout/line-layout.html and friends) 60-80% of all shaping time is spent collecting fallback hints even though no fallback occurs. I can try to set some time aside later this week when I am back from my workshop and take a look. How do I run this test on the command line and enable NG?
,
Jul 16
Thanks drott, I know you have your plate full at the moment, it's not super urgent and can certainly wait until you're back. The easiest way is to open the perf test itself in a browser. If you compile with profiling enabled (enable_profiling = true) the following will capture profiling data: ./out/Release/chrome --no-sandbox --profiling-at-start=renderer --enable-blink-features=LayoutNG third_party/blink/perf_tests/layout/line-layout.html That'll create a chrome-profile-renderer-<pid> file in your working directory which in turn can be analyzed using pprof. pprof -web chrome-profile-renderer-<pid> The LayoutNG profiles spend more time in shaping (as there is no cache) so it's a bit easier to analyze the profiling data but the same patterns hold true for non-NG calls as well.
,
Jul 19
The hintlist is needed to compare it against one or multiple unicode-range definitions of a segmented font-family and trigger loading those. In short, if one of the unshaped characters in the TODO list is contained in one of the unicode ranges in the segmented font, load it. In principle, this can be relevant even for the primary font, so hint chars are collected before shaping with the first font. The implementation is quite generic in the sense that it always does that, even though there are no unicode-range restricted / segmented fonts in the fallback chain. If we don't have such a segmented font, or more specifically (if possible to implement) if we don't have such as the next upcoming fallback font, we do not need to collect all hint characters, but only the next best one, as described in FontFallbackList::ChooseHintIndex. I'll see how complicated that would be to implement.
,
Aug 6
NG Win profile result: https://pinpoint-dot-chromeperf.appspot.com/job/119b5c77a40000 Proflied one of the worst one, word-break-break-word is slower by 566%. This test consumes 90% in CollectFallbackHintChars, and 81.86% in this line: hint.push_back(text_[it->start_index_ + i]);
,
Aug 6
word-break-break-word on local Win x64 official: Current: 238ms NG ToT: 1,692ms NG + #8 CL: 100ms NG seems to be faster for word-break-break-word if this issue was resolved. I'll try to run pinpoint with #8.
,
Aug 6
blink_perf.layout with and without #8 applied: https://pinpoint-dot-chromeperf.appspot.com/job/14f04d34640000 What we really want to know, in order to analyze other perf issues, is to compare legacy with NG + #8. That's probably possible, haven't got to it yet. Hopefully this can be resolved quickly enough that we can wait other perf work after this?
,
Aug 13
Experimental CL up on the CQ https://chromium-review.googlesource.com/c/chromium/src/+/1172662 kojii@, can you try and get numbers with this one? I am not yet 100% certain it is entirely correct, but some local testing looks promising.
,
Aug 14
word-break-break-word on local Win x64 official: NG + #17 CL: 83ms Looks promising.
,
Aug 14
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/14989b82640000
,
Aug 14
📍 Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/14989b82640000
,
Aug 14
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/16becdb0640000
,
Aug 14
📍 Job complete. See results below. https://pinpoint-dot-chromeperf.appspot.com/job/16becdb0640000
,
Aug 23
,
Aug 24
Having given this some more thought regarding test cases, I'll write down a bit of a note to self: The situation when collecting fallback hint chars is important especially in the following scenario: * We're going through the font stack font family by font family. * When selecting one family, we have to check whether it is a segmented font or not. * If it is segmented, we go through the different @font-face declarations, with differing unicode-range descriptors in reverse order of declaration. * Once we have determined that there is nothing more to shape with this segmented font family, we continue to the next family. The important situation regarding hint chars is as follows, in an example: 1) The first character of the remaining text to be shaped does not need the first @font-face, but only uses the second. 2) The second character needs the first @font-face of the segmented font-face, but since we go through them linearly, we are not going to find it anymore and will shape this character with the wrong font. If, at step 1 hint chars are collected, all @font-face's from a segmented font are selected. fast/css/font-face-unicode-range.html does cover this issue, at least for a simple sequence of a primary unsegmented font, and a secondary segmented font.
,
Aug 24
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/59f30549864866a623ed7f2a0f21637be572fe62 commit 59f30549864866a623ed7f2a0f21637be572fe62 Author: Dominik Röttsches <drott@chromium.org> Date: Fri Aug 24 16:42:39 2018 Skip building full hint list for non-segmented fonts HarfBuzzShaper asks FontFallbackIterator whether a full hint list will be required. A full hint list is only required when determining which parts of a segmented font need to be downloaded. A segmented font face is a face that is composed out of multiple @font-face rules with different unicode-range: descriptors. The fallback characters need to be compared against the unicode-range: values. In other cases, only the first character that has a definite script is useful for fallback. A segmented font is a set of @font-face declarations with the same family name, but different unicode-range: descriptors. The order of performing font fallback with segmented fonts in the stack looks as follows: * We're going through the font stack font family by font family. * When selecting one family, we have to check whether it is a segmented font or not. * If it is segmented, we go sequentially through the different @font-face declarations, with differing unicode-range descriptors in reverse order of declaration. * Once we have determined that there is nothing more to shape with this segmented font family, we continue to the next family. The important situation regarding hint chars is as follows, in an example: 1) The first character of the remaining text to be shaped does not need the first segment of a @font-face, but only uses the second. 2) The second character needs the first segment of the @font-face, but since we go through them in order, we are not going to go back and find it anymore and will shape this character with the wrong font. If, at step 1 hint chars are collected, all @font-face's from a segmented font are selected, as all characters are compared against all unicode-ranges. fast/css/font-face-unicode-range.html was transformed into a inspector-protocol/layout style test and one case was added to check that the above mechanism works after falling back from a primary font. Bug: 848295 Cq-Include-Trybots: luci.chromium.try:linux_layout_tests_layout_ng Change-Id: I38ddc196c02d92bb4608928772edee07203f98e7 Reviewed-on: https://chromium-review.googlesource.com/1172662 Commit-Queue: Dominik Röttsches <drott@chromium.org> Reviewed-by: Emil A Eklund <eae@chromium.org> Cr-Commit-Position: refs/heads/master@{#585872} [delete] https://crrev.com/3af420d0f4d2c370a477186db5b90e0fe7db9b11/third_party/WebKit/LayoutTests/fast/css/font-face-unicode-range-expected.html [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/inspector-protocol/layout-fonts/resources/layout-font-test.js [rename] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/inspector-protocol/layout-fonts/unicode-range-order.js [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/linux/inspector-protocol/layout-fonts/ogham-expected.txt [add] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/linux/inspector-protocol/layout-fonts/unicode-range-order-expected.txt [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/mac-mac10.10/inspector-protocol/layout-fonts/ogham-expected.txt [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/mac-mac10.11/inspector-protocol/layout-fonts/ogham-expected.txt [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/mac/inspector-protocol/layout-fonts/ogham-expected.txt [add] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/mac/inspector-protocol/layout-fonts/unicode-range-order-expected.txt [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/win/inspector-protocol/layout-fonts/ogham-expected.txt [add] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/win/inspector-protocol/layout-fonts/unicode-range-order-expected.txt [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/WebKit/LayoutTests/platform/win7/inspector-protocol/layout-fonts/ogham-expected.txt [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/fonts/font_fallback_iterator.cc [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/fonts/font_fallback_iterator.h [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/fonts/shaping/harfbuzz_shaper.cc [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/fonts/shaping/harfbuzz_shaper.h [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/text/character.cc [modify] https://crrev.com/59f30549864866a623ed7f2a0f21637be572fe62/third_party/blink/renderer/platform/text/character.h
,
Aug 24
Thank you drott! |
||||
►
Sign in to add a comment |
||||
Comment 1 by kojii@chromium.org
, Jun 1 2018