Many comparisons and assignment with string literals inefficiently call strlen |
|||||
Issue descriptionFor these comparisons in the critical path, we should be using StringView/StringPiece with an explicit length constructor. I just noticed one in WebURL::protocolIs and a few in //url code.
,
Dec 24 2016
3
,
Dec 25 2016
Yeah this is possible, I'm working through a change [1] that does this to StringPiece in base. [1]: https://codereview.chromium.org/2579473002/
,
Dec 25 2016
Oh, that's cool, thanks! One question, if you don't mind. Your CL will require to change all call sites to get a profit, isn't it? I thought about non-explicit constructor to avoid changing in all lines of code. But that can lead to a code bloat. Do you think that a helper function is better than a constructor for this?
,
Dec 25 2016
Adding a constructor is probably better, but I was having trouble getting the right constructor to be chosen in all circumstances over the constructor which takes a const char*. Note that there is some danger that the StringPiece could have embedded nulls if we don't add DCHECKs for it.
,
Dec 25 2016
Yes, I had the same problem with constructors. Non templated version is a more suitable for the compiler. Don't you mind if I will try to make a similar CL for the StringView class in Blink? I can compare binary size with and without templates.
,
Dec 25 2016
Yes of course! Feel free, I don't have much time atm because of holidays. esprehn@ may have an opinion on StringView stuff, so ccing.
,
Dec 25 2016
This isn't needed, the way StringPiece and StringView work we intentionally inline the strlen call and the compiler actually embeds the length in the binary for us and removes the call. I verified this in all our compilers with our optimization settings. We don't need to use templates and using them makes it ambiguous for usage with const char*.
,
Dec 25 2016
See ex. https://codereview.chromium.org/1844223002/ If you have some call stacks or build configs where this isn't happening I'd love to see it. I've been moving in the opposite direction though relying more and more on the strlen optimization and removing our hacks.
,
Dec 26 2016
I double checked this last week w/ a Linux build and found calls to strlen that were not inlined, but I may have gotten my build a little off. All optimizations is just is_debug=false and is_official_build=true right? Or are there more flags? The reason I filed the bug was because I saw these in profiles for e.g. WebURL::protocolIs() and a few other places for StringPiece.
,
Dec 26 2016
Interesting, let's get the disassembly of the functions where those calls are not getting inlined. I'd rather we fixed the clang optimizer (even if we need new flags) before we start adding ASCII literal hacks again. +krasin@ our friendly clang optimizer expert. :)
,
Dec 26 2016
If functions like WebURL::protocolIs() takes a |const char*| pointer as a parameter and than constructs a StringView object to compare it, strlen() cannot be calculated at the compile time. Maybe we can just use StringView instead of |const char*| as a parameter in functions like WebURL::protocolIs() ? Will it be enough to avoid calling strlen() ?
,
Dec 26 2016
Yeah those should take a StringPiece and be guarded by the !INSIDE_BLINK ifdefs. That should make the strlen into compile time calls and remove it from the profiles.
,
Dec 27 2016
Hmm, just got interested and ran instruments with ToT release build, but on my local Mac I wasn't able to see strlen for WebURL::protocolIs(). Optimization might be going differently on Linux or I'm missing something? (To add a note, I could confirm that strlen() surely shows up for WebURL::protocolIs() if I revert Charle's latest protocolIs optimization: https://chromium.googlesource.com/chromium/src/+/f08b4560dee36e6db75e84dad3ec8fbb80622639%5E%21/#F0 )
,
Dec 27 2016
I would expect it to show up there since it's a non inline function that takes a const char*. Anywhere we take a StringPiece/StringView should remove the strlen call for literals though so just making protocolIs take one of those would fix this call site.
,
Dec 27 2016
Yep, turning it to take StringPiece sounds good. What I did was sampled profiling so it probably just didn't catch it.
,
Dec 28 2016
I did a little more experiment out of curiosity, but it seems it may not work as desired in our cases.
I tried to change WebURL::protocolIs to take StringPiece, built chromium as release build (clang, -O2) and looked into how it gets compiled/linked by otool and lldb & disassemble. There're not many callsites of WebURL::protocolIs() and all of them take some URL constants like url::kDataScheme.
With is_official_build = true, is_debug = false config, all the following code still calls strlen:
> if (request.url().protocolIs(url::kDataScheme))
> if (request.url().protocolIs(base::StringPiece(url::kDataScheme)))
> if (request.url().protocolIs(base::StringPiece(url::kDataScheme, strlen(url::kDataScheme)))
000000000001fbec movq __ZN3url11kDataSchemeE(%rip), %rdi
000000000001fbf3 movq %rdi, -0x48(%rbp)
000000000001fbf7 callq _strlen
000000000001fbfc movq %rax, -0x40(%rbp)
000000000001fc00 leaq -0x48(%rbp), %rsi
000000000001fc04 movq %r12, %rdi
000000000001fc07 callq __ZNK5blink6WebURL10protocolIsERKN4base16BasicStringPieceINSt3__112basic_stringIcNS3_11char_traitsIcEENS3_9allocatorIcEEEEEE
While if the constant is defined in the same translation unit it successfully optimizes away strlen with an immediate value:
> if (request.url().protocolIs(kData)) // kData is locally defined static variable
> if (request.url().protocolIs("data"))
000000000001fbec leaq __ZZN7content15RenderFrameImpl30didLoadResourceFromMemoryCacheERKN5blink13WebURLRequestERKNS1_14WebURLResponseEE5kData(%rip), %rax
000000000001fbf3 movq %rax, -0x48(%rbp)
000000000001fbf7 movq $0x4, -0x40(%rbp)
000000000001fbff leaq -0x48(%rbp), %rsi
000000000001fc03 movq %r12, %rdi
000000000001fc06 callq __ZNK5blink6WebURL10protocolIsERKN4base16BasicStringPieceINSt3__112basic_stringIcNS3_1char_traitsIcEENS3_9allocatorIcEEEEEE
,
Jan 3 2017
I expect in many, many cases StringPiece/StringView is using a constant defined in another translation unit. Unless clang can fix this (way outside my area of expertise) it feels like we should do something about it without forcing devs to write things like StringPiece(chrome::kExtensionScheme, 17 /* please never change */) in hot code paths.
,
Jan 3 2017
+thakis who I had added as a reviewer for r2579473002
,
Jan 3 2017
Comment 18 sounds right to me, this is similar to an inline function and an out-of-line function. Are many of these in hot paths? If it's just a few, we can change // foo.h extern const char kFoo1[]; extern const char kFoo2[]; // foo.cc const char kFoo1[] = "foo1"; const char kFoo2[] = "foo2"; to // foo.h const char kFoo1[] = "foo1"; // intentionally inline; used in hot path extern const char kFoo2[]; // foo.cc const char kFoo2[] = "foo2"; const has internal linkage by default (unless explicitly disabled via `extern const`), so that way the literal will get emitted into every translation unit that uses that literal (and the later probably ICF'd again), but in return every translation unit will know the value of the string and will be able to compute strlen at compile time. (r2579473002 won't help either when the definition of a string isn't available, right?)
,
Jan 3 2017
That's a good point. I can try that and see what the effects are (and validate results from kinuko@ in #17). I'm not sure how many of these there are in hot paths. I am focused on loading code so I could probably just change the few I am noticing, but there are surely more in areas I am not profiling. Probably we can abandon that patch.
,
Jan 9 2017
WDYT about making class |base::StringPiece| to be constexpr? Can we use |base::StringPiece| instead of raw string literals? Something like that: // foo.h extern constexpr base::StringPiece kFoo; // foo.cc constexpr base::StringPiece kFoo = "foo";
,
Jan 9 2017
imo the issue here is "out of line string definitions can prevent some optimizations". The strlen goes away with StringPiece, but other optimizations are still prevented if the string is out of line. (Making parts of StringPiece constexpr and more like string_view probably does make sense independent of this bug, but I feel it's a bit different from what this bug is about maybe)
,
Jan 27 2017
(Slightly changing the subject)
,
Jan 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e973df73509337cabf6122bb4e15aa5a806e38b4 commit e973df73509337cabf6122bb4e15aa5a806e38b4 Author: kinuko <kinuko@chromium.org> Date: Fri Jan 27 15:14:58 2017 Make WTFString constructor that takes const char* call strlen inline This allows us to optimize away strlens for creating String from literal strings. (This actually showed up on my local profiling) BUG=673376 Review-Url: https://codereview.chromium.org/2658703005 Cr-Commit-Position: refs/heads/master@{#446670} [modify] https://crrev.com/e973df73509337cabf6122bb4e15aa5a806e38b4/third_party/WebKit/Source/wtf/text/WTFString.cpp [modify] https://crrev.com/e973df73509337cabf6122bb4e15aa5a806e38b4/third_party/WebKit/Source/wtf/text/WTFString.h |
|||||
►
Sign in to add a comment |
|||||
Comment 1 by ice...@yandex-team.ru
, Dec 24 2016