WTF::ThreadSpecific should add a fast path to look up main thread's storage |
||||||
Issue descriptionOilpan has a clever hack [1] to look up main thread's storage without looking up actual thread-local storage. This was one of key optimizations to make Oilpan faster. We should have the same fast-path for WTF::ThreadSpecific. [1] https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/heap/ThreadState.h?q=threadstate.h&sq=package:chromium&dr=CSs&l=206
,
Oct 21 2016
yutak@ You were looking into this?
,
Dec 15 2016
I'll take a look at this if you haven't already started?
,
Dec 15 2016
Sounds great! I guess yutak@ has some experimental CL for this...? (but he is not working on it now.)
,
Jan 12 2017
slangley@, are you looking into this? We are thinking it might be blocking issue 442246 for perf reasons.
,
Jan 16 2017
slangley: If it is ok, I will take ownership.
,
Jan 17 2017
Sure - sorry for the delayed response.
,
Jan 19 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/098f72afae1133c762650b2c841981be9b4afe6e commit 098f72afae1133c762650b2c841981be9b4afe6e Author: csharrison <csharrison@chromium.org> Date: Thu Jan 19 01:12:33 2017 Fast path for ThreadSpecific for main thread on TLS-slow platforms This code is mainly a refactor of utility code out of platform/heap and into wtf/StackUtil. ThreadSpecific then copies some tricks done by ThreadState. BUG= 621786 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2 Review-Url: https://codereview.chromium.org/2623273007 Cr-Commit-Position: refs/heads/master@{#444581} [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/platform/heap/StackFrameDepth.cpp [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/platform/heap/StackFrameDepth.h [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/platform/heap/ThreadState.cpp [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/platform/heap/ThreadState.h [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/wtf/BUILD.gn [add] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/wtf/StackUtil.cpp [add] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/wtf/StackUtil.h [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/wtf/ThreadSpecific.h [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/wtf/ThreadingPthreads.cpp [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/wtf/ThreadingWin.cpp [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/wtf/WTFThreadData.cpp [modify] https://crrev.com/098f72afae1133c762650b2c841981be9b4afe6e/third_party/WebKit/Source/wtf/WTFThreadData.h
,
Jan 19 2017
I spoke with haraken@ about potentially simplifying ThreadSpecific to avoid the lazy allocation on access, and it seemed like we could just add an initialization path upon thread creation and just allocate all ThreadSpecifics there. This would entail having a static global "registry" of all the ThreadSpecifics, such that when a new thread is spawned, we would iterate through them, allocating each one. However, it seems impossible to ensure that every thread has a copy of each ThreadSpecific, given that the ThreadSpecifics are initialized at different times. For instance, Thread A could initialize thread specific X *after* Thread B has already initialized. How will Thread B initialize x if it also needs a value later?
,
Jan 19 2017
I don't think we'd want to do that since it means eagerly allocating everything which bloats the size of threads. I do think it might be nice to eagerly allocate the WTFThreadData whenever we start a thread though.
,
Jan 19 2017
Yeah I was wondering about the size cost of that idea. I'll look into just special casing WTFThreadData this way.
,
Jan 19 2017
WIP patch for #10: https://codereview.chromium.org/2646003003
,
Jan 20 2017
Yes, I was thinking we eagerly allocate only WTFThreadData and Oilpan's ThreadState (maybe you can move ThreadState into WTFThreadData though).
,
Jan 20 2017
haraken: would you look at the arguments in the linked patch in #12? We think the approach is not viable due to undefined destructor ordering.
,
Feb 3 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/395b9f9ab8004e4a0a4eb67d80f930a57964be9c commit 395b9f9ab8004e4a0a4eb67d80f930a57964be9c Author: csharrison <csharrison@chromium.org> Date: Fri Feb 03 03:21:13 2017 Avoid checking for WTFThreadData::staticData in wtfThreadData() This patch initialized the underlying ThreadSpecific<WTFThreadData>* in a separate code path during initialization, to avoid a branch in wtfThreadData(). BUG= 621786 Review-Url: https://codereview.chromium.org/2646003003 Cr-Commit-Position: refs/heads/master@{#447897} [modify] https://crrev.com/395b9f9ab8004e4a0a4eb67d80f930a57964be9c/third_party/WebKit/Source/wtf/ThreadingPthreads.cpp [modify] https://crrev.com/395b9f9ab8004e4a0a4eb67d80f930a57964be9c/third_party/WebKit/Source/wtf/ThreadingWin.cpp [modify] https://crrev.com/395b9f9ab8004e4a0a4eb67d80f930a57964be9c/third_party/WebKit/Source/wtf/WTFThreadData.cpp [modify] https://crrev.com/395b9f9ab8004e4a0a4eb67d80f930a57964be9c/third_party/WebKit/Source/wtf/WTFThreadData.h
,
Feb 3 2017
For TLS slow platforms, and for ThreadSpecifics that are initialized during threading initialization (e.g. ThreadState/WTFThreadData), we can remove a single branch (and call of internal::mayNotBeMainThread ) from the "fast path" of main thread access by initializing. We can't speed up the slow path. We can do something like templatizing on another bool (willInitializeDuringMainThreadInitialization... or something). Do people think this is worth the complexity? I'm not convinced mayNotBeMainThread is a bottleneck.
,
Feb 3 2017
Agreed. It wouldn't be worth the complexity. |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by tkent@chromium.org
, Jun 23 2016