Lock contention in DWrite fonts used by Skia from multiple raster worker threads.
Issue description

I looked into RasterWorkerPool tasks to improve TTFP (time to first paint), and found that the worker threads managed by RasterWorkerPool spend a smaller share of their time on the CPU than I expected. In the attached trace log, rasterization for the first paint takes 98ms, while the total CPU duration across all 4 compositor workers is 123ms. I think this is about as slow as doing the work on a single thread.

During rasterization, the compositor workers spent most of their time waiting to acquire a lock that guards thread-unsafe platform APIs. On Linux, the majority of the time was spent acquiring gFTMutex in SkFontHost_FreeType.cpp to access FreeType. On Windows with DirectWrite support, DWriteFactoryMutex in SkScalerContext_win_dw.cpp takes most of the time. Also on Windows, the lock overhead and the context switch overhead look considerable.

Here is a TRACE_EVENT injection I did on Linux: https://codereview.chromium.org/2021923002/

I think we can improve rasterization performance if we do one of the following:
- Make the lock coarser,
- Use a smaller number of worker threads,
- Use a thread-safe API, or make the API thread-safe, if possible.
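To make the "coarser lock" option concrete, here is a minimal sketch of the two locking granularities. It is not Skia code: CountingMutex, RasterizeGlyphUnsafe, RasterizeFine, RasterizeCoarse and CountAcquisitions are all hypothetical names, and the "glyph rasterization" is a stand-in for a thread-unsafe platform call. It only counts how many times the shared lock is taken in each scheme.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a global font lock such as gFTMutex or
// DWriteFactoryMutex; it counts how often it is acquired.
struct CountingMutex {
    std::mutex mu;
    std::size_t acquisitions = 0;  // only modified while mu is held

    void lock() { mu.lock(); ++acquisitions; }
    void unlock() { mu.unlock(); }
};

// Hypothetical stand-in for a thread-unsafe platform API call.
inline void RasterizeGlyphUnsafe(int glyph, long& sink) { sink += glyph; }

// Fine-grained: the lock is acquired and released once per glyph.
void RasterizeFine(CountingMutex& m, const std::vector<int>& glyphs, long& sink) {
    for (int g : glyphs) {
        std::lock_guard<CountingMutex> hold(m);
        RasterizeGlyphUnsafe(g, sink);
    }
}

// Coarse: the lock is acquired once for the whole batch of glyphs.
void RasterizeCoarse(CountingMutex& m, const std::vector<int>& glyphs, long& sink) {
    std::lock_guard<CountingMutex> hold(m);
    for (int g : glyphs) RasterizeGlyphUnsafe(g, sink);
}

// Runs `workers` threads over the same glyph list and returns the total
// number of lock acquisitions.
template <typename Fn>
std::size_t CountAcquisitions(Fn rasterize, int workers, const std::vector<int>& glyphs) {
    CountingMutex m;
    long sink = 0;  // shared state, only touched while m is held
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i)
        pool.emplace_back([&] { rasterize(m, glyphs, sink); });
    for (auto& t : pool) t.join();
    return m.acquisitions;
}
```

With 4 workers and 100 glyphs each, the fine-grained version takes the lock 400 times and the coarse version 4 times. A coarser lock reduces acquisition and context-switch overhead, but the guarded work is still serialized, so it does not by itself restore parallelism.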
,
May 31 2016
Here are trace logs on Windows with/without a patch: https://codereview.chromium.org/2019383003/ If we simply remove all locks in SkScalerContext_win_dw.cpp, rasterization for the first paint of hatébu gets 64ms faster (from 80ms to 16ms). I'm not sure we can safely remove the lock, but since we create the DWrite factory with the DWRITE_FACTORY_TYPE_SHARED flag, DW objects are probably thread-safe. https://msdn.microsoft.com/en-us/library/windows/desktop/dd368057(v=vs.85).aspx
,
May 31 2016
,
May 31 2016
It would be nice if DirectWrite objects were always thread safe (and they are supposed to be) but on at least the DirectWrite implementation in Win8 this isn't always the case. The CL which introduced the comment your patch removes about DirectWrite not being thread safe on Win8 (https://codereview.chromium.org/1421433004) links to the bug which has an example of what sometimes happens there. As far as I know this has never been observed in Win7, and I'm not sure we've looked too hard at Win10. It may be possible to predicate these locks based on the version of Windows or DirectWrite (we do a similar thing with FontConfig). Also, this seems somewhat surprising. In general layout is done on a single thread, so there should be very low contention on any locks. Do you have any idea where the contention is coming from?
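A rough sketch of what predicating the lock on the OS version could look like. Everything here is hypothetical: WindowsVersion, NeedsDWriteLock and MaybeLock are illustrative names, not Skia or Chromium APIs, and the policy is an assumption drawn only from this thread (problems seen on Win8, none observed on Win7, Win10 not examined, so it stays locked).

```cpp
#include <mutex>

// Hypothetical OS classification; real code would query the platform.
enum class WindowsVersion { kWin7, kWin8, kWin10 };

// Assumed policy, not a verified fact: skip the lock only where no
// thread-safety problem has been observed (Win7). Win8 is known to
// misbehave, and Win10 is treated as unsafe until checked.
inline bool NeedsDWriteLock(WindowsVersion v) {
    return v != WindowsVersion::kWin7;
}

// RAII guard that takes the mutex only when `needed` is true.
class MaybeLock {
public:
    MaybeLock(std::mutex& mu, bool needed) : mu_(needed ? &mu : nullptr) {
        if (mu_) mu_->lock();
    }
    ~MaybeLock() {
        if (mu_) mu_->unlock();
    }
    MaybeLock(const MaybeLock&) = delete;
    MaybeLock& operator=(const MaybeLock&) = delete;

    // True if this guard actually holds the mutex.
    bool holds_lock() const { return mu_ != nullptr; }

private:
    std::mutex* mu_;  // null when locking was skipped
};
```

A call site would construct something like `MaybeLock guard(mutex, NeedsDWriteLock(version));` in place of an unconditional lock, so the decision is made in one place and the guarded code is unchanged.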
,
May 31 2016
Oh, sad to hear it. Do we have a good channel to MS to ask about thread safety on each OS? This is rasterization done in a worker pool, while layout is done on the single main thread. The worker pool has 4 threads in this case, and the lock owner seems to switch at sub-millisecond intervals.
,
Jun 2 2016
,
Jan 18 2017
Font object contention
,
May 3 2017
,
May 4 2018
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
May 18 2018
Comment 1 by kinuko@chromium.org
, May 31 2016