Histograms should not block the main thread and IO thread (esp. during the critical path) |
|||||||
Issue descriptionSpin up from issue 533672 pdr@ shared a trace indicating that histogram creation is blocking the main/io thread during the critical path.
,
Jul 22 2016
Is this still reproducible? I remember from the blink onion soup talks at BlinkOn that histogram code has massively improved recently.
,
Jul 26 2016
I suspect it's probably still relevant. We do have an experimental persistent histograms implementation that bcwhite@ is planning to launch soon. However, the previous code path still exists as a fallback path. I think we should fix things to not do this work on IO thread. However, it's still kind of tricky. Because we do it on the IO thread now, besides the Jank, it also sometimes triggers (mostly harmless) data races that memory tools sometimes detect. Since most histograms are logged on main thread and this happens on IO thread, the memory tools yell that we're accessing - e.g. the flags field - from multiple threads at the same time without a lock. (Although the lock-free access is by design.) So moving to UI thread would mitigate this, but if it's Janky, we probably don't want to do it on UI thread either. One option is to actually somehow defer this work until after loading is done. Then, the thread it's on becomes less of an issue because the main problem highlighted by this bug (i.e. work while we're loading a page) gets solved. Happy to hear proposals on how best to do this and review the code.
,
Jul 26 2016
And just to be clear, the specific work I'm discussing (which wasn't clear from the screenshot, but I suspect is what's the report about) is the browser fetching serialized histograms from the renderer.
,
Aug 8 2016
,
Aug 8 2016
,
Aug 8 2016
I'm confused about what work is being discussed, too. Fetching histograms from the renderer happens about every 30 minutes on desktop when UMA reporting is enabled. The chance of that being _reproducably_ injected in the critical path seems near-zero.
,
Aug 10 2016
I tried to reproduce the issue and see what's problem, and caught a similar histogram task running on the ChildIOThread for 10ms. IPC message for this is ChildProcessHostMsg_ChildHistogramData. This doesn't seem to block anything, but look just curious. bcwhite@, is this the task you mentioned at c#7? This happened at the first navigation in a new tab in my case.
,
Aug 10 2016
Additional note: In my trace, this event seems to happen on each navigation.
,
Aug 10 2016
ChildHistogramData is the traditional way of fetching histograms from the renderer which is being largely superseded by the new persistent mechanism. Perhaps it was sent on each navigation to ensure that histograms didn't get lost when renderers exit. If so, that can be disabled when persistent collection is enabled. Let me investigate...
,
Aug 10 2016
Found the culprit: https://cs.chromium.org/chromium/src/chrome/renderer/page_load_histograms.cc?rcl=1470626560&l=879 // Since there are currently no guarantees that renderer histograms will be // sent to the browser, we initiate a PostTask here to be sure that we send // the histograms we generated. Without this call, pages that don't have an // on-close-handler might generate data that is lost when the renderer is // shutdown abruptly (perchance because the user closed the tab). I can skip this is persistent histograms are enabled (it should become a beta experiment soon) since histograms are guaranteed to be collected in that case. However, even without disabling it, its impact will be greatly reduced when the experiment is enabled because only non-persistent histograms will get sent by this call. Try adding these command-line flags and see how that affects your graphs: --enable-features='PersistentHistograms<PersistentHistograms' --force-fieldtrials=PersistentHistograms/cmdline --force-fieldtrial-params=PersistentHistograms.cmdline:storage/LocalMemory
,
Aug 10 2016
Note: page-load_histograms is being deprecated and will be removed soon. I wouldn't spend too much effort on this right now. +bmcquade who is driving this work. I think a lot of these histograms are marked obsolete in histograms.xml.
,
Aug 10 2016
It's not a lot of effort: https://codereview.chromium.org/2231843002/
,
Aug 10 2016
Oh yeah, that CL looks good to me :)
,
Aug 10 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a7742715fad96d7358b74df6b183b62765a15e63 commit a7742715fad96d7358b74df6b183b62765a15e63 Author: bcwhite <bcwhite@chromium.org> Date: Wed Aug 10 17:49:10 2016 Don't push histograms to browser after every page load when persistence is enabled. BUG= 630503 TBR=thestig@chromium.org Review-Url: https://codereview.chromium.org/2231843002 Cr-Commit-Position: refs/heads/master@{#411079} [modify] https://crrev.com/a7742715fad96d7358b74df6b183b62765a15e63/chrome/renderer/page_load_histograms.cc
,
Aug 15 2016
Thank you for investigation and quick fix. I attached a new graph. This is took by running r411938 with flags suggested at c#11. In this graph, the long running IO task was completely removed, and the second chunk of tasks on the CrRenderMain started immediately after the first chunk finished. As a result, the first paint happened a little earlier (though it's just a few msec for this simple test page). Let me close this as Fixed! |
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by kenjibaheux@chromium.org
, Jul 22 2016Status: Assigned (was: Untriaged)