Annotate Samples from StackSamplingProfiler as Code Progresses |
|||
Issue descriptionThe StackSamplingProfiler gathers a fixed number of samples and sends the collected information via Metrics. The current use is "startup" but there is no way of associating which "phase" of startup is being sampled at any given moment. Thus, while UMA metrics might indicate that some phase is taking excess time, there is no way of breaking down the samples to know which belong to that phase. Add "annotation" support so that blocks of samples can be associated with particular phases of operation.
,
Oct 18 2016
I was thinking of a new "annotation" repeated field in the Sample structure of CallStackProfile:
// A sample consisting of one or more callstacks with the same stack frames
// and instruction pointers.
message Sample {
// The callstack. Sample.entries[0] represents the call on the top of the
// stack.
repeated Entry entry = 1;
// Number of times this stack signature occurs.
optional int64 count = 2;
// New annotations that were set between the last sample and this one. If
// the number is negative, it applies backwards, such as an action that
// just completed.
repeated int32 annotation = 3;
}
,
Oct 18 2016
This should be fairly easy when order is preserved but will complicate the un-ordered collection because it'll be able to merge only while the set of annotations remains constant. Still, shouldn't be too difficult.
,
Oct 18 2016
The jank profiler method seems like a reasonable way to go. I think we'll want to explicitly call out the phases in the proto in a similar fashion as well, to avoid confusion keeping things in sync between the Chrome and server side code.
,
Oct 19 2016
It's pretty much the same idea except it marks start-times and we'll generally know the finish times because that's when the UMA metrics are recorded.
,
Oct 24 2016
There is a "phase" of the system: startup, general operation, shutdown. Each of these may be split into many sub-phases. It's fine to just set bits as each completes and report the entire thing as a bit-field. I say bit-field instead of a counter because it makes it easy to further subdivide the phases in the future. But within any of these phases there may be things that start and end: rendering a page, writing profiles to disk, doing a metrics upload, etc. While the former flags would get set and remain set the latter would be set and cleared as those tasks begin and end. The former will be small and dense but the latter could be very large and sparse. Think that for now they can both be handled as simple bit-fields but it may be necessary in the future to use a different mechanism for the latter if the number of flags becomes very large.
,
Nov 3 2016
Kicking off discussion of what state to record: I suggest recording phases associated with the relevant timing heartbeat histograms: NewTabPage.LoadTime Startup.FirstWebContents.MainNavigationStart PageLoad.PaintTiming.NavigationToFirstContentfulPaint SessionRestore.ForegroundTabFirstLoaded Startup.FirstWebContents.NonEmptyPaint2 Plus the non-duplicative timing histograms listed under the StartupTemperature section of histograms.xml: https://chromium.googlesource.com/chromium/src/+/fc917a6b8f127fa85e7864d2b7a0c7088bfaa86c/tools/metrics/histograms/histograms.xml#111549 Plus Startup.Temperature itself (this is a per-profile state, rather than a phase). Anything else you can think of Alexei?
,
Nov 4 2016
Nothing per-profile will work because only the first setting of a "phase" state will be remembered. Other things for the startup before contents? MainLoop started? Though that could be tricky because the flags will have to be defined in components (or something accessible by components). Hmmm... Also... What other activities happen that could affect performance and should be monitored? Metrics Collection? Writing Profile State? Opening New Profile?
,
Nov 4 2016
+gab for his thoughts on what stages we want to breakdown sampling profiler data for (initially) I'm ok with us not adding too many to start with, just some important ones: Startup.FirstWebContents.MainNavigationStart Startup.FirstWebContents.NonEmptyPaint2 For other ones, I'm not 100% sure, here are my thoughts: - NewTabPage.LoadTime (this will be logged multiple times, so not sure we should have it) - SessionRestore.ForegroundTabFirstLoaded (maybe interesting - I guess we'd only hit it on session restore start ups which will be a subset - but I guess useful in specifically diagnosing regressions in that metric, so probably fine) - PageLoad.PaintTiming.NavigationToFirstContentfulPaint (again, loaded multiple times - so not sure if we want it)
,
Nov 7 2016
Phases that allow us to reconstruct a mental image of a startup trace as in https://docs.google.com/document/d/1MN7I9f_7lUthZkmDaTBHsCtOUB8ApiJ8y-AIbs6ubxI/edit#heading=h.dfg4l0unq4cd would be useful. e.g.: Startup.BrowserWindowDisplay (albeit being aware that this metric is potentially broken: issue 589118 ). The initial ProfileManager::GetProfile (need to look closer but looks like Profile.CreateProfileHelperTime might be that) StartupBrowserCreator::Start (Startup.StartupBrowserCreator_Start) Startup.MessageLoopStartTime is probably also interesting. I like bcwhite's idea of using a bit field so that overlapping phases can be tracked.
,
Nov 23 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/30c14f27cf63a48b082efe67ae29bb6e4282cbd7 commit 30c14f27cf63a48b082efe67ae29bb6e4282cbd7 Author: bcwhite <bcwhite@chromium.org> Date: Wed Nov 23 22:56:14 2016 Add process lifetime annotations to stack samples. Annotations are passed in the proto along with the stack frames so analysis can correlate with other things happening on the system. BUG= 657012 Review-Url: https://codereview.chromium.org/2444143002 Cr-Commit-Position: refs/heads/master@{#434273} [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/base/profiler/native_stack_sampler.h [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/base/profiler/native_stack_sampler_posix.cc [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/base/profiler/native_stack_sampler_win.cc [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/base/profiler/stack_sampling_profiler.cc [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/base/profiler/stack_sampling_profiler.h [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/base/profiler/stack_sampling_profiler_unittest.cc [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/call_stack_profile_metrics_provider.cc [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/call_stack_profile_metrics_provider.h [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/call_stack_profile_metrics_provider_unittest.cc [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/proto/call_stack_profile.proto [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/proto/execution_context.proto [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/public/cpp/call_stack_profile.typemap [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/public/cpp/call_stack_profile_struct_traits.h [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/public/cpp/call_stack_profile_struct_traits_unittest.cc [modify] https://crrev.com/30c14f27cf63a48b082efe67ae29bb6e4282cbd7/components/metrics/public/interfaces/call_stack_profile_collector.mojom
,
Jan 12 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/da097d30d0e99243c59c75d336a761b0ec5e8def commit da097d30d0e99243c59c75d336a761b0ec5e8def Author: bcwhite <bcwhite@chromium.org> Date: Thu Jan 12 23:55:03 2017 Set process phases in the StackSamplingProfiler. BUG= 657012 Review-Url: https://codereview.chromium.org/2530043002 Cr-Commit-Position: refs/heads/master@{#443394} [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/base/profiler/stack_sampling_profiler.cc [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/base/profiler/stack_sampling_profiler.h [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/base/profiler/stack_sampling_profiler_unittest.cc [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/chrome/browser/chrome_browser_main.cc [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/chrome/browser/chrome_browser_main.h [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/metrics/call_stack_profile_metrics_provider.cc [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/metrics/call_stack_profile_metrics_provider.h [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/metrics/call_stack_profile_metrics_provider_unittest.cc [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/metrics/proto/execution_context.proto [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/metrics/public/cpp/call_stack_profile_struct_traits.h [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/metrics/public/interfaces/call_stack_profile_collector.mojom [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/startup_metric_utils/browser/BUILD.gn [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/startup_metric_utils/browser/DEPS [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/components/startup_metric_utils/browser/startup_metric_utils.cc [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/content/browser/browser_main_loop.cc [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/content/browser/browser_main_loop.h [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/content/browser/browser_main_runner.cc [modify] https://crrev.com/da097d30d0e99243c59c75d336a761b0ec5e8def/content/public/browser/browser_main_parts.h
,
Jan 25 2017
,
Jan 30 2017
Can you create a separate bug to cover propagation of annotations to non-browser processes, if that won't be implemented under this bug? |
|||
►
Sign in to add a comment |
|||
Comment 1 by asvitk...@chromium.org
, Oct 18 2016