New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 634437 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 1
Type: Bug



Sign in to add a comment

Missing breakdowns for Startup.FirstWebContents.MainNavigationStart startup metric

Project Member Reported by asvitk...@chromium.org, Aug 4 2016

Issue description

Missing breakdowns for Startup.FirstWebContents.MainNavigationStart startup metric.

In go/m51-startup-regression-postmortem, we realized we don't have good breakdowns for the metric that regressed. In particular, we're missing:

BrowserMainRunnerImpl::Initialize end to MainNavigationStart

Given that this is where (we believe) the bulk of the regression happened, we should add this sub-metric.

This is a P1 from the M51 startup postmortem.

fdoray@, is this something you could help with? Let me know if you're too busy and we need to find a different owner.

 

Comment 1 by gab@chromium.org, Aug 5 2016

So you're saying Startup.FirstWebContents.BrowserMainInitTo.Foo? Foo = {MainNavigationStart, NonEmptyPaint2, ...}

Is that useful? It's the diff of "full histogram - time to browser main init" for each one. It would have regressed by its proportional share, don't see how it would have helped...
Yep, those exactly.

I do think it's useful. It was very confusing to reason about when me and Rob were looking at it, so having top-level metrics (like MainNavigationStart) having break downs would be very good to quickly figure out which phase the problem is.

I think another aspect to this is also just documenting this - for example we had to really read the code and look at traces to figure out that the metric for BrowserMainRunnerImpl::Initialize related to a specific subset of this one. So once we have the full breakdown, we should document the submetrics in the description for the main one.

And as to why it's better than the subtraction - it gives a more trustworthy signal if we can get things to exactly add up. Every uncertainty makes investigation harder. The other problem is we didn't also have knowledge about the ExeMain metric as well (and that didn't exist on Mac), so it made it hard to reason about it.

I think the point is if we can have these metrics and then document the breakdown in the description of the high level metric, then it makes it much more clear and easier to reason about - and no need to jump through hoops (e.g. no need to subtract - which also doesn't work well for percentiles as well).
For the Startup.LoadTime.ProcessCreateToExeMain not being logged on Mac, which caused some confusion during investigation, I filed crbug.com/634964 and will look at fixing.
My understanding of startup metrics: https://docs.google.com/a/google.com/presentation/d/1zox31n-KvwF1wN1p7qSZ3oMsaaSBhi1xipR6Q_nRLgg/edit?usp=sharing (Google-only).

Would it be clearer if we had these startup metrics?
- Startup.[metric].FromProcessCreate (instead of Startup.[metric])
- Startup.[metric].FromMainEntry (do we want to add this?)
- Startup.[metric].FromUIMessageLoopStart
- Startup.BrowserWindowDisplay.FromProcessCreate (instead of Startup.BrowserWindowDisplay)
- Startup.BrowserMessageLoopStart.FromProcessCreate (instead of Startup.BrowserMessageLoopStartTime)
- Startup.BrowserMessageLoopStart.FromMainEntry (instead of Startup.BrowserMessageLoopStartTimeFromMainEntry)
- Startup.RendererMain.FromMainEntry (instead of Startup.BrowserMainToRendererMain)
- Startup.RendererMain.FromProcessCreate
- Startup.ProcessCreateToExeMain
- Startup.ExeMainToDllMain
- Startup.ProcessCreateToDllMain

Where [metric] is one of:
- FirstWebContents.NonEmptyPaint2
- FirstWebContents.MainFrameLoad2
- FirstWebContents.MainNavigationFinished
- FirstWebContents.MainNavigationStart

UI MessageLoop start is approximately the same thing as BrowserMainRunnerImpl::Initialize end.

Comment 5 by gab@chromium.org, Aug 8 2016

Also [metrics] should be suffixed with "(Cold|Warm|Lukewarm)Startup" -- and we also have another suffix for "same version startup count"...

I'm fine with tons of metrics if we think the granularity is worth it but **only** if it's clear which 2-3 metric we should keep an eye on and have the others as investigation helpers only (i.e. I don't want to need a dozen of Chirp subscriptions).
gab@, yes, everything will still be suffixed with the startup temperature and the same version startup count.

Comment 7 by gab@chromium.org, Aug 9 2016

Labels: -M-54 M-55
Targeting M55 when fdoray gets back.
Could any one let us know is there any latest update available for this issue? Since this issue is marked as P1 for M55 milestone and M55 is already in Beta and sooner it will be pushed to stable.
Labels: -M-55 M-56
Bumping milestone since this didn't make into M55.
Cc: fdoray@chromium.org
Labels: -M-56 M-60
Owner: asvitk...@chromium.org
Taking this.
Project Member

Comment 11 by bugdroid1@chromium.org, May 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d80d9426357c72cc7d547d84196890efb999be36

commit d80d9426357c72cc7d547d84196890efb999be36
Author: asvitkine <asvitkine@chromium.org>
Date: Wed May 10 15:00:30 2017

Add Startup.BrowserMessageLoopStart.To.NonEmptyPaint2 metric.

This measures the time between Startup.BrowserMessageLoopStartTime
being logged and Startup.FirstWebContents.NonEmptyPaint2 being logged.

Also adds Startup.BrowserMessageLoopStart.To.MainNavigationStart.

Goal is to be able to easier diagnose problems and also understand impact
of improvements. For example, I plan to use this when I do more experimentation
with my resource reordering change.

BUG= 634437 

Review-Url: https://codereview.chromium.org/2870993002
Cr-Commit-Position: refs/heads/master@{#470573}

[modify] https://crrev.com/d80d9426357c72cc7d547d84196890efb999be36/components/startup_metric_utils/browser/startup_metric_utils.cc
[modify] https://crrev.com/d80d9426357c72cc7d547d84196890efb999be36/tools/metrics/histograms/histograms.xml

Status: Fixed (was: Assigned)
Marking as fixed based on the above CL, which added timings between message loop start to navigation start and NonEmptyPaint. We can file follow-up bugs if we want additional breakdowns, but I think these two should be good for now - and will help me with some follow up investigation of my resource reordering changes. 

Sign in to add a comment