
Issue 704988


Issue metadata

Status: Fixed
Owner:
Closed: Jan 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocked on:
issue 700617

Blocking:
issue 711721




Change how Net.JobControllerSet.* histograms are reported.

Reported by xunji...@chromium.org, Mar 24 2017

Issue description

This is to follow up on Issue 700617.

During a discussion, rch@ pointed out that logging Net.JobControllerSet.* during HttpStreamFactory teardown isn't a good idea, because the data will not be representative: it misses users who seldom shut down their browser.

We need a follow-up CL to change how the histograms are reported. ssid@ pointed out we could make use of MetricsService::StartScheduledUpload to do periodic updates; it's unclear whether net/ can use that directly.

Comment 1 by ssid@chromium.org, Mar 24 2017

On Android the browser never shuts down, so we won't be getting any metrics at all. There is a good chance the browser also goes unresponsive when such leaks happen, and users tend to kill it rather than shut it down cleanly.

I suggested we could use the Chrome metrics service to record this metric.
I am not very familiar with either codebase, but from what I read, ChromeMetricsServiceClient::CollectFinalHistograms() gets called every N minutes to collect memory histograms for all processes. You could introduce a "net/metrics/metrics_service", and ChromeMetricsServiceClient would call into NetMetricsService to report histograms.
For example, Memory.Browser.Large2 is collected similarly.

Now, I suggested all this assuming that a global call can collect these metrics. But looking at the actual code, it seems HttpStreamFactoryImpl objects are not global, and we are recording metrics for each of these objects, which are owned by URL requests and others. So this might not be possible. What we could do here is post recurring tasks, say every 10 minutes, and record histograms for each http_factory. That would give us an average-use-case picture of the number of jobs.
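A rough sketch of that recurring-task idea in plain C++ (in Chromium this would presumably post repeating tasks via base::RepeatingTimer on the factory's thread and use the UMA_HISTOGRAM_* macros; the class and names below are illustrative stand-ins, not net/ code):

#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <thread>

class FactoryMetricsSamplerSketch {
 public:
  FactoryMetricsSamplerSketch() {
    sampler_ = std::thread([this] {
      std::unique_lock<std::mutex> lock(mu_);
      // Wake up every 10 minutes (or when shutting down) and record a sample.
      while (!cv_.wait_for(lock, std::chrono::minutes(10),
                           [this] { return stopping_; })) {
        // Stand-in for a UMA histogram macro, e.g. UMA_HISTOGRAM_COUNTS_*.
        std::printf("job controller count sample: %zu\n",
                    job_controller_count_.load());
      }
    });
  }
  ~FactoryMetricsSamplerSketch() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stopping_ = true;
    }
    cv_.notify_one();
    sampler_.join();
  }
  void set_count(std::size_t n) { job_controller_count_ = n; }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool stopping_ = false;
  std::atomic<std::size_t> job_controller_count_{0};
  std::thread sampler_;  // Declared last so it starts after the other members.
};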

Good point on the Android behavior!
(HttpStreamFactoryImpl is essentially a global; it's only torn down when the profile is torn down.)
Labels: -Pri-2 M-59 Pri-1
Let's try to get this in by M59's branch cut.
Cc: asvitk...@chromium.org
+asvitkine@: besides what ssid@ suggested in Comment 1, do you have any other thoughts? 

I guess if we want to follow MemoryGrowthTracker's example, we might need to create a NetMetricsTracker or some such. I'm not familiar with ChromeMetricsServiceClient either, and I wonder whether there is a better alternative for handling this.
The memory metrics are collected every time we cut a log. This is useful, because during analysis you can correlate other histograms with values reported in these.

To do that, you can extend NetworkMetricsProvider to report these metrics, e.g. by logging them from ProvideGeneralMetrics().
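
A simplified illustration of that suggestion (the real metrics::MetricsProvider / NetworkMetricsProvider interfaces are more involved; everything here is a stand-in):

#include <cstddef>
#include <cstdio>
#include <functional>
#include <utility>

// Analogous to a metrics provider hook: called once per metrics log.
class MetricsProviderSketch {
 public:
  virtual ~MetricsProviderSketch() = default;
  virtual void ProvideGeneralMetrics() = 0;
};

class NetMetricsProviderSketch : public MetricsProviderSketch {
 public:
  using CountGetter = std::function<std::size_t()>;
  explicit NetMetricsProviderSketch(CountGetter get_count)
      : get_count_(std::move(get_count)) {}

  void ProvideGeneralMetrics() override {
    // Runs every time a log is cut, so this sample can be correlated with
    // the other histograms in the same log. printf stands in for UMA.
    std::printf("Net.JobControllerSet sample: %zu\n", get_count_());
  }

 private:
  CountGetter get_count_;
};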

I have looked at NetworkMetricsProvider::ProvideGeneralMetrics(), where we upload data via NetworkChangeNotifier::FinalizingMetricsLogRecord() or LogAggregatedMetrics. It's not obvious to me how this would work, since the JobController count is not plumbed into NetworkChangeNotifier.

Since we need this histogram working ASAP to detect/monitor memory leaks similar to Issue 678768 and Issue 700617, I am going to change the histogram to be updated every time a new JobController is created, and go back later to figure out how we could improve it.
One idea:
Make a new observer type in NetworkChangeNotifier which is called upon OnFinalizingMetricsLogRecord().
Make HttpStreamFactory a NetworkChangeNotifier observer of this new type.
Log UMAs in the observer's OnFinalizingMetricsLogRecord().
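
A simplified stand-in for this observer idea (the real NetworkChangeNotifier machinery uses thread-safe observer lists; the classes below only show the shape):

#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

class FinalizingMetricsObserver {
 public:
  virtual ~FinalizingMetricsObserver() = default;
  virtual void OnFinalizingMetricsLogRecord() = 0;
};

class NotifierSketch {
 public:
  void AddObserver(FinalizingMetricsObserver* o) { observers_.push_back(o); }
  void RemoveObserver(FinalizingMetricsObserver* o) {
    observers_.erase(std::remove(observers_.begin(), observers_.end(), o),
                     observers_.end());
  }
  // Invoked when a metrics log is being finalized.
  void FinalizingMetricsLogRecord() {
    for (auto* o : observers_) o->OnFinalizingMetricsLogRecord();
  }

 private:
  std::vector<FinalizingMetricsObserver*> observers_;
};

class StreamFactorySketch : public FinalizingMetricsObserver {
 public:
  void OnFinalizingMetricsLogRecord() override {
    // Stand-in for logging the Net.JobControllerSet.* UMAs.
    std::printf("logging %zu job controllers\n", job_controller_count_);
  }
  std::size_t job_controller_count_ = 0;
};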


Labels: sr-pm-5
Talked to zhongyi@ and consulted mmenke@ offline. 

Here's an alternative plan: since we are interested only in whether there are any leaks in this area, and not in the exact number of controllers, we could log |job_controller_set_| only when the set is too big, e.g. >= 200. This should reduce how often we need to iterate the std::set and still give us what we want.
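
A minimal sketch of that threshold check, assuming a set-like container of controllers (JobControllerSetSketch and the int keys are hypothetical stand-ins for the real types):

#include <cstddef>
#include <cstdio>
#include <set>

class JobControllerSetSketch {
 public:
  void Add(int controller_id) {
    controllers_.insert(controller_id);
    // Only pay for the histogram (and the set walk it implies) when the set
    // looks leaky; the >= 200 cutoff is the one suggested above.
    constexpr std::size_t kLeakThreshold = 200;
    if (controllers_.size() >= kLeakThreshold) {
      // Stand-in for the Net.JobControllerSet.* UMA macro.
      std::printf("suspiciously large set: %zu\n", controllers_.size());
    }
  }
  void Remove(int controller_id) { controllers_.erase(controller_id); }

 private:
  std::set<int> controllers_;
};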

Labels: -Pri-1 Pri-2
Lowering the priority to P2. We will be able to collect data from these histograms later this week, and will need to read that data to decide whether further improvements are needed to better detect memory leaks.
Comment 12 by bugdroid1@chromium.org, Apr 12 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a95a63804d48bfb9acfcaabc124504419dcd3668

commit a95a63804d48bfb9acfcaabc124504419dcd3668
Author: zhongyi <zhongyi@chromium.org>
Date: Wed Apr 12 23:05:34 2017

Only log the job controller count when it hits a boundary for the first
time, so that when normal job controllers are created and deleted (which
produces a zigzag pattern in the job controller count) we do not log the
same user multiple times.

BUG=704988

Review-Url: https://codereview.chromium.org/2815153002
Cr-Commit-Position: refs/heads/master@{#464193}

[modify] https://crrev.com/a95a63804d48bfb9acfcaabc124504419dcd3668/net/http/http_stream_factory_impl.cc
[modify] https://crrev.com/a95a63804d48bfb9acfcaabc124504419dcd3668/net/http/http_stream_factory_impl.h
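
A sketch of the de-duplication logic the CL above describes: remember the highest boundary already reported and emit a sample only when the count crosses a new 100-multiple for the first time, so the normal create/delete zigzag never logs the same user twice (BoundaryLoggerSketch is illustrative, not the actual code):

#include <cstddef>
#include <cstdio>

class BoundaryLoggerSketch {
 public:
  void OnJobControllerCountChanged(std::size_t count) {
    constexpr std::size_t kStep = 100;  // Boundaries at 100, 200, 300, ...
    const std::size_t boundary = (count / kStep) * kStep;
    if (boundary >= kStep && boundary > highest_logged_boundary_) {
      highest_logged_boundary_ = boundary;
      // Stand-in for the UMA macro used in the real CL.
      std::printf("count hit %zu (boundary %zu)\n", count, boundary);
    }
  }

 private:
  std::size_t highest_logged_boundary_ = 0;
};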

Comment 13 by bugdroid1@chromium.org, Apr 15 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c77104dfbdbdb4e4952dad12fa2d2352a07e65ff

commit c77104dfbdbdb4e4952dad12fa2d2352a07e65ff
Author: zhongyi <zhongyi@chromium.org>
Date: Sat Apr 15 01:43:17 2017

Add one more histogram to log how many Requests are still pending when
the job controller count hits the limits: 100, 200, etc.

We have observed potential memory leaks in crbug.com/711721, which
indicate a leak of the main job when the racing logic is disabled. If the
Request is finished and destructed, the main job must be cleaned up in
OnRequestComplete(). We used to assume the Request would abort if it
timed out; maybe there are corner cases where the Request leaks.

BUG=704988

Review-Url: https://codereview.chromium.org/2824513003
Cr-Commit-Position: refs/heads/master@{#464847}

[modify] https://crrev.com/c77104dfbdbdb4e4952dad12fa2d2352a07e65ff/net/http/http_stream_factory_impl.cc
[modify] https://crrev.com/c77104dfbdbdb4e4952dad12fa2d2352a07e65ff/net/http/http_stream_factory_impl_job_controller.h
[modify] https://crrev.com/c77104dfbdbdb4e4952dad12fa2d2352a07e65ff/tools/metrics/histograms/histograms.xml

Comment 14 by bugdroid1@chromium.org, Apr 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6c16897074391089d36b098902192f6845a11aa5

commit 6c16897074391089d36b098902192f6845a11aa5
Author: zhongyi <zhongyi@chromium.org>
Date: Wed Apr 19 23:40:23 2017

Split the JobController count histograms into three separate histograms:
1. JobControllers for preconnect,
2. JobControllers for normal connect that have a pending Request,
3. JobControllers for normal connect that have no Request.

The third one gives us a better estimate of uncleaned-up JobControllers
and will be used to detect memory leaks in the JobController layer.

BUG=704988

Review-Url: https://codereview.chromium.org/2824313002
Cr-Commit-Position: refs/heads/master@{#465808}

[modify] https://crrev.com/6c16897074391089d36b098902192f6845a11aa5/net/http/http_stream_factory_impl.cc
[modify] https://crrev.com/6c16897074391089d36b098902192f6845a11aa5/tools/metrics/histograms/histograms.xml
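
A sketch of the three-way split described above: walk the set once and bucket each controller as preconnect, normal with a pending Request, or normal without one; the last bucket is the leak signal (JobControllerSketch and its fields are illustrative stand-ins):

#include <cstddef>
#include <cstdio>
#include <set>

struct JobControllerSketch {
  bool is_preconnect = false;
  bool has_request = false;
};

void RecordSplitCounts(const std::set<const JobControllerSketch*>& controllers) {
  std::size_t preconnect = 0, with_request = 0, without_request = 0;
  for (const auto* c : controllers) {
    if (c->is_preconnect)
      ++preconnect;
    else if (c->has_request)
      ++with_request;
    else
      ++without_request;  // Likely-leaked controllers.
  }
  // Stand-ins for the three UMA histograms added in the CL.
  std::printf("preconnect=%zu with_request=%zu without_request=%zu\n",
              preconnect, with_request, without_request);
}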

Blocking: 711721
Status: Fixed (was: Assigned)
Closing this; I believe there's nothing more needed for this bug.
