New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 846871 link

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Task



Sign in to add a comment

Difference in OOM counts between old and new UMA metric

Project Member Reported by ssid@chromium.org, May 25 2018

Issue description

The old way to measure the foreground OOMs is Tab.RendererDetailedExitStatus. The logic which detects that we have OOMs is "Empty minidump in running app":

  if ((info.process_type == content::PROCESS_TYPE_RENDERER ||
       info.process_type == content::PROCESS_TYPE_GPU) &&
      info.app_state != base::android::APPLICATION_STATE_UNKNOWN)
    if (file_size == 0) {
      if (info.app_state == base::android::APPLICATION_STATE_HAS_RUNNING_ACTIVITIES) {
        exit_status = EMPTY_MINIDUMP_WHILE_RUNNING;
      } else if (info.app_state == base::android::APPLICATION_STATE_HAS_PAUSED_ACTIVITIES) {
        exit_status = EMPTY_MINIDUMP_WHILE_PAUSED;
  if (info.was_oom_protected_status) {
     UMA_HISTOGRAM_ENUMERATION("Tab.RendererDetailedExitStatus", exit_status);

The new metric to measure OOMs: Stability.Android.ProcessedCrashCounts with "Visible renderer foreground oom". This metric was added since we hvae OOPIF enabled and the old counts do not make sense.
The logic for the new metric is:

  if (info.process_type == content::PROCESS_TYPE_RENDERER) {
    if ((info.app_state ==
          base::android::APPLICATION_STATE_HAS_RUNNING_ACTIVITIES ||
      info.app_state == base::android::APPLICATION_STATE_HAS_PAUSED_ACTIVITIES)
         && info.renderer_has_visible_clients) {
      if (!info.was_killed_intentionally_by_browser && !has_crash_dump && !info.normal_termination) {
        if (!renderer_sub_frame)
            ProcessedCrashCounts::kRendererForegroundVisibleOom

The problem is the count given by the first UMA and second UMA does not match.
On high end devices the new metric is 5 times the older metric for OOMs. That is expected because we enabled OOPIF and the old one counts subframes as foreground crashes, and we used to count the first background tab OOM as foreground due to default binding. IIUC.

But on low end devices I still see twice as many hits on "Empty minidump in running app" + "Empty minidump in paused app", vs
"Visible renderer foreground oom" in new metric.
https://uma.googleplex.com/p/chrome/timeline_v2/?sid=14c30bafbbd567b24e47650ce4a23c33


IIUC, info.renderer_has_visible_clients check should not affect the counts since only the visible renderer would have strong binding with oom protected.

info.was_killed_intentionally_by_browser and info.normal_termination are the other newly added checks before recording OOMs.
But the recorded UMA for "Renderer intentional kill" does not account for much.

The old code checked for info.was_oom_protected_status. Why do we have so many different ways to check the same thing and why we do have differences in the final metrics?
 

Comment 1 by boliu@chromium.org, May 25 2018

A few other ideas..

* old metric does not filter out intentional kills (shouldn't matter much on low ram though)
* old metric does not filter out "normal terminations"
* oom protected status used by old metric also includes background renderers that are under binding management (although again, shouldn't matter on low ram devices)

So my guess is normal terminations.

>  Why do we have so many different ways to check the same thing

Well, every time someone wants to make a change, they have to deprecate the old metric and start using the new one.

RendererDetailedExitStatus isn't even the oldest one, that honor goes to RendererExitStatus (afaik..)

Comment 2 by boliu@chromium.org, May 25 2018

Clarification (from f2f chat): by "normal termination", I meant CrashDumpObserver::TerminationInfo::normal_termination, not TERMINATION_STATUS_NORMAL_TERMINATION.

TERMINATION_STATUS_NORMAL_TERMINATION is pretty meaningless on android. It just means "not TERMINATION_STATUS_OOM_PROTECTED"

CrashDumpObserver::TerminationInfo::normal_termination is what's you'd expect, ie where the browser is intending to shut down a renderer, sway if you swipe a tab away.

Comment 3 by ssid@chromium.org, May 29 2018

Cc: mariakho...@chromium.org
Project Member

Comment 4 by bugdroid1@chromium.org, May 30 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/691f54d6efaadc9c3ed2ec5f1a8c3c5186262a30

commit 691f54d6efaadc9c3ed2ec5f1a8c3c5186262a30
Author: Siddhartha <ssid@chromium.org>
Date: Wed May 30 17:59:25 2018

Record all types of foreground renderer crashes

The UMA did not record main frame intentional kill and normal
termination crash without minidump.

BUG=846871

Change-Id: I6d423aaedb19b8b08cb7265d5e0dffd1a3ff170d
Reviewed-on: https://chromium-review.googlesource.com/1077610
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Reviewed-by: Maria Khomenko <mariakhomenko@chromium.org>
Reviewed-by: Bo <boliu@chromium.org>
Commit-Queue: Siddhartha S <ssid@chromium.org>
Cr-Commit-Position: refs/heads/master@{#562906}
[modify] https://crrev.com/691f54d6efaadc9c3ed2ec5f1a8c3c5186262a30/components/crash/content/browser/crash_dump_manager_android.cc
[modify] https://crrev.com/691f54d6efaadc9c3ed2ec5f1a8c3c5186262a30/tools/metrics/histograms/enums.xml

Labels: -Type-Bug Stability-Memory Type-Task
Owner: ssid@chromium.org
Status: Assigned (was: Untriaged)
Project Member

Comment 6 by bugdroid1@chromium.org, Jun 7 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a3b92f701e93f9c11e89ae0662be541f8994c8d5

commit a3b92f701e93f9c11e89ae0662be541f8994c8d5
Author: Siddhartha <ssid@chromium.org>
Date: Thu Jun 07 15:19:35 2018

Android: Use same logic to detect OOM in different places

Make sure the same logic is used to detect renderer OOMs in various
places. Record UKM only for main frame OOM.
Also fixes the bug where UKM was recorded if renderer is not visible.
Move the metrics code out of CrashDumpManager to make it easy to switch
to crashpad.

Bug: 846871

Change-Id: I571cc762e8d3abe372f5b9f42f5b984614afa503
Reviewed-on: https://chromium-review.googlesource.com/1080278
Reviewed-by: Robert Sesek <rsesek@chromium.org>
Reviewed-by: Robert Kaplow <rkaplow@chromium.org>
Reviewed-by: Charlie Harrison <csharrison@chromium.org>
Reviewed-by: Ryan Sturm <ryansturm@chromium.org>
Reviewed-by: Maria Khomenko <mariakhomenko@chromium.org>
Commit-Queue: Siddhartha S <ssid@chromium.org>
Cr-Commit-Position: refs/heads/master@{#565269}
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/chrome/browser/android/oom_intervention/oom_intervention_tab_helper.cc
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/chrome/browser/android/oom_intervention/oom_intervention_tab_helper.h
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/chrome/browser/metrics/chrome_stability_metrics_provider.cc
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/chrome/browser/metrics/chrome_stability_metrics_provider.h
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/chrome/browser/metrics/oom/out_of_memory_reporter.cc
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/chrome/browser/metrics/oom/out_of_memory_reporter.h
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/chrome/browser/metrics/oom/out_of_memory_reporter_unittest.cc
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/crash/content/browser/BUILD.gn
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/crash/content/browser/crash_dump_manager_android.cc
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/crash/content/browser/crash_dump_manager_android.h
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/crash/content/browser/crash_dump_manager_android_unittest.cc
[add] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/crash/content/browser/crash_metrics_reporter_android.cc
[add] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/crash/content/browser/crash_metrics_reporter_android.h
[add] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/crash/content/browser/crash_metrics_reporter_android_unittest.cc
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/data_reduction_proxy/content/browser/data_reduction_proxy_pingback_client_impl.cc
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/data_reduction_proxy/content/browser/data_reduction_proxy_pingback_client_impl.h
[modify] https://crrev.com/a3b92f701e93f9c11e89ae0662be541f8994c8d5/components/data_reduction_proxy/content/browser/data_reduction_proxy_pingback_client_impl_unittest.cc

Project Member

Comment 7 by bugdroid1@chromium.org, Jun 12 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/04d50906eba8b2f3e419d25437e9ec8e42007e6a

commit 04d50906eba8b2f3e419d25437e9ec8e42007e6a
Author: Siddhartha <ssid@chromium.org>
Date: Tue Jun 12 22:26:53 2018

Fix synchronization issue in cached application state

|sCachedApplicationState| is set to null before updating sActivityInfo
and getApplicationState can be called from any thread. This means that
the sActivityInfo can be accessed from some other thread when being
updated and |sCachedApplicationState| can have wrong values.

Fix |sCachedApplicationState| to be updated synchronously with
|sActivityInfo| and the state is always consistent.

BUG=846871

Change-Id: Id658cc816d422f98240111fbdfa3b3475af1146b
Reviewed-on: https://chromium-review.googlesource.com/1094143
Commit-Queue: Siddhartha S <ssid@chromium.org>
Reviewed-by: Yaron Friedman <yfriedman@chromium.org>
Reviewed-by: David Trainor <dtrainor@chromium.org>
Cr-Commit-Position: refs/heads/master@{#566605}
[modify] https://crrev.com/04d50906eba8b2f3e419d25437e9ec8e42007e6a/base/android/java/src/org/chromium/base/ApplicationStatus.java

Project Member

Comment 8 by bugdroid1@chromium.org, Jun 13 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e847355b6a9f021d0c2f76a0a43dd3fc27b66980

commit e847355b6a9f021d0c2f76a0a43dd3fc27b66980
Author: Siddhartha <ssid@chromium.org>
Date: Wed Jun 13 21:48:13 2018

Record UMA to track OOMs when process is oom protected, but not visible

This can only happen on rare cases where the launcher thread is busy.
But there is a large difference between the old and new UMA. So, log
these cases too.

BUG=846871

Change-Id: I2c0d624bf3b60edccd2429e281745a0be86b88bb
Reviewed-on: https://chromium-review.googlesource.com/1092164
Reviewed-by: Bo <boliu@chromium.org>
Reviewed-by: Camille Lamy <clamy@chromium.org>
Reviewed-by: Tommy Nyquist <nyquist@chromium.org>
Reviewed-by: Maria Khomenko <mariakhomenko@chromium.org>
Commit-Queue: Siddhartha S <ssid@chromium.org>
Cr-Commit-Position: refs/heads/master@{#567003}
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/base/BUILD.gn
[add] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/base/android/child_process_binding_types.h
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/base/android/java/src/org/chromium/base/process_launcher/ChildProcessConnection.java
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/base/android/junit/src/org/chromium/base/process_launcher/ChildProcessConnectionTest.java
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/components/crash/content/browser/crash_dump_manager_android_unittest.cc
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/components/crash/content/browser/crash_dump_observer_android.cc
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/components/crash/content/browser/crash_dump_observer_android.h
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/components/crash/content/browser/crash_metrics_reporter_android.cc
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/components/crash/content/browser/crash_metrics_reporter_android.h
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/components/crash/content/browser/crash_metrics_reporter_android_unittest.cc
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/content/browser/child_process_launcher_helper_android.cc
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/content/public/android/java/src/org/chromium/content/browser/ChildProcessLauncherHelperImpl.java
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/content/public/browser/child_process_termination_info.h
[modify] https://crrev.com/e847355b6a9f021d0c2f76a0a43dd3fc27b66980/tools/metrics/histograms/enums.xml

Sign in to add a comment