New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 896142 link

Starred by 2 users

Issue metadata

Status: Duplicate
Merged: issue 729335
Owner: ----
Closed: Oct 17
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

Avoid low-memory handling frenzy on long first-kill latency

Project Member Reported by chinglinyu@chromium.org, Oct 17

Issue description

The memory pressure monitor for Chrome OS uses an 1-second interval check. It resends the notification if the system stays under critical memory pressure on next check. If, for any reason, the first handler dispatch runs for more than 1 sec and hasn't relieved the system from memory pressure yet, the memory pressure monitor sends another notification unconditionally. This is bad beacuse the system is so busy that it takes > 1 sec to handle the memory pressure, and we send another notification to make it busier. Take one user feedback for example:

1st notification:
2018-09-19 01:35:31.491 3 2250:2250: device_event_log_impl.cc(159): [01:35:31.491] Memory: tab_manager_delegate_chromeos.cc:571 List of low memory kill candidates (sorted from low priority to high priority):
... (list of tabs and apps)
... (kill until target memory to free is < 0)
2018-09-19 01:35:32.407 3 2250:2250: device_event_log_impl.cc(159): [01:35:32.407] Memory: tab_manager_delegate_chromeos.cc:584 Target memory to free: -66680 KB
2018-09-19 01:35:32.407 3 2250:2250: device_event_log_impl.cc(159): [01:35:32.407] Memory: tab_manager_delegate_chromeos.cc:649 Time to first kill 4.67197 s

Then follows 5 more notifications:
2018-09-19 01:35:32.428 3 2250:2250: device_event_log_impl.cc(159): [01:35:32.428] Memory: tab_manager_delegate_chromeos.cc:649 Time to first kill 4.61998 s
2018-09-19 01:35:32.443 3 2250:2250: device_event_log_impl.cc(159): [01:35:32.443] Memory: tab_manager_delegate_chromeos.cc:649 Time to first kill 3.76112 s
2018-09-19 01:35:32.647 3 2250:2250: device_event_log_impl.cc(159): [01:35:32.647] Memory: tab_manager_delegate_chromeos.cc:649 Time to first kill 2.98576 s
2018-09-19 01:35:32.657 3 2250:2250: device_event_log_impl.cc(159): [01:35:32.657] Memory: tab_manager_delegate_chromeos.cc:649 Time to first kill 2.97839 s
2018-09-19 01:35:32.674 3 2250:2250: device_event_log_impl.cc(159): [01:35:32.674] Memory: tab_manager_delegate_chromeos.cc:649 Time to first kill 1.98833 s

Then follows the result of 6 async dispatches of OOM memory details (tons of /proc/ system IO calls).
2018-09-19 01:35:32.723 4 2250:2250: oom_memory_details.cc(46): Tab Discards Memory details (4040 ms):
...
2018-09-19 01:35:32.724 4 2250:2250: oom_memory_details.cc(46): Tab Discards Memory details (3045 ms):
...
2018-09-19 01:35:32.768 4 2250:2250: oom_memory_details.cc(46): Tab Discards Memory details (3114 ms):
...
2018-09-19 01:35:32.801 4 2250:2250: oom_memory_details.cc(46): Tab Discards Memory details (2116 ms):
...
2018-09-19 01:35:33.422 4 2250:2250: oom_memory_details.cc(46): Tab Discards Memory details (702 ms):
...
2018-09-19 01:35:33.444 4 2250:2250: oom_memory_details.cc(46): Tab Discards Memory details (712 ms):
...

Note that the time span of 6 low memory handling calls is about 1.2 sec, where the 1st notification took the longest time (almost 1 sec). The subsequent 5 notifications probably ran under lighter low-mem conditions because they finished much faster than the 1st one, but each notification actually killed one (app or tab) process.

If the memory pressure is over when the handler runs, the handler call becomes a noop that kills nothing. All the cycles are wasted (on the UI thread).
2018-09-19 01:37:03.141 3 2250:2250: device_event_log_impl.cc(159): [01:37:03.139] Memory: tab_manager_delegate_chromeos.cc:571 List of low memory kill candidates (sorted from low priority to high priority):
... (the list of tabs and apps)
(No tab or app is killed)
2018-09-19 01:37:03.150 3 2250:2250: device_event_log_impl.cc(159): [01:37:03.150] Memory: tab_manager_delegate_chromeos.cc:584 Target memory to free: -3072 KB

If we look at the first-kill latencies, we can see that they overlapped in time. That is, the 2nd notification actually takes running of the 1st notification into account, and this applies to the subsequent notificaitons. This all resulted from the 1st notification creating a "traffic jam" in the thread-hopping of the low-mem handler calls, but the traffic souce, the memory pressure monitor, still adds more traffic to make the situation worse. Having first-kill latencies overlapping in time is also misleading from UMA metrics and makes the number look worse than it actually should be.

I think we should suppress the low-mem handler dispatch from the memory pressure monitor if an existing handler call is still running.
 
Labels: -OS-Android OS-Chrome
Mergedinto: 729335
Status: Duplicate (was: Untriaged)
Let's merge this issue to the original discussion in Issue 729335.

Sign in to add a comment