Issue 710650

Issue metadata

Status: Verified
Owner: ----
Closed: Jun 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



caroline with new memory parameters gets OOM kills too soon

Project Member Reported by semenzato@chromium.org, Apr 11 2017

Issue description



The user reports that the system freezes for several seconds, then the screen blacks out for several more seconds, then comes back with several sad tabs.

I asked the user to type a triple alt-volup-X while the system was frozen.  This is the report:

https://feedback.corp.google.com/product/208/neutron?lView=rd&lRSort=1&lROrder=2&lRFilter=1&lReport=57282709650

First thing I notice in console-ramoops: OOM kills start way too soon, with lots of swap space available:

[   91.172977] atmel_mxt_ts i2c-ATML0001:00: Status: 00 Config Checksum: 06cb89
[  219.468263] entering low_mem (avail RAM = 409584 kB, avail swap 813608 kB) with lowest seen anon mem: 2122648 kB
[  226.663592] AudioOutputDevi invoked oom-killer: gfp_mask=0x2004d0, order=0, oom_score_adj=519

...

[  226.664121] Normal free:68908kB min:68960kB low:86200kB high:103440kB active_anon:764688kB inactive_anon:255200kB active_file:101880kB inactive_file:87044kB unevictable:0kB isolated(anon):2432kB isolated(file):0kB present:2080768kB managed:2007916kB mlocked:0kB dirty:4kB writeback:8kB mapped:228616kB shmem:373820kB slab_reclaimable:25684kB slab_unreclaimable:43000kB kernel_stack:12320kB pagetables:31864kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:6174124 all_unreclaimable? yes
[  226.664156] lowmem_reserve[]: 0 0 0 0
[  226.664166] DMA: 0*4kB 0*8kB 1*16kB (E) 2*32kB (UE) 1*64kB (E) 2*128kB (UE) 2*256kB (UE) 1*512kB (E) 2*1024kB (UE) 2*2048kB (UE) 2*4096kB (MR) = 15760kB
[  226.664205] DMA32: 5546*4kB (UEM) 3635*8kB (UM) 883*16kB (UEM) 1*32kB (E) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 2*4096kB (R) = 73616kB
[  226.664238] Normal: 6743*4kB (UEM) 3198*8kB (UEM) 476*16kB (UEM) 28*32kB (UEM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 2*4096kB (R) = 69260kB
[  226.664272] 288140 total pagecache pages
[  226.664278] 1022 pages in swap cache
[  226.664283] Swap cache stats: add 1966179, delete 1965157, find 4253/530087
[  226.664290] Free swap  = 820972kB
[  226.664295] Total swap = 3999996kB
[  226.664300] 1022443 pages RAM
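
To make the "too soon" claim concrete, here is a rough sketch that pulls the relevant figures out of the log lines above. The function name `summarize` and the dictionary keys are made up for illustration, and the Normal zone line is truncated to the two fields used. It only does arithmetic on the numbers already shown; it is not how the kernel or Chrome evaluates memory pressure.

```python
import re

# Lines copied from the console-ramoops excerpt above.
# The "Normal" zone line is truncated to the fields used here.
LOG = """\
[  219.468263] entering low_mem (avail RAM = 409584 kB, avail swap 813608 kB) with lowest seen anon mem: 2122648 kB
[  226.663592] AudioOutputDevi invoked oom-killer: gfp_mask=0x2004d0, order=0, oom_score_adj=519
[  226.664121] Normal free:68908kB min:68960kB low:86200kB high:103440kB
[  226.664290] Free swap  = 820972kB
[  226.664295] Total swap = 3999996kB
"""


def summarize(log: str) -> dict:
    """Extract a few figures from the excerpt (hypothetical helper)."""
    t_low = float(re.search(r"\[\s*([\d.]+)\] entering low_mem", log).group(1))
    t_oom = float(re.search(r"\[\s*([\d.]+)\] \S+ invoked oom-killer", log).group(1))
    zone = re.search(r"Normal free:(\d+)kB min:(\d+)kB", log)
    free_kb, min_kb = int(zone.group(1)), int(zone.group(2))
    free_swap = int(re.search(r"Free swap\s*=\s*(\d+)kB", log).group(1))
    total_swap = int(re.search(r"Total swap\s*=\s*(\d+)kB", log).group(1))
    return {
        # Time between entering low_mem and the first OOM kill.
        "seconds_low_mem_to_oom": t_oom - t_low,
        # Whether the Normal zone's free memory is below its min watermark.
        "normal_below_min": free_kb < min_kb,
        # Share of swap still unused when the OOM killer ran.
        "free_swap_fraction": free_swap / total_swap,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(key, value)
```

On these numbers the OOM killer fires about 7.2 seconds after low_mem is entered, with the Normal zone just under its min watermark (68908 kB free vs. 68960 kB min) and roughly 20% of swap still free.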


Also, we had just entered low mem, so tab discarding should have started, but there are no discards in the chrome log.


Then we get the alt-volup-X panic, attached below for convenience.  Most processes are blocked inside sys_poll().  Some are allocating.  Nothing stands out.


Attachment: panic (41.6 KB)
Components: OS>Kernel
Labels: OS-Chrome
It seems like the discarder should be working according to crbug.com/705185. This was run on 59.0.3065.0 canary, so I believe it should have the fixes from that bug.
Another example of premature OOM kills is in this report, also from Chris:

https://feedback.corp.google.com/#/Report/57280301703

I wonder whether the thread that listens for the low-memory notification can get blocked, so that it doesn't receive the notification immediately?
Cc: cylee@chromium.org
I don't think that there is a "listening" thread any longer; it's been replaced by a polling thread.  But the polling should be fairly frequent, about once a second.

Re #5: hmm, I wonder whether that is really frequent enough, or even why it's better to poll rather than listen for the signal from the kernel...
Polling is certainly worse for two reasons: 1. latency; 2. it prevents the system from quiescing.

We used to wait on a select().  The change to polling was made a few years back by skuhne.  I think tab discarding was added to other OSes as well, and those OSes don't have a low-memory notifier, so we lost this feature for the sake of unifying the code.
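
For anyone following along, a minimal sketch of the two approaches being compared. This is not the Chrome code: the function names are invented, and an ordinary pipe stands in for whatever file descriptor the kernel's low-memory notifier exposes (POSIX only). The point is just that a select()-style wait returns as soon as the descriptor becomes readable and causes no wakeups in between, while a polling loop wakes every interval and can be up to one interval late.

```python
import os
import select
import time


def wait_for_event(fd: int, timeout: float) -> bool:
    """Block until fd is readable or the timeout expires.

    The caller sleeps in the kernel the whole time, so there are no
    periodic wakeups and the reaction is immediate.
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)


def poll_for_event(is_set, interval: float, timeout: float) -> bool:
    """Check a condition every `interval` seconds until it holds or time runs out.

    Worst-case latency is about one interval, and the loop keeps waking
    the process even when nothing has happened.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_set():
            return True
        time.sleep(interval)
    return is_set()


if __name__ == "__main__":
    # A pipe is a stand-in for the notifier descriptor (assumption).
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"x")  # simulate the low-memory event firing
    print("select-style wait saw event:", wait_for_event(read_fd, timeout=1.0))
    print("polling saw event:", poll_for_event(lambda: True, interval=1.0, timeout=2.0))
    os.close(read_fd)
    os.close(write_fd)
```

With a one-second polling interval, as mentioned above, the polling variant could notice the event up to a second after it fires.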

Summary: caroline with new memory parameters gets OOM kills too soon (was: caroline with new memory parameter gets OOM kills too soon)
Looking at the original report, I see dm_bufio stuff all over the place.  I'll make the same comments I did in bug #710857, comment #2.  Maybe someone can test and land:

  https://chromium-review.googlesource.com/c/423253/ - UPSTREAM: dm bufio: don't take the lock in dm_bufio_shrink_count
  https://chromium-review.googlesource.com/c/423252/ - UPSTREAM: dm bufio: drop the lock when doing GFP_NOIO allocation

...but I guess maybe we should keep this bug about the fact that the tab discarder isn't running properly in this case...
Oh amazing, we've seen bufio deadlocks since 2013 (issue 248606).

I can test these on a caroline.  You don't have a specific test in mind, do you?  I haven't reproduced this on my caroline, but maybe Chris can help me do it.

Comment 11 by igo@chromium.org, Apr 13 2017

Sure. I can set up the original unit tomorrow and see if we can repro.
Cc: bccheng@chromium.org
Ben, we can probably close this, right?
Status: Fixed (was: Untriaged)
Marked as fixed since R61-9635.
Status: Verified (was: Fixed)
Not reproducible in Chrome OS 9690.0.0, 61.0.3138.0. 
