
Issue 890335

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Oct 8
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug

Blocked on:
issue 891178




low-memory-notify is not triggered properly on nyan boards

Project Member Reported by vovoy@chromium.org, Sep 28

Issue description

In the following nyan-blaze feedback report, the OOM killer is invoked because low-memory-notify is not triggered.
https://listnr.corp.google.com/product/208/report/85665344426

This issue can be reproduced by running platform_LowMemoryTest on nyan boards.
https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/testDetails?testName=platform_LowMemoryTest
 
Kernel trace in the feedback report:

[ 4523.788738] chrome invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=300
[ 4523.788754] CPU: 3 PID: 3724 Comm: chrome Tainted: G         C   3.10.18 #1
[ 4523.788776] [<c020cf48>] (unwind_backtrace+0x0/0x110) from [<c020a050>] (show_stack+0x20/0x24)
[ 4523.788789] [<c020a050>] (show_stack+0x20/0x24) from [<c07691e0>] (dump_stack+0x20/0x28)
[ 4523.788801] [<c07691e0>] (dump_stack+0x20/0x28) from [<c0768848>] (dump_header.isra.12+0x88/0x1b4)
[ 4523.788814] [<c0768848>] (dump_header.isra.12+0x88/0x1b4) from [<c02c7fa0>] (oom_kill_process+0xc0/0x404)
[ 4523.788824] [<c02c7fa0>] (oom_kill_process+0xc0/0x404) from [<c02c874c>] (out_of_memory+0x240/0x2e8)
[ 4523.788835] [<c02c874c>] (out_of_memory+0x240/0x2e8) from [<c02cbb90>] (__alloc_pages_nodemask+0x834/0x8b4)
[ 4523.788848] [<c02cbb90>] (__alloc_pages_nodemask+0x834/0x8b4) from [<c02e5474>] (handle_pte_fault+0x13c/0x7e8)
[ 4523.788859] [<c02e5474>] (handle_pte_fault+0x13c/0x7e8) from [<c02e6ab0>] (handle_mm_fault+0x120/0x154)
[ 4523.788868] [<c02e6ab0>] (handle_mm_fault+0x120/0x154) from [<c0213f08>] (do_page_fault+0x12c/0x390)
[ 4523.788878] [<c0213f08>] (do_page_fault+0x12c/0x390) from [<c02001d0>] (do_DataAbort+0x48/0xc4)
[ 4523.788889] [<c02001d0>] (do_DataAbort+0x48/0xc4) from [<c0205cb8>] (__dabt_usr+0x38/0x40)
[ 4523.788895] Exception stack(0xe661bfb0 to 0xe661bff8)
[ 4523.788902] bfa0:                                     ba24d000 00000000 ba2c8000 00006e75
[ 4523.788909] bfc0: b91e0960 ba19e2e8 ba24cff8 be936fb8 afe7cec0 00000007 253320e6 b9327c00
[ 4523.788915] bfe0: b18eab89 be936f70 afef3535 afef3538 a80f0030 ffffffff
[ 4523.788921] Mem-info:
[ 4523.788926] Normal per-cpu:
[ 4523.788932] CPU    0: hi:  186, btch:  31 usd:  27
[ 4523.788938] CPU    1: hi:  186, btch:  31 usd:  69
[ 4523.788943] CPU    2: hi:  186, btch:  31 usd:  57
[ 4523.788949] CPU    3: hi:  186, btch:  31 usd:  36
[ 4523.788954] HighMem per-cpu:
[ 4523.788959] CPU    0: hi:  186, btch:  31 usd:  87
[ 4523.788964] CPU    1: hi:  186, btch:  31 usd:  58
[ 4523.788970] CPU    2: hi:  186, btch:  31 usd:  64
[ 4523.788975] CPU    3: hi:  186, btch:  31 usd:  50
[ 4523.788985] active_anon:15085 inactive_anon:6790 isolated_anon:55
                active_file:21875 inactive_file:16699 isolated_file:0
                unevictable:0 dirty:6 writeback:2 unstable:0
                free:21052 slab_reclaimable:2574 slab_unreclaimable:9292
                mapped:44956 shmem:3623 pagetables:8378 bounce:0
                free_cma:0
[ 4523.789003] Normal free:83816kB min:3452kB low:4312kB high:5176kB active_anon:14096kB inactive_anon:14908kB active_file:15536kB inactive_file:15572kB unevictable:0kB isolated(anon):100kB isolated(file):0kB present:778240kB managed:746296kB mlocked:0kB dirty:0kB writeback:0kB mapped:27124kB shmem:11692kB slab_reclaimable:10296kB slab_unreclaimable:37168kB kernel_stack:6144kB pagetables:33512kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1122 all_unreclaimable? yes
[ 4523.789009] lowmem_reserve[]: 0 20094 20094
[ 4523.789033] HighMem free:392kB min:512kB low:3484kB high:6460kB active_anon:46244kB inactive_anon:12252kB active_file:71964kB inactive_file:51224kB unevictable:0kB isolated(anon):120kB isolated(file):0kB present:1286056kB managed:2572112kB mlocked:0kB dirty:24kB writeback:8kB mapped:152700kB shmem:2800kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:422108kB free_cma:0kB writeback_tmp:0kB pages_scanned:708 all_unreclaimable? yes
[ 4523.789040] lowmem_reserve[]: 0 0 0
[ 4523.789056] Normal: 13400*4kB (UEM) 3400*8kB (M) 7*16kB (R) 0*32kB 0*64kB 0*128kB 1*256kB (R) 1*512kB (R) 0*1024kB 1*2048kB (R) 0*4096kB = 83728kB
[ 4523.789118] HighMem: 98*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 392kB
[ 4523.789165] 42454 total pagecache pages
[ 4523.789171] 229 pages in swap cache
[ 4523.789176] Swap cache stats: add 3165400, delete 3165171, find 50337/1337506
[ 4523.789181] Free swap  = 1908kB
[ 4523.789186] Total swap = 1985096kB
[ 4523.800118] 516074 pages of RAM
[ 4523.800129] 22418 free pages
[ 4523.800134] 8214 reserved pages
[ 4523.800139] 8826 slab pages
[ 4523.800144] 581269 pages shared
[ 4523.800149] 227 pages swap cached

low-memory-notify is not triggered because the Normal zone (low memory) still has a lot of free memory. User processes cannot use these free Normal-zone pages because lowmem_reserve reserves 20094 pages (~80 MB).

[ 4523.789003] Normal free:83816kB min:3452kB low:4312kB high:5176kB active_anon:14096kB inactive_anon:14908kB active_file:15536kB inactive_file:15572kB unevictable:0kB isolated(anon):100kB isolated(file):0kB present:778240kB managed:746296kB mlocked:0kB dirty:0kB writeback:0kB mapped:27124kB shmem:11692kB slab_reclaimable:10296kB slab_unreclaimable:37168kB kernel_stack:6144kB pagetables:33512kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1122 all_unreclaimable? yes
...
[ 4523.789009] lowmem_reserve[]: 0 20094 20094
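
To make the numbers above concrete, here is a rough sketch of the order-0 zone watermark check, written in Python for brevity. It is a simplification, not the kernel code: the real check also adjusts for allocation flags and higher orders, and the function and variable names here are made up for illustration. The values are taken from the Normal zone line and the lowmem_reserve[] line of the trace; the 4 kB page size is an assumption based on the "*4kB" buddy-list entries.

```python
PAGE_KB = 4  # assumed page size, inferred from the "N*4kB" entries in the trace


def kb_to_pages(kb):
    """Convert a kB figure from the trace into pages."""
    return kb // PAGE_KB


def zone_allows_alloc(free_pages, min_pages, lowmem_reserve_pages):
    """Simplified order-0 watermark check (hypothetical helper).

    The allocation is refused when the zone's free pages do not exceed
    the min watermark plus the lowmem reserve that applies to the caller.
    """
    return free_pages > min_pages + lowmem_reserve_pages


normal_free = kb_to_pages(83816)  # "Normal free:83816kB"
normal_min = kb_to_pages(3452)    # "min:3452kB"
reserve_vs_highmem = 20094        # "lowmem_reserve[]: 0 20094 20094"

# A user page fault may be served from HighMem, so the Normal zone
# protects itself with the full lowmem_reserve against that request.
user_ok = zone_allows_alloc(normal_free, normal_min, reserve_vs_highmem)

# An allocation that targets the Normal zone itself sees no reserve.
kernel_ok = zone_allows_alloc(normal_free, normal_min, 0)

shortfall = normal_min + reserve_vs_highmem - normal_free

print("user allocation allowed from Normal:", user_ok)
print("kernel allocation allowed from Normal:", kernel_ok)
print("pages short for the user allocation:", shortfall)
```

Under these assumptions the Normal zone falls just short for a highmem-capable user allocation even though roughly 80 MB is free, while an allocation aimed at the Normal zone itself would still pass. That is consistent with the OOM kill in the trace.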

Possible solution:
Raise the low-memory-notify margin to compensate for lowmem_reserve.
I thought we subtracted out the reserved pages from available when we made the calculation -- maybe I'm missing something about how the zones are affecting this?
See get_available_mem_adj() in https://cs.corp.google.com/chromeos_public/src/third_party/kernel/v3.10/include/linux/low-mem-notify.h
min_free_pages is subtracted there, but lowmem_reserve is not.
Blockedon: 891178
UMA Arc.OOMKills.Count also shows a worse OOM problem on nyan boards:

0.14% of user sessions experience 1 or more OOM kills on all boards. [UMA on all boards]
0.80% of user sessions experience 1 or more OOM kills on the nyan_blaze board. [UMA on nyan_blaze board]

[UMA on all boards]:
https://uma.googleplex.com/p/chrome/histograms/?endDate=20181001&dayCount=1&histograms=Arc.OOMKills.Count&fixupData=true&showMax=true&filters=platform%2Ceq%2CC%2Cchannel%2Ceq%2C4%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial

[UMA on nyan_blaze board]:
https://uma.googleplex.com/p/chrome/histograms/?endDate=20181001&dayCount=1&histograms=Arc.OOMKills.Count&fixupData=true&showMax=true&filters=platform%2Ceq%2CC%2Cchannel%2Ceq%2C4%2Chw_class%2Ceq%2CNYAN_BLAZE%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial

Arc.OOMKills.Count is the cumulative count of OOM kills in one user session. If there are 3 OOM kills in a user session, the UMA samples for that session are "[0, 1): 1, [1, 2): 1, [2, 3): 1, [3, 4): 1". So on the UMA page, the [1, 2) bracket is the total number of sessions with 1 or more OOM kills.
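
As a small illustration of how to read this histogram, the sketch below models the cumulative reporting described above and derives the share of sessions with at least one OOM kill. The helper names and the sample session list are invented for the example; only the reporting scheme (one sample per cumulative count, starting at 0) comes from the description.

```python
from collections import Counter


def session_samples(oom_kills):
    """Samples one session contributes: 0, 1, ..., oom_kills."""
    return list(range(oom_kills + 1))


def aggregate(sessions):
    """Build the histogram (bucket start -> count) over many sessions."""
    hist = Counter()
    for kills in sessions:
        hist.update(session_samples(kills))
    return hist


def fraction_with_oom(hist):
    """Sessions with >= 1 OOM kill divided by all sessions.

    Every session lands in bucket [0, 1), and every session with at
    least one kill also lands in bucket [1, 2).
    """
    return hist[1] / hist[0]


# Hypothetical data: four sessions with 0, 0, 3 and 1 OOM kills.
hist = aggregate([0, 0, 3, 1])
print(dict(hist))
print(fraction_with_oom(hist))
```

The percentages quoted above would then correspond to the [1, 2) count divided by the [0, 1) count on each UMA page.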
Project Member

Comment 7 by bugdroid1@chromium.org, Oct 5

Labels: merge-merged-chromeos-3.10
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/6cf6c12eaec831a8ea45197f161c1b6746cf00ba

commit 6cf6c12eaec831a8ea45197f161c1b6746cf00ba
Author: Kuo-Hsin Yang <vovoy@chromium.org>
Date: Fri Oct 05 22:43:49 2018

CHROMIUM: low_mem: exclude totalreserve_pages from available memory

totalreserve_pages is the reserve of pages that are not available to
userspace allocations. totalreserve_pages includes lowmem_reserve which
is higher on boards with highmem zones. Exclude totalreserve_pages
instead of min_free from available memory.

BUG= chromium:890335 
TEST=run platform_LowMemoryTest

Change-Id: I5568a5d438de334b22c62aa54cd82e90c868392e
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1260526
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/6cf6c12eaec831a8ea45197f161c1b6746cf00ba/include/linux/low-mem-notify.h
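
For a rough sense of what the change means for the trace in the issue description, the sketch below compares "free minus min watermarks" with "free minus totalreserve_pages". It is not the code from low-mem-notify.h: the real available-memory calculation has other terms that are not modeled here, and the per-zone reserve is approximated as the largest lowmem_reserve entry plus the high watermark (the kernel additionally caps this at the zone size, which does not matter for these values). All names are illustrative, and a 4 kB page size is assumed.

```python
PAGE_KB = 4  # assumed page size

# Per-zone figures copied from the kernel trace in the issue description.
zones = {
    "Normal": {"min_kb": 3452, "high_kb": 5176, "max_lowmem_reserve": 20094},
    "HighMem": {"min_kb": 512, "high_kb": 6460, "max_lowmem_reserve": 0},
}

free_pages = 21052  # "free:21052" in the Mem-info summary

# Old behaviour (as described in the discussion): subtract only min_free.
min_free_pages = sum(z["min_kb"] // PAGE_KB for z in zones.values())
old_available = max(0, free_pages - min_free_pages)

# New behaviour (as described in the commit message): subtract the total
# reserve, approximated per zone as max lowmem_reserve + high watermark.
totalreserve_pages = sum(
    z["max_lowmem_reserve"] + z["high_kb"] // PAGE_KB for z in zones.values()
)
new_available = max(0, free_pages - totalreserve_pages)

print("min_free_pages:", min_free_pages)
print("totalreserve_pages (approx.):", totalreserve_pages)
print("old estimate (kB):", old_available * PAGE_KB)
print("new estimate (kB):", new_available * PAGE_KB)
```

Under these assumptions the old estimate still counts about 78 MB of free memory as available at the moment of the OOM kill, while the new estimate drops to zero, so a notification threshold based on it would have been crossed earlier.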

Project Member

Comment 8 by bugdroid1@chromium.org, Oct 7

Labels: merge-merged-chromeos-3.8
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e8b54f180c2d22dedff858f077ab1393615ccf37

commit e8b54f180c2d22dedff858f077ab1393615ccf37
Author: Kuo-Hsin Yang <vovoy@chromium.org>
Date: Sun Oct 07 18:55:04 2018

CHROMIUM: low_mem: exclude totalreserve_pages from available memory

totalreserve_pages is the reserve of pages that are not available to
userspace allocations. totalreserve_pages includes lowmem_reserve which
is higher on boards with highmem zones. Exclude totalreserve_pages
instead of min_free from available memory.

BUG= chromium:890335 
TEST=run platform_LowMemoryTest

Change-Id: I5568a5d438de334b22c62aa54cd82e90c868392e
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1260526
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 6cf6c12eaec831a8ea45197f161c1b6746cf00ba)
Reviewed-on: https://chromium-review.googlesource.com/1267355
Commit-Ready: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Vovo Yang <vovoy@chromium.org>

[modify] https://crrev.com/e8b54f180c2d22dedff858f077ab1393615ccf37/include/linux/low-mem-notify.h

Status: Fixed (was: Assigned)
Project Member

Comment 10 by bugdroid1@chromium.org, Oct 8

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/f0e317fc76743ff68593218ca1545f8ffb9e0d08

commit f0e317fc76743ff68593218ca1545f8ffb9e0d08
Author: Kuo-Hsin Yang <vovoy@chromium.org>
Date: Mon Oct 08 11:16:24 2018

CHROMIUM: low_mem: exclude totalreserve_pages from available memory

totalreserve_pages is the reserve of pages that are not available to
userspace allocations. totalreserve_pages includes lowmem_reserve which
is higher on boards with highmem zones. Exclude totalreserve_pages
instead of min_free from available memory.

BUG= chromium:890335 
TEST=run platform_LowMemoryTest

Change-Id: I5568a5d438de334b22c62aa54cd82e90c868392e
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1260526
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 6cf6c12eaec831a8ea45197f161c1b6746cf00ba)
Reviewed-on: https://chromium-review.googlesource.com/1267775
Commit-Ready: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Vovo Yang <vovoy@chromium.org>

[modify] https://crrev.com/f0e317fc76743ff68593218ca1545f8ffb9e0d08/include/linux/low-mem-notify.h

This has been a long-standing problem, correct? Is there anything that suggests that this may be correlated with the sudden growth in these memory health charts at the end of August and beginning of September?

https://dasnav.corp.google.com/dnsezg0l/#dimensions=board:240,channel:5,platform:17&dimensions=milestone:21&granularity=week&page=memory_health&start=P364D&view=default
Yes, it's a long-standing problem.
