low-memory-notify is not triggered properly on nyan boards
Issue description
In the following nyan-blaze feedback report, the OOM killer is invoked because low-memory-notify is not triggered: https://listnr.corp.google.com/product/208/report/85665344426
This issue can be reproduced by running platform_LowMemoryTest on nyan boards: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/testDetails?testName=platform_LowMemoryTest
,
Sep 28
Low-memory-notify is not triggered because there is a lot of free memory in the Normal zone (low memory). User processes cannot use these free Normal zone pages because lowmem_reserve reserves ~80 MB of them.
[ 4523.789003] Normal free:83816kB min:3452kB low:4312kB high:5176kB active_anon:14096kB inactive_anon:14908kB active_file:15536kB inactive_file:15572kB unevictable:0kB isolated(anon):100kB isolated(file):0kB present:778240kB managed:746296kB mlocked:0kB dirty:0kB writeback:0kB mapped:27124kB shmem:11692kB slab_reclaimable:10296kB slab_unreclaimable:37168kB kernel_stack:6144kB pagetables:33512kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1122 all_unreclaimable? yes
...
[ 4523.789009] lowmem_reserve[]: 0 20094 20094
Possible solution: Raise the low-memory-notify margin to compensate for lowmem_reserve.
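The arithmetic behind this can be checked with a small sketch. It models only the basic order-0 watermark test (free pages must exceed the min watermark plus lowmem_reserve) using the numbers from the log above. The 4 kB page size and the function name `zone_watermark_passes` are assumptions for illustration, and the real kernel check also adjusts the watermark for allocation flags, which is ignored here.

```python
PAGE_KB = 4  # assumed page size on this board


def zone_watermark_passes(free_kb, min_kb, lowmem_reserve_pages):
    """Simplified order-0 check: free must exceed min watermark + lowmem_reserve."""
    reserve_kb = lowmem_reserve_pages * PAGE_KB
    return free_kb > min_kb + reserve_kb


# Values copied from the Normal zone line and the lowmem_reserve[] line.
normal_free_kb = 83816
normal_min_kb = 3452
lowmem_reserve_pages = 20094

reserve_kb = lowmem_reserve_pages * PAGE_KB   # 80376 kB, roughly 80 MB
headroom_kb = normal_free_kb - reserve_kb     # what is left for a user allocation

print("lowmem_reserve:", reserve_kb, "kB")
print("free minus reserve:", headroom_kb, "kB, min watermark:", normal_min_kb, "kB")
print("user allocation allowed from Normal zone:",
      zone_watermark_passes(normal_free_kb, normal_min_kb, lowmem_reserve_pages))
```

Under these assumptions the Normal zone has about 3440 kB left after the reserve, just below the 3452 kB min watermark, so a user-space page fault cannot be served from it even though 83816 kB is reported free.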
,
Sep 28
I thought we subtracted out the reserved pages from available when we made the calculation -- maybe I'm missing something about how the zones are affecting this?
,
Sep 29
In get_available_mem_adj() in https://cs.corp.google.com/chromeos_public/src/third_party/kernel/v3.10/include/linux/low-mem-notify.h, min_free_pages is subtracted, but lowmem_reserve is not.
,
Oct 2
,
Oct 3
UMA Arc.OOMKills.Count also shows a worse OOM problem on nyan boards:
0.14% of user sessions experience 1 or more OOM kills on all boards. [UMA on all boards]
0.80% of user sessions experience 1 or more OOM kills on the nyan_blaze board. [UMA on nyan_blaze board]
[UMA on all boards]: https://uma.googleplex.com/p/chrome/histograms/?endDate=20181001&dayCount=1&histograms=Arc.OOMKills.Count&fixupData=true&showMax=true&filters=platform%2Ceq%2CC%2Cchannel%2Ceq%2C4%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial
[UMA on nyan_blaze board]: https://uma.googleplex.com/p/chrome/histograms/?endDate=20181001&dayCount=1&histograms=Arc.OOMKills.Count&fixupData=true&showMax=true&filters=platform%2Ceq%2CC%2Cchannel%2Ceq%2C4%2Chw_class%2Ceq%2CNYAN_BLAZE%2Cisofficial%2Ceq%2CTrue&implicitFilters=isofficial
Arc.OOMKills.Count is the cumulative count of OOM kills in one user session. If there are 3 OOM kills in a user session, the UMA of this session is "[0, 1): 1, [1, 2): 1, [2, 3): 1, [3, 4): 1". So in the UMA page, the [1, 2) bracket is the total number of sessions with 1 or more OOM kills.
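To make the bucket reading concrete, here is a minimal sketch of the cumulative recording described above. It is a model of the description only, not the Chrome metrics code; the function name `record_session` and the sample session list are made up for illustration.

```python
from collections import Counter


def record_session(histogram, oom_kills):
    """Add one sample to every bucket [0, 1) through [oom_kills, oom_kills + 1)."""
    for count in range(oom_kills + 1):
        histogram[count] += 1


# Hypothetical OOM kill counts for five user sessions.
sessions = [0, 0, 3, 1, 0]

histogram = Counter()
for kills in sessions:
    record_session(histogram, kills)

total_sessions = histogram[0]           # every session lands in [0, 1)
sessions_with_oom_kill = histogram[1]   # [1, 2) counts sessions with >= 1 kill

for bucket in sorted(histogram):
    print(f"[{bucket}, {bucket + 1}): {histogram[bucket]}")
print("share of sessions with 1 or more OOM kills:",
      sessions_with_oom_kill / total_sessions)
```

Dividing the [1, 2) bucket by the [0, 1) bucket gives the share of sessions with at least one OOM kill, which is how the percentages above can be read off the histogram.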
,
Oct 5
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/6cf6c12eaec831a8ea45197f161c1b6746cf00ba
commit 6cf6c12eaec831a8ea45197f161c1b6746cf00ba
Author: Kuo-Hsin Yang <vovoy@chromium.org>
Date: Fri Oct 05 22:43:49 2018
CHROMIUM: low_mem: exclude totalreserve_pages from available memory
totalreserve_pages is the reserve of pages that are not available to userspace allocations. totalreserve_pages includes lowmem_reserve which is higher on boards with highmem zones. Exclude totalreserve_pages instead of min_free from available memory.
BUG= chromium:890335
TEST=run platform_LowMemoryTest
Change-Id: I5568a5d438de334b22c62aa54cd82e90c868392e
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1260526
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
[modify] https://crrev.com/6cf6c12eaec831a8ea45197f161c1b6746cf00ba/include/linux/low-mem-notify.h
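The effect of the change can be illustrated with the numbers from the kernel trace in this report. The sketch below is not the code in low-mem-notify.h: it models only the free-page term of the available-memory calculation, assumes a 4 kB page size, and approximates totalreserve_pages as the sum over zones of the largest lowmem_reserve entry plus the high watermark, which is my understanding of how the kernel derives it. The helper name `available_free_pages` is hypothetical.

```python
PAGE_KB = 4  # assumed page size

# Per-zone values copied from the kernel log in this report.
zones = {
    "Normal": {"min_kb": 3452, "high_kb": 5176, "max_lowmem_reserve_pages": 20094},
    "HighMem": {"min_kb": 512, "high_kb": 6460, "max_lowmem_reserve_pages": 0},
}
free_pages = 21052  # "free:21052" in the Mem-info summary

# Old behaviour: only the min watermarks are excluded.
min_free_pages = sum(z["min_kb"] // PAGE_KB for z in zones.values())

# New behaviour (approximation, see lead-in): exclude totalreserve_pages.
totalreserve_pages = sum(
    z["max_lowmem_reserve_pages"] + z["high_kb"] // PAGE_KB for z in zones.values()
)


def available_free_pages(free, reserve):
    """Free pages counted as available once the reserve is excluded."""
    return max(free - reserve, 0)


before = available_free_pages(free_pages, min_free_pages)
after = available_free_pages(free_pages, totalreserve_pages)

print("excluding min_free:          ", before, "pages =", before * PAGE_KB, "kB")
print("excluding totalreserve_pages:", after, "pages =", after * PAGE_KB, "kB")
```

With these assumptions, excluding only min_free still counts about 78 MB of free pages as available at the moment of the OOM kill, while excluding totalreserve_pages counts none, so the notification threshold would have been crossed well before the OOM killer ran.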
,
Oct 7
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e8b54f180c2d22dedff858f077ab1393615ccf37
commit e8b54f180c2d22dedff858f077ab1393615ccf37
Author: Kuo-Hsin Yang <vovoy@chromium.org>
Date: Sun Oct 07 18:55:04 2018
CHROMIUM: low_mem: exclude totalreserve_pages from available memory
totalreserve_pages is the reserve of pages that are not available to userspace allocations. totalreserve_pages includes lowmem_reserve which is higher on boards with highmem zones. Exclude totalreserve_pages instead of min_free from available memory.
BUG= chromium:890335
TEST=run platform_LowMemoryTest
Change-Id: I5568a5d438de334b22c62aa54cd82e90c868392e
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1260526
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 6cf6c12eaec831a8ea45197f161c1b6746cf00ba)
Reviewed-on: https://chromium-review.googlesource.com/1267355
Commit-Ready: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Vovo Yang <vovoy@chromium.org>
[modify] https://crrev.com/e8b54f180c2d22dedff858f077ab1393615ccf37/include/linux/low-mem-notify.h
,
Oct 8
,
Oct 8
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/f0e317fc76743ff68593218ca1545f8ffb9e0d08
commit f0e317fc76743ff68593218ca1545f8ffb9e0d08
Author: Kuo-Hsin Yang <vovoy@chromium.org>
Date: Mon Oct 08 11:16:24 2018
CHROMIUM: low_mem: exclude totalreserve_pages from available memory
totalreserve_pages is the reserve of pages that are not available to userspace allocations. totalreserve_pages includes lowmem_reserve which is higher on boards with highmem zones. Exclude totalreserve_pages instead of min_free from available memory.
BUG= chromium:890335
TEST=run platform_LowMemoryTest
Change-Id: I5568a5d438de334b22c62aa54cd82e90c868392e
Signed-off-by: Kuo-Hsin Yang <vovoy@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1260526
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 6cf6c12eaec831a8ea45197f161c1b6746cf00ba)
Reviewed-on: https://chromium-review.googlesource.com/1267775
Commit-Ready: Vovo Yang <vovoy@chromium.org>
Reviewed-by: Vovo Yang <vovoy@chromium.org>
[modify] https://crrev.com/f0e317fc76743ff68593218ca1545f8ffb9e0d08/include/linux/low-mem-notify.h
,
Oct 24
This has been a long-standing problem, correct? Is there anything that suggests that this may be correlated with the sudden growth in these memory health charts at the end of August and beginning of September?
https://dasnav.corp.google.com/dnsezg0l/#dimensions=board:240,channel:5,platform:17&dimensions=milestone:21&granularity=week&page=memory_health&start=P364D&view=default
,
Oct 30
Yes, it's a long-standing problem.
Comment 1 by vovoy@chromium.org, Sep 28
Kernel trace in the feedback report:
[ 4523.788738] chrome invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=300
[ 4523.788754] CPU: 3 PID: 3724 Comm: chrome Tainted: G C 3.10.18 #1
[ 4523.788776] [<c020cf48>] (unwind_backtrace+0x0/0x110) from [<c020a050>] (show_stack+0x20/0x24)
[ 4523.788789] [<c020a050>] (show_stack+0x20/0x24) from [<c07691e0>] (dump_stack+0x20/0x28)
[ 4523.788801] [<c07691e0>] (dump_stack+0x20/0x28) from [<c0768848>] (dump_header.isra.12+0x88/0x1b4)
[ 4523.788814] [<c0768848>] (dump_header.isra.12+0x88/0x1b4) from [<c02c7fa0>] (oom_kill_process+0xc0/0x404)
[ 4523.788824] [<c02c7fa0>] (oom_kill_process+0xc0/0x404) from [<c02c874c>] (out_of_memory+0x240/0x2e8)
[ 4523.788835] [<c02c874c>] (out_of_memory+0x240/0x2e8) from [<c02cbb90>] (__alloc_pages_nodemask+0x834/0x8b4)
[ 4523.788848] [<c02cbb90>] (__alloc_pages_nodemask+0x834/0x8b4) from [<c02e5474>] (handle_pte_fault+0x13c/0x7e8)
[ 4523.788859] [<c02e5474>] (handle_pte_fault+0x13c/0x7e8) from [<c02e6ab0>] (handle_mm_fault+0x120/0x154)
[ 4523.788868] [<c02e6ab0>] (handle_mm_fault+0x120/0x154) from [<c0213f08>] (do_page_fault+0x12c/0x390)
[ 4523.788878] [<c0213f08>] (do_page_fault+0x12c/0x390) from [<c02001d0>] (do_DataAbort+0x48/0xc4)
[ 4523.788889] [<c02001d0>] (do_DataAbort+0x48/0xc4) from [<c0205cb8>] (__dabt_usr+0x38/0x40)
[ 4523.788895] Exception stack(0xe661bfb0 to 0xe661bff8)
[ 4523.788902] bfa0: ba24d000 00000000 ba2c8000 00006e75
[ 4523.788909] bfc0: b91e0960 ba19e2e8 ba24cff8 be936fb8 afe7cec0 00000007 253320e6 b9327c00
[ 4523.788915] bfe0: b18eab89 be936f70 afef3535 afef3538 a80f0030 ffffffff
[ 4523.788921] Mem-info:
[ 4523.788926] Normal per-cpu:
[ 4523.788932] CPU 0: hi: 186, btch: 31 usd: 27
[ 4523.788938] CPU 1: hi: 186, btch: 31 usd: 69
[ 4523.788943] CPU 2: hi: 186, btch: 31 usd: 57
[ 4523.788949] CPU 3: hi: 186, btch: 31 usd: 36
[ 4523.788954] HighMem per-cpu:
[ 4523.788959] CPU 0: hi: 186, btch: 31 usd: 87
[ 4523.788964] CPU 1: hi: 186, btch: 31 usd: 58
[ 4523.788970] CPU 2: hi: 186, btch: 31 usd: 64
[ 4523.788975] CPU 3: hi: 186, btch: 31 usd: 50
[ 4523.788985] active_anon:15085 inactive_anon:6790 isolated_anon:55 active_file:21875 inactive_file:16699 isolated_file:0 unevictable:0 dirty:6 writeback:2 unstable:0 free:21052 slab_reclaimable:2574 slab_unreclaimable:9292 mapped:44956 shmem:3623 pagetables:8378 bounce:0 free_cma:0
[ 4523.789003] Normal free:83816kB min:3452kB low:4312kB high:5176kB active_anon:14096kB inactive_anon:14908kB active_file:15536kB inactive_file:15572kB unevictable:0kB isolated(anon):100kB isolated(file):0kB present:778240kB managed:746296kB mlocked:0kB dirty:0kB writeback:0kB mapped:27124kB shmem:11692kB slab_reclaimable:10296kB slab_unreclaimable:37168kB kernel_stack:6144kB pagetables:33512kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1122 all_unreclaimable? yes
[ 4523.789009] lowmem_reserve[]: 0 20094 20094
[ 4523.789033] HighMem free:392kB min:512kB low:3484kB high:6460kB active_anon:46244kB inactive_anon:12252kB active_file:71964kB inactive_file:51224kB unevictable:0kB isolated(anon):120kB isolated(file):0kB present:1286056kB managed:2572112kB mlocked:0kB dirty:24kB writeback:8kB mapped:152700kB shmem:2800kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:422108kB free_cma:0kB writeback_tmp:0kB pages_scanned:708 all_unreclaimable? yes
[ 4523.789040] lowmem_reserve[]: 0 0 0
[ 4523.789056] Normal: 13400*4kB (UEM) 3400*8kB (M) 7*16kB (R) 0*32kB 0*64kB 0*128kB 1*256kB (R) 1*512kB (R) 0*1024kB 1*2048kB (R) 0*4096kB = 83728kB
[ 4523.789118] HighMem: 98*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 392kB
[ 4523.789165] 42454 total pagecache pages
[ 4523.789171] 229 pages in swap cache
[ 4523.789176] Swap cache stats: add 3165400, delete 3165171, find 50337/1337506
[ 4523.789181] Free swap = 1908kB
[ 4523.789186] Total swap = 1985096kB
[ 4523.800118] 516074 pages of RAM
[ 4523.800129] 22418 free pages
[ 4523.800134] 8214 reserved pages
[ 4523.800139] 8826 slab pages
[ 4523.800144] 581269 pages shared
[ 4523.800149] 227 pages swap cached