Devices get OOM when running kiosk apps (Rise, StratosMedia, Chrome Sign Builder)
Issue description: CHROMEOS_RELEASE_DESCRIPTION=7978.29.0 dev-channel zako test. HP Zako rebooted while running kiosk app Rise Player 15.10.7.9312 (App id: odjaaghiehpobimgdjjfofmablbaleem).
,
Apr 15 2016
Zako's OS keeps crashing; the device has crashed three times in less than 24 hours while running a kiosk app in a longevity test. Please see attached logs.
,
Apr 15 2016
I took a quick look at the first couple of reboots in the logs attached to #2. I don't see any evidence of a Chrome crash. The reboots all happened after an ssh login. It appears we are running the longevity_Tracker autotest. Could the reboots be coming from the test?
,
Apr 15 2016
Thanks for checking. The longevity_Tracker test runs for 23 hours and doesn't request any reboots. Please see the traces from the Linux terminal, captured after the test is started and again a few hours later when it crashes: https://docs.google.com/document/d/1DsrhAaBxM3coJiIdOkXPkwA_Mryj-6HIr9JhMbUzBi4/edit
,
Apr 15 2016
I cannot correlate the autotest log with the logs in #2. The timestamp in the autotest log indicates the test starts at 18:47:32 and the DUT rebooted at 11:22:27, but I could not map this back to the diagnostic logs. And there is no crash found in the autotest log, so Chrome is not crashing. Something (the app, Chrome, or the test scripts?) must have explicitly requested the device to reboot.
,
Apr 15 2016
I should provide more information about the test that is running. The Zako device is running a graphics-intensive kiosk app (Rise Player, App id: mfpgpdablffhbfofnhlpgmokokbahooi), and the device is also running longevity_Tracker, which collects performance data (CPU/memory utilization and temperature) for 23 hours straight and then ends. There has been a license problem with the kiosk app where it stops and returns to the sign-in screen after a few hours of running.
,
Apr 15 2016
Are we seeing the license problem here (or any problem that causes the kiosk app to exit)? Unfortunately, we don't log restarts from kiosk apps, but that seems plausible from what I see in the logs.
,
Apr 15 2016
Yes, the license problem is making the app stop running after a few hours of activity. After the app quits, CPU utilization drops from about 40% to about 1%, and memory usage drops from around 95%.
,
Apr 18 2016
In this case, I'd say this is WAI (working as intended), since the kiosk code is doing what it is supposed to do.
,
Apr 18 2016
Matt, Raj, Alex: would you get us the Rise Vision Rise Player license keys we need to continue longevity testing on the Rise Player? We need at least two keys; more would be appreciated as backups, in case one or more of them fails.
,
Apr 23 2016
From Rise: We have compiled two schedule types, for typical and for more media-intensive presentations. You can input the following Display IDs:

Display IDs:
Google_Testing_Content_Typical1  W9ZWFXZNRT9D
Google_Testing_Content_Typical2  JRYVP9V62NQC
Google_Testing_Content_Typical3  ZNENQYRPRD3B
Google_Testing_Content_Intense1  UZ3BPGE55KK7
Google_Testing_Content_Intense2  UQ7Z4KHGVGG8

Schedules:
Google_Testing_Content_Typical1-3 (W9ZWFXZNRT9D, JRYVP9V62NQC, ZNENQYRPRD3B): currently a mix of uptime and content-testing presentations.
Google_Testing_Content_Intense1-2 (UZ3BPGE55KK7, UQ7Z4KHGVGG8): uses video content-testing items and three instances of a presentation with two image galleries holding 120+ images.
,
Apr 27 2016
Zako and Ninja devices crashed while running Rise Player with intense content. Zako is running Google_Testing_Content_Intense2 (UQ7Z4KHGVGG8); Ninja is running Google_Testing_Content_Intense1 (UZ3BPGE55KK7). Please see attached logs.
,
Apr 28 2016
The logs in #12 are 0 bytes.
,
Apr 28 2016
The device is running out of space and unable to generate logs; please see attachment.
,
Apr 28 2016
Could the crash then be caused by running out of disk space? Would a reboot help?
,
Apr 28 2016
I rebooted zako and was able to generate logs, please see attached. Thanks
,
Apr 28 2016
Are you running some OOM test scripts? I saw this line in the kernel log, roughly before the 4-24 17:44 reboot:

2016-04-27T17:32:03.241903-07:00 WARNING kernel: [ 2810.596145] autotest invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=-1000

The DUT seems to have rebooted due to out of memory. There are Chrome kills due to OOM, e.g.:

2016-04-27T17:39:42.681971-07:00 ERR kernel: [ 3271.275989] Out of memory: Kill process 8224 (chrome) score 462 or sacrifice child
2016-04-27T17:39:42.681973-07:00 ERR kernel: [ 3271.276004] Killed process 8224 (chrome) total-vm:2064260kB, anon-rss:94240kB, file-rss:143396kB

and ext4 process kills:

2016-04-27T17:32:03.242422-07:00 ERR kernel: [ 2810.607633] Out of memory: Kill process 7890 (Compositor) score 497 or sacrifice child
2016-04-27T17:32:03.242424-07:00 ERR kernel: [ 2810.607646] Killed process 7890 (Compositor) total-vm:2210868kB, anon-rss:380900kB, file-rss:269276kB
2016-04-27T17:32:03.242427-07:00 WARNING kernel: [ 2810.607773] Compositor: page allocation failure: order:0, mode:0x20058
2016-04-27T17:32:03.242429-07:00 NOTICE kernel: [ 2810.607786] Pid: 7890, comm: Compositor Tainted: G WC 3.8.11 #1
2016-04-27T17:32:03.242430-07:00 NOTICE kernel: [ 2810.607796] Call Trace:
2016-04-27T17:32:03.242432-07:00 NOTICE kernel: [ 2810.607810] [<ffffffff978bdc7e>] warn_alloc_failed+0x135/0x15f
2016-04-27T17:32:03.242434-07:00 NOTICE kernel: [ 2810.607822] [<ffffffff978c0393>] __alloc_pages_nodemask+0x547/0x692
2016-04-27T17:32:03.242436-07:00 NOTICE kernel: [ 2810.607837] [<ffffffff978b9cf7>] find_or_create_page+0x49/0x91
2016-04-27T17:32:03.242437-07:00 NOTICE kernel: [ 2810.607849] [<ffffffff97918675>] __getblk+0x171/0x26d
2016-04-27T17:32:03.242439-07:00 NOTICE kernel: [ 2810.607862] [<ffffffff97982a0a>] ext4_get_branch+0x78/0x117
2016-04-27T17:32:03.242443-07:00 NOTICE kernel: [ 2810.607874] [<ffffffff97982b98>] ext4_ind_map_blocks+0xef/0x513
2016-04-27T17:32:03.242445-07:00 NOTICE kernel: [ 2810.607887] [<ffffffff9786371a>] ? set_next_entity+0x44/0x9b
2016-04-27T17:32:03.242446-07:00 NOTICE kernel: [ 2810.607899] [<ffffffff9794f7e1>] ext4_map_blocks+0x68/0x22a
2016-04-27T17:32:03.242448-07:00 NOTICE kernel: [ 2810.607910] [<ffffffff9795174a>] _ext4_get_block+0xd6/0x171
2016-04-27T17:32:03.242449-07:00 NOTICE kernel: [ 2810.607921] [<ffffffff979517fb>] ext4_get_block+0x16/0x18
2016-04-27T17:32:03.242454-07:00 NOTICE kernel: [ 2810.607933] [<ffffffff9791f4c7>] do_mpage_readpage+0x1b1/0x50c
...
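For anyone triaging similar runs, here is a minimal sketch of a log scanner for these events, assuming the syslog format quoted above (the regexes are illustrative, derived from this bug's excerpts, not an official format):

import re
import sys

# Matches "<comm> invoked oom-killer" and "Killed process ..." records in the
# format of the kernel log excerpts above.
INVOKED = re.compile(r'(\S+) invoked oom-killer: gfp_mask=(\S+)')
KILLED = re.compile(r'Killed process (\d+) \((\S+)\) '
                    r'total-vm:(\d+)kB, anon-rss:(\d+)kB, file-rss:(\d+)kB')

def scan(path):
    with open(path, errors='replace') as f:
        for line in f:
            m = INVOKED.search(line)
            if m:
                print('oom-killer invoked by %s (gfp_mask=%s)' % m.groups())
            m = KILLED.search(line)
            if m:
                pid, comm, vm, anon, filerss = m.groups()
                print('killed pid %s (%s): total-vm=%skB anon-rss=%skB '
                      'file-rss=%skB' % (pid, comm, vm, anon, filerss))

if __name__ == '__main__':
    scan(sys.argv[1] if len(sys.argv) > 1 else '/var/log/messages')

Running it over the attached logs lists every oom-killer invocation and victim in order, which makes it easier to line reboots up with memory pressure.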
,
Apr 28 2016
It is currently running a performance test script (longevity_Tracker.py) to collect performance metrics: CPU and memory utilization, and temperature.
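The kind of sampling the tracker does can be approximated with a few /proc reads. A rough sketch follows (this is not the actual longevity_Tracker.py code; temperature is omitted because the thermal-zone paths vary by board):

import time

def mem_used_percent():
    # Parse /proc/meminfo; MemFree is a conservative stand-in for "available".
    info = {}
    with open('/proc/meminfo') as f:
        for line in f:
            key, value = line.split(':')
            info[key] = int(value.split()[0])  # values are in kB
    return 100.0 * (info['MemTotal'] - info['MemFree']) / info['MemTotal']

def cpu_times():
    # First line of /proc/stat holds aggregate jiffies: user nice system idle ...
    fields = [int(v) for v in open('/proc/stat').readline().split()[1:]]
    return sum(fields), fields[3]  # (total, idle)

def sample(interval=60):
    total0, idle0 = cpu_times()
    while True:
        time.sleep(interval)
        total1, idle1 = cpu_times()
        busy = 100.0 * (1 - (idle1 - idle0) / float(total1 - total0))
        print('cpu=%.1f%% mem=%.1f%%' % (busy, mem_used_percent()))
        total0, idle0 = total1, idle1

if __name__ == '__main__':
    sample()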
,
Apr 28 2016
For reference, you can take a look at the test data results:
Last week's data: https://docs.google.com/spreadsheets/d/1s2g4UZOX4TZRnd3CEkxqOqq3DGUMtiknD7yFDJeVg78/edit#gid=1764424973
This week (in progress): https://docs.google.com/spreadsheets/d/1A8ukWyx1_AI_3-eQ5kP6YNODlg_f9xXbo-U_Pa2FERw/edit#gid=0
,
Apr 29 2016
Ninja, running Rise Player, is also running out of memory and is unable to generate logs. Please see attachment.
,
Apr 29 2016
When you say crash, what exactly did you see? The device rebooted? The screen froze? The screen went black? And how consistently does it repro? Do we roughly know how long it runs before crashing?
,
May 2 2016
Issue 602517 has been merged into this issue.
,
May 2 2016
Zako also runs out of memory while running Rise Player.
,
May 2 2016
We run into OOM conditions on multiple devices with different players (Rise, StratosMedia, Chrome Sign Builder, etc.). Chrome, the app, or the video playback might have a memory leak somewhere; a closer look is needed. The problem is probably not kiosk-specific or device-specific. In any case, here are the devices that have seen the problem and been merged into this issue: HP Zako, Veyron-Mickey, AOPEN Sumo.
,
May 2 2016
From #24: all those 'No space left on device' errors seem to imply we are running out of disk space as well?
,
May 3 2016
Those 'No space left on device' errors are all for /tmp, which is a tmpfs and lives in memory. It is actually another incarnation of out of memory.
,
May 3 2016
We are writing perf data to a CSV file in /usr/local/autotest/tmp/. Could this be causing the OOM?
,
May 3 2016
/usr/local is fine, as that is backed by real disk. The error in #24 happens when we run generate_logs, which uses /tmp to dump and create the log tgz file. When the device is in that state, it has run out of memory and /tmp has no free space. I wonder if we have any tool to ask Chrome to dump its heap and see where the memory went.
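Short of a real heap dump, one can at least watch where resident and swapped memory accumulates per process. A minimal illustrative poller over /proc/<pid>/status (VmRSS/VmSwap are standard Linux fields; this is a sketch, not an existing Chrome OS tool):

import os
import time

def chrome_memory():
    # Returns {pid: (VmRSS kB, VmSwap kB)} for processes whose comm is "chrome".
    usage = {}
    for pid in filter(str.isdigit, os.listdir('/proc')):
        try:
            with open('/proc/%s/comm' % pid) as f:
                if f.read().strip() != 'chrome':
                    continue
            rss = swap = 0
            with open('/proc/%s/status' % pid) as f:
                for line in f:
                    if line.startswith('VmRSS:'):
                        rss = int(line.split()[1])
                    elif line.startswith('VmSwap:'):
                        swap = int(line.split()[1])
            usage[int(pid)] = (rss, swap)
        except (IOError, OSError):
            continue  # process exited while we were reading it
    return usage

while True:
    for pid, (rss, swap) in sorted(chrome_memory().items()):
        print('pid=%d rss=%dkB swap=%dkB' % (pid, rss, swap))
    time.sleep(60)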
,
May 5 2016
Any news on this?
,
May 6 2016
Hi Alex, can you please notify Rise that their new content is using up all of the device's memory until the app crashes, and, if possible, get new content for our longevity testing? This may also be happening out in the field. Thanks.
,
May 9 2016
Sumo running Rise Player Google_Testing_Content_Typical1 (W9ZWFXZNRT9D) crashes. Please see logs.
,
May 10 2016
Additional display IDs from Rise (real-world examples):
Google_Content_Testing_ClientContent1 442P4TK6C9RB
Google_Content_Testing_ClientContent2 8A57YUGEEK37
Google_Content_Testing_ClientContent3 5VQ94CAWC26
Rise is asking which display IDs are crashing; can you let me know which ones are causing frequent OOM issues?
,
May 10 2016
We are using UQ7Z4KHGVGG8 and UZ3BPGE55KK7 (high-intensity content) for the longevity test. But crashes also happen when using the low-intensity content (e.g., W9ZWFXZNRT9D).
,
May 11 2016
This issue is also occurring in the M51-Beta build (8172.16.0, 51.0.2704.29). This bug has to be fixed before M51 goes to stable. @xiyuan, please try to reproduce this bug; let me know if you need any help with that.
,
May 13 2016
Latest build where this is happening: M51 8172.17.0, 51.0.2704.30.
,
May 14 2016
I ran the test from JRYVP9V62NQC locally on my peach_pit device and left it running for about 20 hours. Memory usage of the Webview renderer process kept increasing over time. Here are my observations of both the Webview and Browser processes:
[1]: Browser -> 188 MB, Webview -> 842 MB
[2]: Browser -> 189 MB, Webview -> 850 MB
[3]: Browser -> 191 MB, Webview -> 863 MB
[4]: Browser -> 215 MB, Webview -> 953 MB
[5]: Browser -> 215 MB, Webview -> 995 MB
The file descriptor usage of both processes remained stable. When I tried to take a snapshot of the heap, the Webview process crashed because memory usage went beyond 1.2 GB while taking the JS heap snapshot. So I assume that if I had left it running for a few more hours, memory usage would have kept increasing and it would have crashed eventually. Next I tried to repro this apparent memory leak in both desktop Chrome on Linux and a Chrome OS Linux build; there, the issue does not seem to be reproducible! Both have been running since this morning, and the memory usage of the Webview is fluctuating around 500 MB. I plan to leave them both running during the weekend to have a definitive answer. But so far, it looks like there's a memory leak that is not reproducible on Linux builds.
,
May 16 2016
I left three tests running over the weekend:
- Google_Content_Testing_ClientContent1 (442P4TK6C9RB), running on a peach_pit device. Memory usage reached 1132 MB. No crash yet.
- Google_Testing_Content_Typical2 (JRYVP9V62NQC), running on the Chrome OS Linux build. Memory usage reached 1001 MB. No crash yet.
- Google_Testing_Content_Intense1 (UZ3BPGE55KK7), running on stable desktop Chrome on Linux. The license was revoked during the weekend; I restarted the video this morning, and memory usage reached 800 MB within 30 minutes. No crashes.
I have been unable to see any crashes due to OOM yet. How much memory do the test devices have?
,
May 16 2016
This issue was originally about mickey (issue 602517). Mickey rebooted while running Stratosmedia Player 2.0.31. Mickey's info: CHROMEOS_RELEASE_DESCRIPTION=8172.17.0 (Official Build) dev-channel veyron_mickey test. Please see logs.
,
May 16 2016
From the posted logs in #43: there are a bunch of oom-killer events, and every time there's only around 100 MB of free memory. What is interesting is that the killed chrome process (I assume this is the Webview renderer) is not using as much memory as I saw in my own tests; memory runs out because other processes appear to be using a lot of memory at the same time, in particular the following:

[ pid ]  uid  tgid  total_vm    rss  nr_ptes  swapents  oom_score_adj  name
[ 6237] 1000  6237    330748   3609      343     15354          -1000  chrome
[ 6322] 1000  6322    263461     75        8       110          -1000  nacl_helper_boo
[ 6351] 1000  6351    161331  80948      253      5780          -1000  chrome
[ 6679] 1000  6679    555497   4175     1100    248190            300  chrome  <<<--- (the oom-killed process)

How much RAM does this veyron_mickey have?
,
May 16 2016
This is from veyron-mickey:
mickey ~ # cat /proc/meminfo
MemTotal:        2067728 kB
MemFree:          475468 kB
MemAvailable:    1190660 kB
Buffers:           23124 kB
Cached:           801072 kB
SwapCached:            0 kB
Active:           535736 kB
Inactive:         658036 kB
Active(anon):     370252 kB
Inactive(anon):   122084 kB
Active(file):     165484 kB
Inactive(file):   535952 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:       1317804 kB
HighFree:         245860 kB
LowTotal:         749924 kB
LowFree:          229608 kB
SwapTotal:       2019264 kB
SwapFree:        2019264 kB
Dirty:               152 kB
Writeback:             0 kB
AnonPages:        369672 kB
Mapped:           399288 kB
Shmem:            122764 kB
Slab:              47992 kB
SReclaimable:      32824 kB
SUnreclaim:        15168 kB
KernelStack:        2016 kB
PageTables:         4804 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     3053128 kB
Committed_AS:    1445172 kB
VmallocTotal:     245760 kB
VmallocUsed:       15500 kB
VmallocChunk:     218236 kB
,
May 17 2016
That's not surprising. The veyron-mickey has only 2 GB of RAM, which is too low to run those memory-intensive players. By the way, I have been running three different tests since last Friday with no crashes yet; memory usage seems stable around the 1 GB mark. That's already half of what veyron-mickey has! What about the zako device? I suspect it has too little memory as well, but please let me know. I think we should have a minimum requirement of at least 4 GB of RAM for any device that is planned to be used with those players. I'm lowering the priority of this bug, since we couldn't repro locally and there doesn't seem to be a memory leak, but rather not enough resources on the test devices.
,
May 17 2016
Here's tricky:
tricky ~ # cat /proc/meminfo
MemTotal:        1922916 kB
MemFree:          862244 kB
Buffers:          113792 kB
Cached:           561604 kB
SwapCached:            0 kB
Active:           388524 kB
Inactive:         509652 kB
Active(anon):     223336 kB
Inactive(anon):    94376 kB
Active(file):     165188 kB
Inactive(file):   415276 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       2816768 kB
SwapFree:        2816768 kB
Dirty:                72 kB
Writeback:             0 kB
AnonPages:        222836 kB
Mapped:           150064 kB
Shmem:             94944 kB
Slab:             111060 kB
SReclaimable:      95800 kB
SUnreclaim:        15260 kB
KernelStack:        1520 kB
PageTables:         6180 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     3778224 kB
Committed_AS:    1172104 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      366152 kB
VmallocChunk:   34359369788 kB
DirectMap4k:       58892 kB
DirectMap2M:     1978368 kB
DirectMap1G:           0 kB
,
May 17 2016
tricky also has less than 2 GB of RAM, which seems low for what those players need. +semenzato to help understand these kernel memory accountings better. From the messages log that I pasted below, it's very obvious we are really running out of memory (avail RAM = 109492 kB, avail swap 197184 kB); however, the memory usage table of the running processes printed after that does not clearly show where all of that memory had gone. Could you please help clarify?

INFO kernel: [235533.976018] entering low_mem (avail RAM = 109544 kB, avail swap 235252 kB) with lowest seen anon mem: 37984 kB
INFO kernel: [235684.368333] entering low_mem (avail RAM = 109492 kB, avail swap 211952 kB) with lowest seen anon mem: 25004 kB
INFO kernel: [235701.874148] entering low_mem (avail RAM = 109492 kB, avail swap 197184 kB) with lowest seen anon mem: 12548 kB
WARNING kernel: [236260.954292] Chrome_IOThread invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=-1000
INFO kernel: [236260.954319] Chrome_IOThread cpuset=/ mems_allowed=0
NOTICE kernel: [236260.954332] CPU: 2 PID: 6340 Comm: Chrome_IOThread Tainted: G W 3.14.0 #1
NOTICE kernel: [236260.954376] [<c010e47c>] (unwind_backtrace) from [<c010a8a0>] (show_stack+0x20/0x24)
NOTICE kernel: [236260.954396] [<c010a8a0>] (show_stack) from [<c067e834>] (dump_stack+0x7c/0xc0)
NOTICE kernel: [236260.954413] [<c067e834>] (dump_stack) from [<c067e080>] (dump_header.isra.12+0x90/0x1d8)
NOTICE kernel: [236260.954431] [<c067e080>] (dump_header.isra.12) from [<c01d9fdc>] (oom_kill_process+0x84/0x3c4)
NOTICE kernel: [236260.954447] [<c01d9fdc>] (oom_kill_process) from [<c01da7dc>] (out_of_memory+0x298/0x340)
NOTICE kernel: [236260.954461] [<c01da7dc>] (out_of_memory) from [<c01de0d0>] (__alloc_pages_nodemask+0x8c4/0x948)
NOTICE kernel: [236260.954479] [<c01de0d0>] (__alloc_pages_nodemask) from [<c020aa54>] (read_swap_cache_async+0x60/0x1e4)
NOTICE kernel: [236260.954494] [<c020aa54>] (read_swap_cache_async) from [<c020ad64>] (swapin_readahead+0x18c/0x1bc)
NOTICE kernel: [236260.954511] [<c020ad64>] (swapin_readahead) from [<c01fb4d8>] (handle_mm_fault+0x23c/0x800)
NOTICE kernel: [236260.954528] [<c01fb4d8>] (handle_mm_fault) from [<c0115084>] (do_page_fault+0x13c/0x3b0)
NOTICE kernel: [236260.954542] [<c0115084>] (do_page_fault) from [<c01001d8>] (do_DataAbort+0x50/0xcc)
NOTICE kernel: [236260.954556] [<c01001d8>] (do_DataAbort) from [<c010b458>] (__dabt_svc+0x38/0x60)
NOTICE kernel: [236260.954567] Exception stack(0xde4b9e50 to 0xde4b9e98)
NOTICE kernel: [236260.954577] 9e40: 00000001 d8576fc0 00000000 ffffffff
NOTICE kernel: [236260.954589] 9e60: 00000000 de4b9eec de54fb8c b7d67e00 d7114840 de4b9f58 00000000 de4b9edc
NOTICE kernel: [236260.954601] 9e80: 00000019 de4b9e98 c0607154 c0259d08 00000113 ffffffff
NOTICE kernel: [236260.954621] [<c010b458>] (__dabt_svc) from [<c0259d08>] (ep_send_events_proc+0xd4/0x1a0)
NOTICE kernel: [236260.954643] [<c0259d08>] (ep_send_events_proc) from [<c025a438>] (ep_scan_ready_list.isra.8+0xac/0x1dc)
NOTICE kernel: [236260.954661] [<c025a438>] (ep_scan_ready_list.isra.8) from [<c025b6bc>] (SyS_epoll_wait+0x25c/0x35c)
NOTICE kernel: [236260.954677] [<c025b6bc>] (SyS_epoll_wait) from [<c0106460>] (ret_fast_syscall+0x0/0x30)
NOTICE kernel: [236260.954691] Mem-info:
NOTICE kernel: [236260.954698] Normal per-cpu:
NOTICE kernel: [236260.954706] CPU 0: hi: 186, btch: 31 usd: 0
NOTICE kernel: [236260.954714] CPU 1: hi: 186, btch: 31 usd: 0
NOTICE kernel: [236260.954724] CPU 2: hi: 186, btch: 31 usd: 0
NOTICE kernel: [236260.954731] CPU 3: hi: 186, btch: 31 usd: 0
NOTICE kernel: [236260.954739] HighMem per-cpu:
NOTICE kernel: [236260.954746] CPU 0: hi: 186, btch: 31 usd: 0
NOTICE kernel: [236260.954754] CPU 1: hi: 186, btch: 31 usd: 0
NOTICE kernel: [236260.954763] CPU 2: hi: 186, btch: 31 usd: 0
NOTICE kernel: [236260.954770] CPU 3: hi: 186, btch: 31 usd: 0
NOTICE kernel: [236260.954784] active_anon:0 inactive_anon:28 isolated_anon:0
NOTICE kernel: [236260.954784] active_file:8049 inactive_file:10104 isolated_file:0
NOTICE kernel: [236260.954784] unevictable:0 dirty:0 writeback:0 unstable:0
NOTICE kernel: [236260.954784] free:11163 slab_reclaimable:2411 slab_unreclaimable:3840
NOTICE kernel: [236260.954784] mapped:89173 shmem:2 pagetables:2089 bounce:0
NOTICE kernel: [236260.954784] free_cma:0
NOTICE kernel: [236260.954840] Normal free:44220kB min:3460kB low:4324kB high:5188kB active_anon:0kB inactive_anon:48kB active_file:12784kB inactive_file:19100kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:778240kB managed:749924kB mlocked:0kB dirty:0kB writeback:0kB mapped:192476kB shmem:4kB slab_reclaimable:9644kB slab_unreclaimable:15360kB kernel_stack:1952kB pagetables:8356kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:97 all_unreclaimable? yes
NOTICE kernel: [236260.954886] lowmem_reserve[]: 0 10295 10295
NOTICE kernel: [236260.954913] HighMem free:432kB min:512kB low:2032kB high:3552kB active_anon:0kB inactive_anon:64kB active_file:19412kB inactive_file:21316kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1318828kB managed:1317804kB mlocked:0kB dirty:0kB writeback:0kB mapped:164216kB shmem:4kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
NOTICE kernel: [236260.954945] lowmem_reserve[]: 0 0 0
NOTICE kernel: [236260.954958] Normal: 10612*4kB (UEM) 16*8kB (UR) 9*16kB (R) 2*32kB (R) 2*64kB (R) 1*128kB (R) 1*256kB (R) 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 44320kB
NOTICE kernel: [236260.955013] HighMem: 117*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 468kB
NOTICE kernel: [236260.955052] 18180 total pagecache pages
NOTICE kernel: [236260.955060] 5 pages in swap cache
NOTICE kernel: [236260.955067] Swap cache stats: add 73868478, delete 73868473, find 194347/34720597
NOTICE kernel: [236260.955076] Free swap = 163456kB
NOTICE kernel: [236260.955081] Total swap = 2019264kB
NOTICE kernel: [236260.987517] 524267 pages of RAM
NOTICE kernel: [236260.987535] 11752 free pages
NOTICE kernel: [236260.987540] 7335 reserved pages
NOTICE kernel: [236260.987547] 3543 slab pages
NOTICE kernel: [236260.987553] 1148759 pages shared
NOTICE kernel: [236260.987559] 4 pages swap cached
INFO kernel: [236260.987566] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
INFO kernel: [236260.987600] [ 125] 0 125 685 72 4 111 -1000 udevd
INFO kernel: [236260.987617] [ 337] 202 337 9710 129 11 212 -1000 rsyslogd
INFO kernel: [236260.987636] [ 395] 201 395 682 108 5 170 -1000 dbus-daemon
INFO kernel: [236260.987653] [ 446] 0 446 371 57 3 18 -1000 agetty
INFO kernel: [236260.987678] [ 510] 0 510 420 44 3 28 -1000 minijail0
INFO kernel: [236260.987703] [ 514] 219 514 1319 205 5 182 -1000 wpa_supplicant
INFO kernel: [236260.987721] [ 516] 229 516 392 51 3 55 -1000 daisydog
INFO kernel: [236260.987738] [ 752] 0 752 420 44 4 28 -1000 minijail0
INFO kernel: [236260.987757] [ 759] 228 759 4416 326 8 166 -1000 powerd
INFO kernel: [236260.987775] [ 1309] 0 1309 1536 257 6 88 -1000 firewalld
INFO kernel: [236260.987789] [ 1313] 0 1313 420 44 4 28 -1000 minijail0
INFO kernel: [236260.987805] [ 1322] 230 1322 2010 285 6 195 -1000 permission_brok
INFO kernel: [236260.987825] [ 1337] 0 1337 3154 577 8 415 -1000 shill
INFO kernel: [236260.987839] [ 1362] 202 1362 396 49 3 23 -1000 logger
INFO kernel: [236260.987857] [ 1715] 0 1715 365 44 3 49 -1000 periodic_schedu
INFO kernel: [236260.987872] [ 1722] 0 1722 365 50 4 45 -1000 periodic_schedu
INFO kernel: [236260.987887] [ 1738] 0 1738 420 52 3 29 -1000 minijail0
INFO kernel: [236260.987904] [ 1741] 0 1741 365 46 3 46 -1000 periodic_schedu
INFO kernel: [236260.987918] [ 1742] 226 1742 4015 183 6 129 -1000 mtpd
INFO kernel: [236260.987933] [ 1770] 0 1770 420 44 3 28 -1000 minijail0
INFO kernel: [236260.987947] [ 1845] 241 1845 8426 82 9 155 -1000 ModemManager
INFO kernel: [236260.987962] [ 1851] 0 1851 315 0 2 15 -1000 brcm_patchram_p
INFO kernel: [236260.987976] [ 1852] 0 1852 420 44 4 28 -1000 minijail0
INFO kernel: [236260.987993] [ 1855] 0 1855 2161 229 7 134 -1000 metrics_daemon
INFO kernel: [236260.988007] [ 1876] 218 1876 931 122 5 83 -1000 bluetoothd
INFO kernel: [236260.988022] [ 1916] 0 1916 996 49 4 86 -1000 sshd
INFO kernel: [236260.988036] [ 1927] 600 1927 3315 158 5 154 -1000 cras
INFO kernel: [236260.988052] [ 1974] 0 1974 4545 154 9 203 -1000 disks
INFO kernel: [236260.988065] [ 2131] 238 2131 639 160 5 112 -1000 avahi-daemon
INFO kernel: [236260.988081] [ 2132] 238 2132 639 26 4 51 -1000 avahi-daemon
INFO kernel: [236260.988097] [ 2151] 0 2151 1982 211 6 718 -1000 python
INFO kernel: [236260.988112] [ 2306] 0 2306 2347 288 7 153 -1000 update_engine
INFO kernel: [236260.988130] [ 2351] 0 2351 1686 61 7 89 -1000 warn_collector
INFO kernel: [236260.988145] [ 2374] 0 2374 365 37 3 15 -1000 sh
INFO kernel: [236260.988160] [ 2438] 0 2438 420 53 4 29 -1000 minijail0
INFO kernel: [236260.988175] [ 2473] 232 2473 1530 152 6 123 -1000 netfilter-queue
INFO kernel: [236260.988191] [ 2490] 234 2490 871 129 5 82 -1000 tlsdated
INFO kernel: [236260.988208] [ 2491] 0 2491 351 52 3 18 -1000 logger
INFO kernel: [236260.988225] [ 2505] 0 2505 853 14 4 60 -1000 tlsdated-setter
INFO kernel: [236260.988241] [ 2515] 0 2515 365 45 3 48 -1000 periodic_schedu
INFO kernel: [236260.988260] [ 2808] 224 2808 531 143 3 82 -1000 dhcpcd
INFO kernel: [236260.988275] [12235] 207 12235 10956 76 11 163 -1000 tcsd
INFO kernel: [236260.988291] [12238] 223 12238 8990 250 11 181 -1000 chapsd
INFO kernel: [236260.988307] [12247] 0 12247 5825 230 10 308 -1000 cryptohomed
INFO kernel: [236260.988323] [ 6205] 0 6205 2910 311 8 274 -1000 session_manager
INFO kernel: [236260.988337] [ 6223] 0 6223 1906 167 6 91 -1000 debugd
INFO kernel: [236260.988350] [ 6237] 1000 6237 330748 3609 343 15354 -1000 chrome
INFO kernel: [236260.988364] [ 6321] 1000 6321 42163 507 56 1204 -1000 chrome
INFO kernel: [236260.988377] [ 6322] 1000 6322 263461 75 8 110 -1000 nacl_helper_boo
INFO kernel: [236260.988391] [ 6324] 1000 6324 984 0 4 35 -1000 nacl_helper_non
INFO kernel: [236260.988404] [ 6327] 1000 6327 42163 84 29 1215 -1000 chrome
INFO kernel: [236260.988417] [ 6351] 1000 6351 161331 80948 253 5780 -1000 chrome
INFO kernel: [236260.988430] [ 6385] 1000 6385 25760 123 33 1027 -1000 chrome
INFO kernel: [236260.988443] [ 6679] 1000 6679 555497 4175 1100 248190 300 chrome
INFO kernel: [236260.988462] [13089] 0 13089 602 49 3 25 -1000 sleep
INFO kernel: [236260.988475] [13090] 0 13090 602 49 3 25 -1000 sleep
INFO kernel: [236260.988488] [13094] 0 13094 602 49 3 25 -1000 sleep
INFO kernel: [236260.988501] [13098] 0 13098 602 49 3 25 -1000 sleep
ERR kernel: [236260.988513] Out of memory: Kill process 6679 (chrome) score 547 or sacrifice child
ERR kernel: [236260.988530] Killed process 6679 (chrome) total-vm:2221988kB, anon-rss:0kB, file-rss:16700kB
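To sanity-check the accounting, the columns of a dump like the one above can be totaled. A quick sketch (assuming total_vm, rss, and swapents are counted in 4 kB pages, as on this kernel):

import re
import sys

PAGE_KB = 4  # total_vm, rss, and swapents are page counts on this kernel

# Matches rows like:
#   INFO kernel: [...] [ 6679] 1000 6679 555497 4175 1100 248190 300 chrome
ROW = re.compile(r'\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+'
                 r'(\d+)\s+(-?\d+)\s+(\S+)')

def totals(path):
    total_vm = rss = swapents = 0
    for line in open(path, errors='replace'):
        m = ROW.search(line)
        if not m:
            continue
        total_vm += int(m.group(4))
        rss += int(m.group(5))
        swapents += int(m.group(7))
    print('total_vm=%d MB rss=%d MB swapents=%d MB'
          % (total_vm * PAGE_KB // 1024, rss * PAGE_KB // 1024,
             swapents * PAGE_KB // 1024))

totals(sys.argv[1])

Whatever these totals leave unexplained relative to MemTotal has to be somewhere the per-process table does not show, such as tmpfs or kernel allocations.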
,
May 17 2016
Yes, the kernel is really OOM, and it is working correctly by killing the chrome process as shown in #48. Note that the available swap number is from 550 seconds earlier than the OOM kill; however, there was little free swap left then as well. You're right that the memory usage doesn't add up: the total_vm fields add up to about 1.5 GB. Is it possible that the test infrastructure, or something else, is writing large files in /tmp? It's a RAM-based file system, so it could end up using a lot of memory. It would be useful to monitor /proc/meminfo during the run and see if we notice trends.
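A minimal sketch of such a monitor, logging a few /proc/meminfo fields plus the tmpfs usage of /tmp once a minute (the field selection here is just a suggestion):

import os
import time

FIELDS = ('MemFree', 'Cached', 'AnonPages', 'Shmem', 'SwapFree')

def meminfo():
    values = {}
    with open('/proc/meminfo') as f:
        for line in f:
            key, rest = line.split(':', 1)
            if key in FIELDS:
                values[key] = int(rest.split()[0])  # kB
    return values

def tmp_usage_kb():
    # /tmp is tmpfs on Chrome OS, so used blocks here are RAM.
    st = os.statvfs('/tmp')
    return (st.f_blocks - st.f_bfree) * st.f_frsize // 1024

while True:
    snap = meminfo()
    line = ' '.join('%s=%dkB' % (k, snap[k]) for k in FIELDS if k in snap)
    print('%s %s tmp=%dkB' % (time.strftime('%H:%M:%S'), line, tmp_usage_kb()))
    time.sleep(60)

A steadily growing Shmem or tmp value alongside a shrinking SwapFree would point at tmpfs rather than process heaps.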
,
May 17 2016
juanramon@, as in #49, are we writing files in /tmp while the tests are running?
,
May 18 2016
Not anymore; the files are written to the /usr/local/autotest/tmp/ folder as a CSV file. When the files were written to the /tmp folder, they would get deleted if there was an unexpected reboot.
,
May 18 2016
Another possibility is that memory compression doesn't work well with this workload.
Would it be possible to log the content of these files every minute or so:
/sys/block/zram0/{compr_data_size,orig_data_size,mem_used_total,zero_pages}
Actually we just need a snapshot while the workload is running. Normally the compressed data size is about 1/3 of the original data size. If it's a lot worse than that, we have a problem.
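A one-shot snapshot could be taken like this (reading the per-attribute sysfs files named above; newer kernels consolidate these counters into /sys/block/zram0/mm_stat, so treat the paths as an assumption to verify on the DUT):

# Snapshot zram compression stats; a compressed/original ratio well above
# ~1/3 would suggest the workload compresses poorly.
names = ('compr_data_size', 'orig_data_size', 'mem_used_total', 'zero_pages')

stats = {}
for name in names:
    with open('/sys/block/zram0/' + name) as f:
        stats[name] = int(f.read())

for name in names:
    print('%s: %d' % (name, stats[name]))
print('compression ratio: %.2f'
      % (stats['compr_data_size'] / float(stats['orig_data_size'] or 1)))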
,
May 18 2016
Can you ssh into the machines? chromeos1-row2-rack11-host6 or chromeos1-row2-rack5-host2
,
May 18 2016
The IP addresses also work: 172.27.213.26 or 172.27.212.211
,
May 18 2016
Both machines are currently running the content.
,
May 19 2016
I have just ssh'ed into 172.27.213.26. That device is really running low on memory. In particular, the process with pid=9485 is using a lot of memory. Can you please open the task manager on that device (Search+Esc) and let me know the title of the task(s) running in the process with that PID?
,
May 19 2016
Also, there are about 468 MB worth of files in the (backed-by-memory) /tmp/ directory! That's a lot! The question is: who writes those files? The biggest files are:

File                        Size
.com.google.Chrome.JCHSC2   7660186
.com.google.Chrome.VR8DB1   7660186
.com.google.Chrome.b3wNDA   7660186
.com.google.Chrome.n9OTzF   7660186
.com.google.Chrome.tq7WgY   7660186
.com.google.Chrome.N9Kk2T   6350334
.com.google.Chrome.Osv7Tw   6350334
.com.google.Chrome.j5Kck6   6350334
.com.google.Chrome.x7nw1i   6350334
.com.google.Chrome.1Snyse   5717393
.com.google.Chrome.9W6kID   5717393
.com.google.Chrome.Jlepze   5717393
.com.google.Chrome.OQLAAg   5717393
.com.google.Chrome.K8rgZU   5683421
.com.google.Chrome.LUZjGu   5683421
.com.google.Chrome.X8oA92   5683421
.com.google.Chrome.ZjSQIQ   5683421
.com.google.Chrome.ANAaVY   5210085
.com.google.Chrome.Tuwi5u   5210085
.com.google.Chrome.f8qd7H   5210085
.com.google.Chrome.xlbvq7   5210085
.com.google.Chrome.5ISDo2   3508747
.com.google.Chrome.LIGjLz   3508747
.com.google.Chrome.TRFfO1   3508747
.com.google.Chrome.UZ9eiN   3508747
.com.google.Chrome.VKQE6t   3508747
.com.google.Chrome.WEqk7Q   3508747
.com.google.Chrome.lsQgM1   3508747
.com.google.Chrome.ziGrow   3508747
.com.google.Chrome.5Agq8u   3266652
.com.google.Chrome.FTmESA   3266652
.com.google.Chrome.L6rZOH   3266652
.com.google.Chrome.xUx0nc   3266652
.com.google.Chrome.xgzLhj   3266652
.com.google.Chrome.9J1elW   3225784
.com.google.Chrome.GChzfn   3225784
.com.google.Chrome.dShaAX   3225784
.com.google.Chrome.nDTaQt   3225784
.com.google.Chrome.bSTt89   3034029
.com.google.Chrome.hDAw7z   3034029
.com.google.Chrome.pW6ZCM   3034029
.com.google.Chrome.wRCQPC   3034029
.com.google.Chrome.42g4b0   2285843
.com.google.Chrome.LXKqMv   2285843
.com.google.Chrome.lMHnjH   2285843
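For future runs, totaling the Chrome temp files in /tmp gives a quick health check; a small sketch (the glob matches the naming pattern listed above):

import glob
import os

def chrome_tmp_usage(tmpdir='/tmp'):
    # Chrome's POSIX temp files are named .com.google.Chrome.XXXXXX.
    paths = glob.glob(os.path.join(tmpdir, '.com.google.Chrome.*'))
    total = sum(os.path.getsize(p) for p in paths)
    return len(paths), total

count, total = chrome_tmp_usage()
print('%d Chrome temp files, %.1f MB total' % (count, total / (1024.0 * 1024)))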
,
May 19 2016
Aha! #57 is the culprit. Re #55: sorry, I just tried this; the first host is compressing 120 MB into 40 MB, the second host is not compressing, and everything else is fine. I am fairly sure the problem is the large number of files in /tmp.
,
May 19 2016
Does anyone have a clue where these types of files are coming from?
,
May 19 2016
Judging from the names, I believe it's Chrome. See https://code.google.com/p/chromium/codesearch#chromium/src/base/files/file_util_posix.cc&q=CreateAndOpenFdForTemporaryFile&sq=package:chromium&type=cs&l=138 There are many callers of base::CreateTemporaryFile(); it's hard to know which one exactly.
,
May 19 2016
I suspect the files are created from base::CreateTemporaryFile, but there are quite a few callers of it. Can we peek at the contents of the files and see if that gives us some hints?
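Besides peeking at contents, another angle is to see which processes hold these files open by walking the /proc/*/fd symlinks. An illustrative sketch (needs root):

import os

def owners_of_tmp_files():
    # Map each open /tmp/.com.google.Chrome.* file to the PIDs holding it.
    owners = {}
    for pid in filter(str.isdigit, os.listdir('/proc')):
        fd_dir = '/proc/%s/fd' % pid
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # permission denied or process already gone
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target.startswith('/tmp/.com.google.Chrome.'):
                owners.setdefault(target, set()).add(int(pid))
    return owners

for path, pids in sorted(owners_of_tmp_files().items()):
    print('%s -> pids %s' % (path, sorted(pids)))

Files that appear in /tmp but in no process's fd table would instead suggest something wrote them and closed them without unlinking.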
,
May 19 2016
Most of the callers seem to be in test files, though! Where do these digital signage tests reside?
,
May 19 2016
I don't think my longevity_Tracker.py (https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/client/site_tests/longevity_Tracker/longevity_Tracker.py) script is creating them; at least, I don't see any obvious calls that create temporary files. I propose we delete the files, then run the script with no kiosk app and see if the script creates them on its own. Then stop the script, run the kiosk app on its own, and see if the app creates them on its own.
,
May 19 2016
In response to comment 63: I have 100.96.49.93 running the script only, without the app. I'll check the /tmp folder to see if these files creep up.
,
May 20 2016
The zako with IP 172.27.213.26, running Rise Player and the test script, crashed; but before it crashed, it filled the /tmp folder with a few hundred .com.google.Chrome.xxxxxx files, such as:
-rw------- 1 chronos chronos      0 May 19 15:53 .com.google.Chrome.00N5IV
-rw------- 1 chronos chronos 245798 May 19 15:48 .com.google.Chrome.02zbTz
-rw------- 1 chronos chronos 233903 May 19 16:20 .com.google.Chrome.04X0iY
... etc.
By contrast, the Sumo with IP 100.96.49.93, running only the script, does not have any of these files in its /tmp folder.
,
May 20 2016
Try to check the contents of these temp files; maybe that can tell you something about their source.
,
May 20 2016
They are still images used during the show, and they are being stored in the /tmp directory.
,
May 20 2016
Oh, I saw this image while running one of the test videos! OK, so in this case, the player itself is caching these images in /tmp/?
,
May 20 2016
The player should not have access to /tmp directly; it sounds like Chrome is creating those files on behalf of the player. Do you see these files created on the dev box? If so, does closing the player make them go away? And if we can make it happen on a debug build, maybe log the call stack to CreateTemporaryFile?
,
May 20 2016
Yes, I can see the files in the /tmp directory of the dev machine (IP 100.96.49.97), which is not running the python script.
,
May 21 2016
I tested the Rise Player again today. Chrome really creates tons of temp files while this app is running. +rdsmith, who might provide some insights. Here's the trace:

#1 0x2b30547c807b af::FuncMarker::GetStackTrace()
#2 0x2b30547ca809 base::(anonymous namespace)::TempFileName()
#3 0x2b30547ca186 base::(anonymous namespace)::CreateAndOpenFdForTemporaryFile()
#4 0x2b30547ca27d base::CreateAndOpenTemporaryFileInDir()
#5 0x2b30547f816f base::(anonymous namespace)::CreateAnonymousSharedMemory()
#6 0x2b30547f7610 base::SharedMemory::Create()
#7 0x2b30547f9929 base::SharedMemory::CreateAnonymous()
#8 0x2b30547f73af base::SharedMemory::CreateAndMapAnonymous()
#9 0x2b3057fb8231 content::ResourceBuffer::Initialize()
#10 0x2b3057f92151 content::AsyncResourceHandler::EnsureResourceBufferIsInitialized()
#11 0x2b3057f91cf1 content::AsyncResourceHandler::OnWillRead()
#12 0x2b3057fa2078 content::MimeTypeResourceHandler::OnWillRead()
#13 0x2b3057fa0ece content::LayeredResourceHandler::OnWillRead()
#14 0x2b3057fefe3a content::ResourceLoader::ReadMore()
#15 0x2b3057fee06f content::ResourceLoader::StartReading()
#16 0x2b3057fed9f4 content::ResourceLoader::OnResponseStarted()
#17 0x2b305e46864f net::URLRequest::NotifyResponseStarted()
#18 0x2b305e49b6b7 net::URLRequestJob::NotifyHeadersComplete()
#19 0x2b305e4904a6 net::URLRequestHttpJob::NotifyHeadersComplete()
#20 0x2b305e49216e net::URLRequestHttpJob::SaveCookiesAndNotifyHeadersComplete()
#21 0x2b305e48e166 net::URLRequestHttpJob::OnStartCompleted()
#22 0x2b305dd7babb _ZN4base8internal15RunnableAdapterIMN3net18ClientSocketHandleEFviEE3RunIPS3_JiEEEvOT_DpOT0_
#23 0x2b305e49714e _ZN4base8internal12InvokeHelperILb0EvNS0_15RunnableAdapterIMN3net17URLRequestHttpJobEFviEEEE8MakeItSoIJPS4_iEEEvS7_DpOT_
#24 0x2b305e4970fd _ZN4base8internal7InvokerINS_13IndexSequenceIJLm0EEEENS0_9BindStateINS0_15RunnableAdapterIMN3net17URLRequestHttpJobEFviEEEFvPS7_iEJNS0_17UnretainedWrapperIS7_EEEEENS0_12InvokeHelperILb0EvSA_EEFviEE3RunEPNS0_13BindStateBaseEOi
#25 0x2b305dd7b532 base::Callback<>::Run()
#26 0x2b305e079fe2 net::HttpCache::Transaction::DoLoop()
#27 0x2b305e07814b net::HttpCache::Transaction::OnIOComplete()
#28 0x2b305dd7babb _ZN4base8internal15RunnableAdapterIMN3net18ClientSocketHandleEFviEE3RunIPS3_JiEEEvOT_DpOT0_
#29 0x2b305df97f45 _ZN4base8internal12InvokeHelperILb1EvNS0_15RunnableAdapterIMN10disk_cache11SimpleIndexEFviEEEE8MakeItSoINS_7WeakPtrIS4_EEJiEEEvS7_T_DpOT0_
#30 0x2b305e08a64d _ZN4base8internal7InvokerINS_13IndexSequenceIJLm0EEEENS0_9BindStateINS0_15RunnableAdapterIMN3net9HttpCache11TransactionEFviEEEFvPS8_iEJNS_7WeakPtrIS8_EEEEENS0_12InvokeHelperILb1EvSB_EEFviEE3RunEPNS0_13BindStateBaseEOi
#31 0x2b305dd7b532 base::Callback<>::Run()
#32 0x2b305e09f60e net::HttpNetworkTransaction::DoCallback()
#33 0x2b305e09b1a8 net::HttpNetworkTransaction::OnIOComplete()
#34 0x2b305dd7babb _ZN4base8internal15RunnableAdapterIMN3net18ClientSocketHandleEFviEE3RunIPS3_JiEEEvOT_DpOT0_
#35 0x2b305e0a805e _ZN4base8internal12InvokeHelperILb0EvNS0_15RunnableAdapterIMN3net22HttpNetworkTransactionEFviEEEE8MakeItSoIJPS4_iEEEvS7_DpOT_
#36 0x2b305e0a800d _ZN4base8internal7InvokerINS_13IndexSequenceIJLm0EEEENS0_9BindStateINS0_15RunnableAdapterIMN3net22HttpNetworkTransactionEFviEEEFvPS7_iEJNS0_17UnretainedWrapperIS7_EEEEENS0_12InvokeHelperILb0EvSA_EEFviEE3RunEPNS0_13BindStateBaseEOi
#37 0x2b305dd7b532 base::Callback<>::Run()
#38 0x2b305e3d5847 net::SpdyHttpStream::DoResponseCallback()
#39 0x2b305e3d56c1 net::SpdyHttpStream::OnResponseHeadersUpdated()
#40 0x2b305e425af5 net::SpdyStream::MergeWithResponseHeaders()
#41 0x2b305e4255ef net::SpdyStream::OnInitialResponseHeadersReceived()
#42 0x2b305e3f5b87 net::SpdySession::OnInitialResponseHeadersReceived()
#43 0x2b305e3f8a63 net::SpdySession::OnHeaders()
#44 0x2b305e38bc8b net::BufferedSpdyFramer::OnControlFrameHeaderData()
#45 0x2b305e3bda9a net::SpdyFramer::ProcessControlFrameHeaderBlock()
#46 0x2b305e3c59c1 net::SpdyFramer::DeliverHpackBlockAsSpdy3Block()
#47 0x2b305e3bda38 net::SpdyFramer::ProcessControlFrameHeaderBlock()
#48 0x2b305e3b9b22 net::SpdyFramer::ProcessInput()
#49 0x2b305e38ca24 net::BufferedSpdyFramer::ProcessInput()
#50 0x2b305e3ef141 net::SpdySession::DoReadComplete()
#51 0x2b305e3ee706 net::SpdySession::DoReadLoop()
#52 0x2b305e3e8e2d net::SpdySession::PumpReadLoop()
#53 0x2b305e40c230 _ZN4base8internal15RunnableAdapterIMN3net11SpdySessionEFvNS3_9ReadStateEiEE3RunIPS3_JRKS4_iEEEvOT_DpOT0_
#54 0x2b305e40c14a _ZN4base8internal12InvokeHelperILb1EvNS0_15RunnableAdapterIMN3net11SpdySessionEFvNS4_9ReadStateEiEEEE8MakeItSoINS_7WeakPtrIS4_EEJRKS5_iEEEvS8_T_DpOT0_
#55 0x2b305e40c0cd _ZN4base8internal7InvokerINS_13IndexSequenceIJLm0ELm1EEEENS0_9BindStateINS0_15RunnableAdapterIMN3net11SpdySessionEFvNS7_9ReadStateEiEEEFvPS7_S8_iEJNS_7WeakPtrIS7_EES8_EEENS0_12InvokeHelperILb1EvSB_EEFviEE3RunEPNS0_13BindStateBaseEOi
#56 0x2b305dd7b532 base::Callback<>::Run()
#57 0x2b305dd830ab net::SSLClientSocketImpl::DoReadCallback()
#58 0x2b305dd86427 net::SSLClientSocketImpl::OnRecvComplete()
#59 0x2b305dd871ee net::SSLClientSocketImpl::BufferRecvComplete()
#60 0x2b305dd7babb _ZN4base8internal15RunnableAdapterIMN3net18ClientSocketHandleEFviEE3RunIPS3_JiEEEvOT_DpOT0_
#61 0x2b305dd95cce _ZN4base8internal12InvokeHelperILb0EvNS0_15RunnableAdapterIMN3net19SSLClientSocketImplEFviEEEE8MakeItSoIJPS4_iEEEvS7_DpOT_
,
May 23 2016
I'm not sure why you're pinging me; if you let me know what particular expertise you're expecting of me, I might be able to help more :-}. Glancing at the last stack trace, it naively looks like shared memory between the renderer and the browser is being implemented by both processes mmapping a file and scribbling into it. If that file is mmapped from a memory-backed filesystem and the kernel isn't written correctly, you might end up using double the needed memory. But my guess is that that's not what's going on; my guess is that Chrome isn't sizing the shared memory segments for communication properly and/or isn't cleaning them up properly. You should talk to an IPC/memory-allocation expert about that. I've cc'd Erik Chen, since his fingerprints are on the shared memory allocation. My apologies if you wanted my perspective on some other aspect of this bug; feel free to redirect me.
,
May 23 2016
Re #72: Sorry, I should have been more specific. I CC'd you as I saw you're an owner of both dirs where net::URLRequestHttpJob::OnStartCompleted() and content::ResourceBuffer::Initialize() are located. I just wanted to know the need for all these mmaps, and why they aren't being cleaned up, to the point that the app gets killed by the oom killer. Is it a leak in Chrome itself? Or is it that the app is simply using a lot of memory? Thanks for adding Erik Chen!
,
May 24 2016
Another instance of mickey running the Stratosmedia kiosk app unexpectedly rebooting with the M51-Beta build (8172.17.0, 51.0.2704.30).
,
Jun 4 2016
I'm not actively investigating this issue at the moment, so moving back to "Assigned".
,
Jun 14 2016
I believe I am seeing a similar issue in an app I've created: it slowly builds up its memory usage until it finally runs out of memory. I am using an IndexedDB and making many calls in succession to set image src tags in my HTML file. Within a 15- to 30-second timeframe the app can be setting up to 20 different image tags. The tags are all set in the success event of the get call to the database; the image tags are not created dynamically in the DOM. I have one Chromebox with 4 GB of RAM, and the app has gotten as high as 1.9 GB of memory usage before crashing. On other boxes with only 2 GB of RAM, the memory threshold seems to be 700-750 MB of usage before crashing. I see the same memory growth when running the app on a Mac, but it doesn't crash, thanks to the 16 GB of RAM installed. Let me know if I can give you any more info.
,
Jun 15 2016
Passing along to erikchen@ for further triage. The cause of the OOM killing of the renderer has already been identified, as described in comments #57, #58, and #71.
,
Jun 20 2016
afakhry: What version of chrome were you using for #71? The stack that you posted should never occur in Chrome Canary.
,
Jun 20 2016
It was ToT at that time. Probably M52. Sorry, I didn't record the exact version.
,
Jun 20 2016
Oh, this is ChromeOS. I don't know anything about that platform. achuith: Can you find an appropriate owner?
,
Jun 20 2016
Jenny maybe?
,
Jun 23 2016
I've been testing the IndexedDB OOM issue in version 44 of Chrome OS, and the leaking I was seeing in versions 50 and 51 is not there at all. My app maintains the right amount of memory no matter how long it runs.
,
Jun 24 2016
Per #71, it looks like some shared memory files are created but not cleaned up, and they filled up /tmp/ during streaming. It is most likely not a chromeos/kiosk-specific issue, but a more general Chrome issue, possibly with webrtc? It is exposed in kiosk mode for Rise Player since the app gets the chance to continuously play 24x7. I pinged erikchen, but he said he is not the right person for the webrtc code. Albert, do you know who we can ping on the webrtc team for such an issue?
,
Jun 24 2016
tommi@ or brettw@?
,
Jun 24 2016
Could this be related to crbug.com/623175, as suspected by Erik?
,
Jun 24 2016
Seems possible. I see that code was last touched about a year ago, which would explain comment #82's report that this wasn't an issue in M44. BUT looking at the diff, I see that the logic was similar even before that. I suppose the refactor could have regressed something, maybe?
,
Jun 27 2016
crbug.com/623175 points at a possible improvement to the implementation of POSIX shared memory, now that macOS uses a different implementation. The POSIX implementation has effectively remained unchanged for a couple of years, so if this is a recent regression, you'll want to look somewhere else.
,
Jul 7 2016
We have seen the issue as well with the latest stable version of Chrome OS on Chromeboxes, and it's killing our business. We have a Chrome app with a start page of web links, which load websites in a webview. One of the links goes to Sears.com; as you shop and add things to the cart, the memory usage goes up and up. Sometimes some memory drops out and more becomes available as you browse different items and add items to the cart. After prolonged use of maybe 30 minutes to 1 hour of shopping, you end up with the memory full; in the worst case the device crashes to a black screen with a frown, which forces you to turn off the Chromebox and restart it. In better cases it reloads the webview, but that means any items a customer had in the cart are now gone and you have to start from scratch. This is frustrating at best. Some stores are reporting they can't get through the checkout process because memory is dumped before the credit card screen. We are using 2 GB Asus Chromeboxes. We've been using them for about a year and a half, and they were rock solid until just the last month and a half. So while 2 GB is not much memory, it has worked for a year and a half, and something has changed. If we run on double the RAM (4 GB) the crashes go away, but there are 900 of these boxes in stores; it's hard to say all of a sudden that I have to replace them.
,
Jul 7 2016
FYI, here are the system logs for a test app I've created that crashed due to out-of-memory issues. It's a new app, so I do not know if it worked on older OS versions, but it exhibits similar behavior to #88. It runs for a couple of hours and then crashes to a black screen with the frown face. Note this is a simple HTML5 web app; it is not playing back video, but is rendering a rotating 3D earth.
,
Jul 7 2016
Re #89: the log is not very useful. Could you check whether your device has a lot of /tmp/.com.google.Chrome.* files when the OOM happens?
,
Jul 7 2016
I can take this one back now.
,
Jul 7 2016
Issue 333996 has similar symptoms (leaking fd, but instead of OOM, we run out of disk space).
,
Jul 11 2016
It looks like the /tmp files are not leaked from SharedMemory. Instead, I think we are seeing issue 475444. The problem is that xhr keeps a temp file when fetching a 'blob', and those files are kept around until the renderer is closed. The sample page in #16 of issue 475444 can repro the problem in a tab: the backing /tmp files for a blob are kept around until the tab is closed. +dmurph, could you help triage this or issue 475444?
,
Jul 19 2016
Issue 590975 tracks v8 work to trigger GC on memory pressure and would help with this issue.
,
Jul 19 2016
The GC cleans up temp files? I've never heard of that.
,
Jul 19 2016
Let me expand my comment in #94: the blob response of an xhr is handled by RedirectToFileResourceHandler, which uses a temp file to hold the downloaded bytes [1] and passes that to the renderer as a blob. The blob holds a reference to the temp file and is stored as a member of the xhr (m_responseBlob) [2]. The temp file will be kept around until m_responseBlob is released, which happens when the xhr is gone. So if GC does not reclaim an xhr holding a blob, we keep the temp file around indefinitely. [1]: https://cs.chromium.org/chromium/src/content/browser/loader/resource_dispatcher_host_impl.cc?rcl=1468927100&l=1600-1603 [2]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/xmlhttprequest/XMLHttpRequest.cpp?rcl=1468927100&l=1437