Issue 832376

Starred by 1 user

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug


investigate shill memory usage

Reported by semenzato@chromium.org (Project Member), Apr 13 2018

Issue description

After stopping Chrome ("stop ui"), shill is the largest user of RAM.  From a caroline:

run "top", type shift-F, select MEM with arrow keys, press "s" for "sort by that field", then ESC:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
 1843 root      20   0  108504  80336   6120 S   0.0  2.0  22:35.96 shill       
 2142 devbrok+  20   0   33944   7004   3960 S   0.0  0.2   0:00.49 permission+ 
 1872 root      20   0  336136   5020   4472 S   0.0  0.1   0:04.69 cryptohomed 
30580 root      20   0   18332   4880   4276 S   0.0  0.1   0:00.03 sshd        
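
For reference, the same ranking can be produced non-interactively (a sketch, assuming the installed top is procps-ng 3.3 or newer, which supports -o):

localhost ~ # top -b -n 1 -o %MEM | head -n 12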

80 MB of resident set size for shill seems a little high and worth investigating; maybe there is some low-hanging fruit.

On a newly started shill process on samus, RSS is on the order of 11 MB:

# grep ^Vm /proc/`pgrep shill`/status
VmPeak:    32572 kB
VmSize:    32516 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:     11004 kB
VmRSS:     11004 kB
VmData:     2660 kB
VmStk:       132 kB
VmExe:      4600 kB
VmLib:     12128 kB
VmPTE:        76 kB
VmSwap:        0 kB

localhost ~ # pmap `pgrep shill` | grep stack
00007fff87129000    132K rw---   [ stack ]
localhost ~ # pmap `pgrep shill` | grep anon
0000565628634000   2612K rw---   [ anon ]
0000763058acb000     16K rw---   [ anon ]
0000763058ad5000      4K rw---   [ anon ]
00007630592d9000     16K rw---   [ anon ]
000076305968d000     20K rw---   [ anon ]
00007630597a8000     12K rw---   [ anon ]
0000763059ac2000     16K rw---   [ anon ]
0000763059d31000      4K rw---   [ anon ]
0000763059f5f000      4K rw---   [ anon ]
0000763059f65000     12K rw---   [ anon ]
0000763059f6d000      8K rw---   [ anon ]
0000763059f78000      4K rw---   [ anon ]
0000763059f9b000      4K rw---   [ anon ]
000076305a012000      4K rw---   [ anon ]
000076305a148000     12K rw---   [ anon ]
000076305a15d000      4K rw---   [ anon ]
000076305a160000      4K rw---   [ anon ]
00007fff87178000      4K r-x--   [ anon ]
ffffffffff600000      4K r-x--   [ anon ]

If you restart shill, does the number drop dramatically?  Maybe we have a leak.
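
One low-effort way to check for steady growth (a minimal sketch, assuming a root shell on the device; the log path and one-minute interval are arbitrary):

# Sample shill's RSS once a minute; re-resolve the PID on each pass
# in case shill restarts in between.
while true; do
  echo "$(date +%s) $(awk '/^VmRSS/ {print $2}' /proc/$(pgrep shill)/status)"
  sleep 60
done >> /var/tmp/shill_rss.log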

Most of the steady-state "churn" I see in net.log involves wifi services/endpoints being created and destroyed, so that might be the first thing to look at.  Could also be dbus or DHCP related.
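
To see whether that churn lines up with the growth, the relevant D-Bus traffic can be watched directly (a sketch; shill's bus name is org.chromium.flimflam, and this match rule is just one reasonable filter):

localhost ~ # dbus-monitor --system "type='signal',sender='org.chromium.flimflam'"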

Yes, it looks like a leak.  My device had been on for several days, and on the morning after an afternoon restart the numbers are similar to yours.

top:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND    
 1886 root      20   0   38156  17404  12980 S   0.0  0.4   0:41.28 shill       
 2182 devbrok+  20   0   33932  12484   9144 S   0.0  0.3   0:00.13 permission+ 
 1887 root      20   0  336004  12188  10744 S   0.0  0.3   0:00.16 cryptohomed 
 2261 arc-oem+  20   0  179528  10936   9996 S   0.0  0.3   0:00.01 arc-oemcry+ 
 1682 chaps     20   0  178476  10548   9528 S   0.0  0.3   0:00.02 chapsd      

localhost ~ # grep ^Vm /proc/`pgrep shill`/status
VmPeak:	   38164 kB
VmSize:	   38156 kB
VmLck:	       0 kB
VmPin:	       0 kB
VmHWM:	   17404 kB
VmRSS:	   17404 kB
VmData:	    3920 kB
VmStk:	     132 kB
VmExe:	    4780 kB
VmLib:	   12260 kB
VmPTE:	      96 kB
VmSwap:	       0 kB

The stack is the same as before.

localhost ~ # pmap `pgrep shill` | grep anon 
000059e23bdeb000   3684K rw---   [ anon ]    <---maybe we're seeing some leakage already
00007fa8403a0000     16K rw---   [ anon ]
00007fa84074a000     20K rw---   [ anon ]
00007fa840a4d000     84K rw---   [ anon ]
00007fa840d79000     16K rw---   [ anon ]
00007fa840fb5000      4K rw---   [ anon ]
00007fa84104b000     32K rw---   [ anon ]
00007fa841062000     12K rw---   [ anon ]
00007fa841083000      4K rw---   [ anon ]
00007fa841089000      4K rw---   [ anon ]
00007fa84119d000      8K rw---   [ anon ]
00007fa8411a3000      4K rw---   [ anon ]
00007fa84124a000      4K rw---   [ anon ]
00007fa84124d000      4K rw---   [ anon ]
00007ffd049f0000      8K r----   [ anon ]
00007ffd049f2000      8K r-x--   [ anon ]

localhost ~ # restart shill
shill start/running, process 9829
localhost ~ # pmap `pgrep shill` | grep anon
000056e3a7975000   1928K rw---   [ anon ]    <--- suspiciously smaller
00007e7595e11000      8K rw---   [ anon ]
00007e7596a2c000     16K rw---   [ anon ]
00007e7596dd6000     20K rw---   [ anon ]
00007e75970d9000     84K rw---   [ anon ]
00007e7597405000     16K rw---   [ anon ]
00007e7597641000      4K rw---   [ anon ]
00007e75976d7000     32K rw---   [ anon ]
00007e75976ee000     12K rw---   [ anon ]
00007e759770f000      4K rw---   [ anon ]
00007e7597715000      4K rw---   [ anon ]
00007e7597829000      8K rw---   [ anon ]
00007e759782f000      4K rw---   [ anon ]
00007e75978d6000      4K rw---   [ anon ]
00007e75978d9000      4K rw---   [ anon ]
00007ffc9bfc1000      8K r----   [ anon ]
00007ffc9bfc3000      8K r-x--   [ anon ]
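
Rather than eyeballing individual regions, the anon totals can be compared directly (a sketch; pmap's size column is in kB, and awk's numeric coercion drops the trailing "K"):

localhost ~ # pmap `pgrep shill` | awk '/anon/ {sum += $2} END {print sum " kB anon total"}'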

I saw the numbers drift up a little after `restart ui` and re-login, but when I started running it in a loop, shill's memory consumption eventually stabilized.  So the growth is probably caused by something else.
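
A loop along these lines reproduces that experiment (a sketch, not the exact commands used; the iteration count and settle time are assumptions, and re-login itself stays manual):

# Restart the UI repeatedly and record shill's RSS after each cycle
# to see whether it keeps creeping up.
for i in $(seq 1 20); do
  restart ui
  sleep 30
  grep VmRSS /proc/$(pgrep shill)/status
done
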
Labels: Enterprise-Triaged
