New issue
Advanced search Search tips

Issue 836350 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

/sbin/killers: stop trying to kill 'n/some/path/that/is/not/a/PID'

Project Member Reported by briannorris@chromium.org, Apr 24 2018

Issue description

/sbin/killers has this:

# Parses the output of `lsof -Fn` in the format (p<pid>\n(n<filename>\n)*)*,
# and returns the list of <pid>s followed by a non-empty list of filenames.
# This is an utility function for use in kill_open_file_on().
_lsofFn_filter() {
  # Save all the most recent 'p' line in the buffer.
  # When we get a matching 'n' line, pull the last 'p' line out
  # of the buffer and show it (w/out the leading 'p' char).
  sed -n -E -e '/^p/h' -e "\\%^n($1)%{x;s:^p::;p}" | sort -u
}


I have to admit, parsing the sed-foo is a bit tough, but I'm pretty sure it's not doing what it says. On an example system, I see:

# lsof -Fn /home/ /mnt/stateful_partition/ | sed -n -E -e '/^p/h' -e "\\%^n($1)%{x;s:^p::;p}" | sort -u
2238
2313
n/data/dalvik-cache/arm/system@framework@boot-apache-xml.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-bouncycastle.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-conscrypt.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-core-junit.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-core-libart.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-ext.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-framework.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-ims-common.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-okhttp.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-telephony-common.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot-voip-common.art (stat: No such file or directory)
n/data/dalvik-cache/arm/system@framework@boot.art (stat: No such file or directory)
n/usr/local/bin/python2.7
n/usr/local/lib/libpython2.7.so.1.0
n/usr/local/lib/python2.7/lib-dynload/_collections.so
n/usr/local/lib/python2.7/lib-dynload/_functools.so
n/usr/local/lib/python2.7/lib-dynload/_heapq.so
n/usr/local/lib/python2.7/lib-dynload/_socket.so
n/usr/local/lib/python2.7/lib-dynload/_ssl.so
n/usr/local/lib/python2.7/lib-dynload/_struct.so
n/usr/local/lib/python2.7/lib-dynload/binascii.so
n/usr/local/lib/python2.7/lib-dynload/cPickle.so
n/usr/local/lib/python2.7/lib-dynload/cStringIO.so
n/usr/local/lib/python2.7/lib-dynload/fcntl.so
n/usr/local/lib/python2.7/lib-dynload/itertools.so
n/usr/local/lib/python2.7/lib-dynload/operator.so
n/usr/local/lib/python2.7/lib-dynload/select.so
n/usr/local/lib/python2.7/lib-dynload/time.so

So if you actually watch the output of /sbin/chromeos_shutdown, you'll see error spew from trying to 'kill' path-like things like 'n/usr/local/...'.

I'm not sure if this has any real problem other than making stderr not very pretty. But it seems like something we should fix.
 
Sorry:

# lsof -Fn /home/ /mnt/stateful_partition/ | sed -n -E -e '/^p/h' -e "\\%^n($1)%{x;s:^p::;p}" | sort -u


the "$1" should be "." (the arg that we always pass to _lsofFn_filter()). But the result is the same.

Comment 2 by kinaba@chromium.org, Apr 25 2018

Owner: kinaba@chromium.org
Status: Assigned (was: Untriaged)
ack. thanks for noticing the bad behavior.

I guess the 'x' of {x;s:^p::;p} is suspicious... It swaps the hold space (p####) and the pattern space (n/path/) but probably we should rather copy since we only want to hold the pid line.

I'll give it a try today or tomorrow.

Comment 3 by kinaba@chromium.org, Apr 25 2018

Status: Started (was: Assigned)

Comment 4 by vapier@chromium.org, Apr 25 2018

i think _lsofFn_filter is doing what it's documented to do, it's just not what the callers want.  the comments explicitly say it returns both the pids and the files, but the callers only want pids.  so if the only thing we care about are pids here, it makes for a much easier sed script.
  sed -n '/^p/s:^p::p'

on a somewhat related note, shouldn't all these lsof calls be using -n ?

Comment 5 by kinaba@google.com, Apr 25 2018

> So if the only thing we care about are pids here

unfortunately, this is not the case.
we only want pids (p####) that are followed by at least one n/path that matches the pattern $1, other pids need to be discarded.

> on a somewhat related note, shouldn't all these lsof calls be using -n ?

lsof (-n) enumerates all the processes that is opening a file on the same filesystem as the argument path. Here we however want to restrict the attention to the processes opening files *under* the given path.

(to be more concrete; we don't want `kill_under /home` to kill processing opening /var)

Comment 6 by kinaba@google.com, Apr 25 2018

the comment is hard to parse and confusing. I'll update that too

Comment 7 by vapier@chromium.org, Apr 25 2018

-n disables network lookups and nothing else

Comment 8 by kinaba@google.com, Apr 25 2018

ah, right, I think I misunderstood your comment. Yeah, having -n sounds nice indeed.
> i think _lsofFn_filter is doing what it's documented to do,

Technically, it's not. (But you're closer to the mark than I was.)

"returns the list of <pid>s followed by a non-empty list of filenames"

The strings that being with 'n/...' are not really filenames. They're mangled filenames ;)

> it's just not what the callers want.

Definitely true!
Project Member

Comment 10 by bugdroid1@chromium.org, Apr 25 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/5f0d3034d38216c981758f06915796e43b15b195

commit 5f0d3034d38216c981758f06915796e43b15b195
Author: Kazuhiro Inaba <kinaba@chromium.org>
Date: Wed Apr 25 21:40:56 2018

init: Fix _lsofFn_filter's sed script.

'x' was to pull the buffer *and* set the current line to the buffer.
Hence, for instance, if the output of lsof was

------
p123
n/foo
n/bar
------

it was transformed to

------
123
n/foo
------

because 'x' while processing n/foo replaces the buffer p123 with n/foo.

The new implementation using 'g' (that just pulls the buffer data) will yield

------
123
123
------

which works fine.

BUG= chromium:836350 
TEST=`. /sbin/killers; kill_with_open_files_on /var` won't kill non-PIDs.

Change-Id: Ibc087aff28d96e370d314a58356282be712c6b80
Reviewed-on: https://chromium-review.googlesource.com/1027335
Commit-Ready: Kazuhiro Inaba <kinaba@chromium.org>
Tested-by: Kazuhiro Inaba <kinaba@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/5f0d3034d38216c981758f06915796e43b15b195/init/killers

Project Member

Comment 11 by bugdroid1@chromium.org, Apr 26 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/80136e5e5c07cc4124b1ad009aaa6833486a4634

commit 80136e5e5c07cc4124b1ad009aaa6833486a4634
Author: Kazuhiro Inaba <kinaba@chromium.org>
Date: Thu Apr 26 20:01:34 2018

init: A small fix on "Fix _lsofFn_filter's sed script."

A caller (ui-post-stop) passes an or-pattern like '/home|/data'.
Then ^n$1 is parsed as (^n/home)|(/data) which is not intended.

BUG= chromium:836350 
TEST=process opening /var/data is not killed by ui-post-stop

Change-Id: Id3d28168ae74832c2456d856405a93be299f06f8
Reviewed-on: https://chromium-review.googlesource.com/1029230
Commit-Ready: Kazuhiro Inaba <kinaba@chromium.org>
Tested-by: Kazuhiro Inaba <kinaba@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/80136e5e5c07cc4124b1ad009aaa6833486a4634/init/killers

Status: Fixed (was: Started)
Cc: hidehiko@chromium.org
+hidehiko who saw an issue around lazy unmounting.

Sign in to add a comment