deploy chrome w/--mount-dir leads to dbus config desync at next boot |
|||||
Issue descriptionMike mentioned on https://chromium-review.googlesource.com/c/chromium/src/+/662088 that he was seeing Chrome crash on startup with this error: [25718:25718:0920/132305.061223:FATAL:cros_dbus_service.cc(76)] Failed to own: org.chromium.VirtualFileRequestService He'd used deploy_chrome to push a ToT build (and /opt/google/chrome/dbus/org.chromium.VirtualFileRequestService.conf) to his dev system, and deploy_chrome had (correctly) rebooted the device to get dbus-daemon pick up the updated permissions. There have been many other reports of the same problem when other services were added ( issue 767107 , issue 766876 , issue 751267, issue 707317 , issue 732876 ). I SSH-ed in and verified that org.chromium.VirtualFileRequestService.conf contained the expected data: <!DOCTYPE busconfig PUBLIC "-//freedesktop//DTD D-BUS Bus Configuration 1.0//EN" "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd"> <!-- Copyright 2017 The Chromium Authors. All rights reserved. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file. --> <busconfig> <policy user="chronos"> <allow own="org.chromium.VirtualFileRequestService"/> </policy> <policy user="virtual-file-provider"> <allow send_destination="org.chromium.VirtualFileRequestService" send_interface="org.chromium.VirtualFileRequestService"/> </policy> </busconfig I was also able to repro the problem using python-dbus as the chronos user: >>> import dbus >>> import dbus.service >>> name = dbus.service.BusName('org.chromium.DisplayService', dbus.SystemBus()) >>> name = dbus.service.BusName('org.chromium.VirtualFileRequestService', dbus.SystemBus()) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/local/lib64/python2.7/site-packages/dbus/service.py", line 131, in __new__ retval = bus.request_name(name, name_flags) File "/usr/local/lib64/python2.7/site-packages/dbus/bus.py", line 303, in request_name 'su', (name, flags)) File "/usr/local/lib64/python2.7/site-packages/dbus/connection.py", line 651, in call_blocking message, timeout) dbus.exceptions.DBusException: org.freedesktop.DBus.Error.AccessDenied: Connection ":1.356" is not allowed to own the service "org.chromium.VirtualFileRequestService" due to security policies in the configuration file In other words, I was able to take ownership of org.chromium.DisplayService (a name permitted by a file in /opt/google/chrome/dbus that had been around for a while, but not of org.chromium.VirtualFileRequestService (a newer file that had just been added). Bizarrely, after running "killall -HUP dbus-daemon" as root, the call to own VirtualFileRequestService succeeded (without even needing to reconnect to the bus). I suspect that either: a) dbus-daemon is caching some part of its configs across reboots, but reloads it on SIGHUP, or b) dbus-daemon is ignoring the new config at boot for some reason but accepting it on SIGHUP. The dbus-daemon man page says this: SIGHUP will cause the D-Bus daemon to PARTIALLY reload its configura‐ tion file and to flush its user/group information caches. Some configu‐ ration changes would require kicking all apps off the bus; so they will only take effect if you restart the daemon. Policy changes should take effect with SIGHUP. Regarding b), I strace-ed dbus-daemon while sending another SIGHUP to it and saw it writing a bunch of errors about missing users, including the virtual-file-provider user referenced in the new config: ... open("/opt/google/chrome/dbus/org.chromium.VirtualFileRequestService.conf", O_RDONLY) = 36 fstat(36, {st_mode=S_IFREG|0644, st_size=629, ...}) = 0 read(36, "<!DOCTYPE busconfig PUBLIC \"-//f"..., 629) = 629 close(36) = 0 open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 36 fstat(36, {st_mode=S_IFREG|0644, st_size=3831, ...}) = 0 read(36, "messagebus:!:201:201:dbus-daemon"..., 4096) = 3831 read(36, "", 4096) = 0 close(36) = 0 write(2, "Unknown username \"virtual-file-p"..., 75) = 75 ... It appears to just write these messages to stderr; I suspect we drop them. :-/ Adding <syslog/> to our config may help there. I'll spend some more time looking at the dbus-daemon source to try to figure out what's going on.
,
Sep 20 2017
Is it possible that virtual-file-provider is being created at boot but after dbus starts? Does restarting twice fix this issue?
,
Sep 20 2017
Is there a workaround for right now until you fix this issue?
,
Sep 20 2017
#2: /etc/passwd is on the readonly partition, so no, I don't think so. Not sure about restarting twice. #3: Try "killall -HUP dbus-daemon" as root, as described above. You should be able to SSH to your device as root ~30 seconds after it gives up on the UI job. (First time through, it may reboot automatically.)
,
Sep 20 2017
derat@: When do I run that command? Right before I run deploy_chrome?
,
Sep 20 2017
I tried to repro this locally by doing the following: a) Comment out the "<allow own="org.chromium.KioskAppService"/>" line in /opt/google/chrome/dbus/org.chromium.KioskAppService.conf. b) Reboot; Chrome aborts on startup as expected. c) Uncomment the line and reboot. Chrome happily started up, though, implying that dbus-daemon loaded the updated config.
,
Sep 20 2017
#5: After you run deploy_chrome and see the problem. (Run it on the Chrome OS device.)
,
Sep 20 2017
Issue 767107 has been merged into this issue.
,
Sep 20 2017
If SIGHUP works for others, I'm interested in knowing that! I'm still poking around in the dbus-daemon source, but as a quick fix, maybe we could update deploy_chrome to send SIGHUP instead of or in addition to rebooting.
,
Sep 21 2017
I'm not at all familiar with the dbus-daemon codebase, but here are some initial notes (https://cgit.freedesktop.org/dbus/dbus/, tag dbus-1.10.12): main calls bus_context_new, which calls (among other things) bus_config_load, process_config_first_time_only, process_config_every_time, and process_config_postinit. SIGHUP (and also ReloadConfig D-Bus calls) call bus_context_reload_config, which calls _dbus_flush_caches, bus_config_load, process_config_every_time, and process_config_postinit. I don't see anything interesting in process_config_first_time_only. _dbus_flush_caches is maybe marginally more interesting, but it looks like it only clears some in-memory hash tables that are presumably empty when the process starts. I'm not aware of any data that's cached on-disk. It looks like dbus-daemon also contains support for using inotify to watch config files and/or dirs. That may be complicating things, but I don't know if we're using it, or whether it's used to reload policies vs. activation files or something else. If someone finds a way to reliably reproduce this issue, it'd be good to figure out whether it was just a fluke that SIGHUP cleared this up on Mike's machine or if it's actually doing something different from rebooting. Maybe config files get reloaded in an arbitrary order each time, for instance. I don't see anything that could be blocking the chronos "allow own" directives if it's loaded after it, though. Our D-Bus configs are pretty ugly and it's quite easy to accidentally break unrelated services. I see a bunch of unnecessary deny directives, for instance. Resist the temptation to append --syslog-only to the dbus-daemon command in /etc/init/dbus.conf. I think that it was only added in a later version of D-Bus than what we're using, and if you add it dbus-daemon will fail to start and your system will reboot continually until you recover from USB.
,
Sep 21 2017
Just FYI, I hit the same exact error a second time after rebuilding and redeploying. It was just once amid several redeploys, and "killall -HUP dbus-daemon" helped.
,
Sep 21 2017
If/when you see it again, and if it's still easy for you to test, would you mind doing a bit more experimentation to check how reliable the SIGHUP is for fixing it? For example, after it first shows up: a) Does the problem persist even after you manually reboot a few times? b) If you're able to fix it with SIGHUP, does it stay fixed if you reboot again after that? I still have no idea why SIGHUP would fix things if rebooting doesn't. :-/
,
Sep 21 2017
I may have had to hard reset once as well, when the ui wouldn't start... I'll definitely ping you and try to answer those qs if I see it again.
,
Sep 21 2017
IIUC dbus-deamon is already outputting the log to /var/log/messages like this: 2017-09-21T17:15:26.918257+09:00 NOTICE dbus[565]: [system] Reloaded configuration The SIGHUP quick fix suggested in comment #9 sounds good, provided this issue only affects developers who use deploy_chrome. Otherwise, we can add code to deploy_chrome which detects if chrome is not running after reboot, and print a message which encourages the user to paste relevant logs to this bug.
,
Sep 21 2017
#14: It logs that message, but I believe it doesn't log much else. The NEWS file says that dbus-daemon was fixed to send warnings to syslog in 1.11.6; we're on 1.10.12. I'll spend some more time trying to repro this today.
,
Sep 22 2017
FWIW dbus-daemon's busconfig supports <syslog> element, and /usr/share/dbus-1/system.conf already includes it. https://dbus.freedesktop.org/doc/dbus-daemon.1.html Also, you can avoid bricking your device by removing "reboot" from the post-stop part of /etc/init/dbus.conf when tweaking dbus-daemon's config.
,
Sep 25 2017
Mike just mentioned that he's been passing "--target-dir=/usr/local/chrome --mount-dir=/opt/google/chrome" to deploy_chrome, which might explain the problem. deploy_chrome displays a warning in that case: https://cs.corp.google.com/chromeos_public/chromite/scripts/deploy_chrome.py?q=%22Detected+change+in+D-Bus+service+files%22&sq=package:%5Echromeos_(internal%7Cpublic)$&l=287 "WARNING: Detected change in D-Bus service files, but target dir is not /opt/google/chrome. D-Bus changes will not be picked up by dbus-daemon at boot time." dbus-daemon is only configured to look in /opt/google/chrome/dbus, so it's not going to pick up new configs after rebooting if they're deployed elsewhere. I didn't think to check if /opt/google/chrome was bind-mounted when I investigated his system, but if it was, that'd probably explain why the files looked older than the boot time but still hadn't been loaded. Maybe we should make deploy_chrome always copy configs to the real /opt/google/chrome/dbus directory, even if it's going to bind-mount over it later. We could make it reboot the system like normal in that case, although the developer may need to still run deploy_chrome again to redo the mount.
,
Sep 25 2017
Even simpler, maybe we can just make deploy_chrome send SIGHUP to dbus-daemon instead of rebooting. Assuming that dbus-daemon reopens /opt/google/chrome/dbus, that'd hopefully be enough to get it to pick up the new configs.
,
Sep 25 2017
,
Sep 26 2017
probably should do both since they shouldn't be too hard to pull
,
Sep 29 2017
Uploaded https://crrev.com/c/691356 to switch to SIGHUP. I think we actually don't want to overwrite the original files from /opt/google/chrome/dbus when we're using a bind mount, since that would cause issues after rebooting if the new, deployed policy is more restrictive than the old one.
,
Sep 29 2017
now that you say that out loud, that makes a lot of sense
,
Sep 29 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/0fce5bf64aadeadac2d66b4cb64c81176c020fa3 commit 0fce5bf64aadeadac2d66b4cb64c81176c020fa3 Author: Daniel Erat <derat@chromium.org> Date: Fri Sep 29 06:26:27 2017 chromite: Make deploy_chrome send SIGHUP to dbus-daemon. Send SIGHUP to dbus-daemon to tell it to reload the policies from /opt/google/chrome/dbus. This handles the case where a bind mount has been used to deploy an unstripped Chrome binary, and it also falls back gracefully to the original policies when the system is later rebooted. Also removed the previous code to reboot the system when policies change. BUG= chromium:767216 TEST=added a new service name to one of chrome's d-bus policy files, ran deploy_chrome.py with --target-dir and --mount-dir, and checked that i was immediately able to take ownership of the name as chronos using python-dbus. rebooted and was no longer able to take ownership of it Change-Id: Iaf0ccbd79c27d9d607a2cd980dc4f13c6fec4acd Reviewed-on: https://chromium-review.googlesource.com/691356 Commit-Ready: Dan Erat <derat@chromium.org> Tested-by: Dan Erat <derat@chromium.org> Reviewed-by: Justin TerAvest <teravest@chromium.org> [modify] https://crrev.com/0fce5bf64aadeadac2d66b4cb64c81176c020fa3/scripts/deploy_chrome.py
,
Sep 29 2017
chromite DEPS roll is at https://crrev.com/c/693318.
,
Oct 1 2017
|
|||||
►
Sign in to add a comment |
|||||
Comment 1 by derat@chromium.org
, Sep 20 2017