crosvm: silently uses 100% CPU while guest is idle
Issue description: Even when the guest is idle, crosvm silently pegs an entire CPU on the host side. Experience says this is likely because a poll loop is polling a socket that has been hung up. Other than the inflated CPU usage, there is no visible sign of the failure, because the poll loop degenerates into busy polling without reporting anything to the log. The first step to smoking out these issues is to make this degenerate case log loudly.
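The degenerate case described above can be reproduced in isolation. The following is a minimal sketch, not crosvm code: it assumes Linux (the `POLLIN`/`POLLHUP` values and the `nfds_t` width are the Linux ones), declares `poll(2)` by hand to avoid external crates, and uses hypothetical helper names (`poll_once`, `count_hangup_wakeups`). Once the peer of a socket is closed, every call to `poll` returns immediately with `POLLHUP` set, so a loop that never removes the descriptor spins.

```rust
use std::os::raw::{c_int, c_short, c_ulong};
use std::os::unix::io::AsRawFd;
use std::os::unix::net::UnixStream;

// Linux values for the poll(2) event bits (assumption: Linux target).
const POLLIN: c_short = 0x001;
const POLLHUP: c_short = 0x010;

#[repr(C)]
struct PollFd {
    fd: c_int,
    events: c_short,
    revents: c_short,
}

extern "C" {
    // poll(2) from the C library; nfds_t is unsigned long on Linux.
    fn poll(fds: *mut PollFd, nfds: c_ulong, timeout: c_int) -> c_int;
}

/// Polls `fd` for readability once and returns the events the kernel reported.
fn poll_once(fd: c_int, timeout_ms: c_int) -> std::io::Result<c_short> {
    let mut pfd = PollFd { fd, events: POLLIN, revents: 0 };
    let ret = unsafe { poll(&mut pfd, 1, timeout_ms) };
    if ret < 0 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(pfd.revents)
}

/// Polls a hung-up socket `iterations` times and counts how many of those
/// polls reported POLLHUP. None of the polls actually waits.
fn count_hangup_wakeups(iterations: usize) -> std::io::Result<usize> {
    let (local, peer) = UnixStream::pair()?;
    drop(peer); // The peer hangs up; `local` stays in the poll set.

    let mut wakeups = 0;
    for _ in 0..iterations {
        if poll_once(local.as_raw_fd(), 1000)? & POLLHUP != 0 {
            wakeups += 1;
        }
    }
    Ok(wakeups)
}

fn main() -> std::io::Result<()> {
    let wakeups = count_hangup_wakeups(1000)?;
    if wakeups > 0 {
        // The kind of noise the issue asks for: report instead of spinning silently.
        eprintln!("warning: {} polls returned POLLHUP on a hung-up socket", wakeups);
    }
    Ok(())
}
```

A real loop would close or deregister the descriptor on the first hangup; the counter here only makes the spinning visible.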
,
Mar 8 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/25c6bc137ecfd1f608993a3d2f24c5736c5c186a commit 25c6bc137ecfd1f608993a3d2f24c5736c5c186a Author: Zach Reizner <zachr@google.com> Date: Thu Mar 08 00:54:46 2018 sys_util: custom derive for PollToken Using an enum implementing PollToken is the recommended way to use PollContext, but writing the trait impls for each enum is mechanical yet error prone. This is a perfect candidate for a custom derive, which automates away the process using a simple derive attribute on an enum. BUG= chromium:816692 TEST=cargo test -p sys_util Change-Id: If21d0f94f9af4b4f6cef1f24c78fc36b50471053 Reviewed-on: https://chromium-review.googlesource.com/940865 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Chirantan Ekbote <chirantan@chromium.org> [add] https://crrev.com/25c6bc137ecfd1f608993a3d2f24c5736c5c186a/sys_util/poll_token_derive/tests.rs [modify] https://crrev.com/25c6bc137ecfd1f608993a3d2f24c5736c5c186a/sys_util/src/lib.rs [add] https://crrev.com/25c6bc137ecfd1f608993a3d2f24c5736c5c186a/sys_util/poll_token_derive/Cargo.toml [modify] https://crrev.com/25c6bc137ecfd1f608993a3d2f24c5736c5c186a/Cargo.lock [add] https://crrev.com/25c6bc137ecfd1f608993a3d2f24c5736c5c186a/sys_util/poll_token_derive/poll_token_derive.rs [modify] https://crrev.com/25c6bc137ecfd1f608993a3d2f24c5736c5c186a/sys_util/Cargo.toml [modify] https://crrev.com/25c6bc137ecfd1f608993a3d2f24c5736c5c186a/sys_util/src/poll.rs
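To illustrate why these impls are "mechanical yet error prone", here is a hand-written version of the kind of code a derive could generate. This is a sketch under stated assumptions: the trait below is a local stand-in with a plausible shape (an enum converted to and from a `u64`), and the bit layout is invented for illustration; neither is taken from the sys_util source.

```rust
/// Local stand-in for a token trait (assumed shape, not the sys_util definition).
trait PollToken: Sized {
    fn as_raw_token(&self) -> u64;
    fn from_raw_token(data: u64) -> Self;
}

/// Hypothetical token enum for a poll loop.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Token {
    Exit,
    Stdin,
    Socket(u32),
}

// Hand-written impl: the variant index goes in the low byte and the payload
// above it. Every new variant must be added in both match arms, in sync,
// which is exactly the kind of bookkeeping a derive removes.
impl PollToken for Token {
    fn as_raw_token(&self) -> u64 {
        match *self {
            Token::Exit => 0,
            Token::Stdin => 1,
            Token::Socket(index) => 2 | (u64::from(index) << 8),
        }
    }

    fn from_raw_token(data: u64) -> Self {
        match data & 0xff {
            0 => Token::Exit,
            1 => Token::Stdin,
            2 => Token::Socket((data >> 8) as u32),
            other => panic!("unknown token variant {}", other),
        }
    }
}

fn main() {
    let token = Token::Socket(7);
    let raw = token.as_raw_token();
    println!("{:?} <-> {:#x}", Token::from_raw_token(raw), raw);
}
```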
,
Mar 9 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/d604dbbab4d3acbf9b3184e991c121505b517f5d commit d604dbbab4d3acbf9b3184e991c121505b517f5d Author: Zach Reizner <zachr@google.com> Date: Fri Mar 09 03:28:52 2018 crosvm/plugin: refactor poll loop to use PollContext This change simplifies plugin processing by removing the awkward run_until_started loop. This also switches to use PollContext instead of the Poller/Pollable interface, which required reallocating a Vec every loop to satisfy the borrow checker. TEST=cargo test --features plugin BUG= chromium:816692 Change-Id: Iedf26a32840a9a038205c4be8d1adb2f1b565a5c Reviewed-on: https://chromium-review.googlesource.com/938653 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Stephen Barber <smbarber@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/d604dbbab4d3acbf9b3184e991c121505b517f5d/src/plugin/process.rs [modify] https://crrev.com/d604dbbab4d3acbf9b3184e991c121505b517f5d/src/plugin/mod.rs [modify] https://crrev.com/d604dbbab4d3acbf9b3184e991c121505b517f5d/sys_util/src/signalfd.rs
,
Mar 24 2018
to be clear, this isn't resolved yet, right? i'm still seeing crosvm eat 100% of one cpu no matter what i do. (1) run top. see no crosvm running. (2) in a new crosh, run `vmc start v`. (3) go back to top. see crosvm steady at ~25% cpu usage. (4) exit the automatic vsh session. crosvm is still at ~25% cpu usage. (5) run `vmc stop v`. see crosvm exit.
,
Mar 26 2018
I need to finish refactoring all the poll loops before we can be sure this isn't caused by socket hangups inducing busy waiting.
,
Mar 28 2018
An interesting wrinkle is that the old Poller interface specifically filtered out all non-POLLIN events, meaning busy loops are all but assured to happen on POLLHUP: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/62a4063aa6c28d1f73e93fd0e7da2135d4d46d02/sys_util/src/poll.rs#143
,
Mar 30 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/1028f53ed2bcdc3088f73f59f268f3e99d5c06c9 commit 1028f53ed2bcdc3088f73f59f268f3e99d5c06c9 Author: Zach Reizner <zachr@google.com> Date: Fri Mar 30 04:59:45 2018 sys_util: have Poller return token on POLLHUP If POLLHUP is filtered out of the returned tokens, the caller of Poller::poll will likely just put the same (token, fd) in the next call to poll which will return instantly. This degrades into a busy poll loop without the chance for the caller to change the poll list. Instead, this change changes the filter to return tokens on POLLHUP so that the caller will hopefully notice the FD associated with the token has been hungup and will close it. BUG= chromium:816692 TEST=None Change-Id: Ie36d8a647a5fd7faabfd57a562205f75c77991e7 Reviewed-on: https://chromium-review.googlesource.com/985616 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Stephen Barber <smbarber@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/1028f53ed2bcdc3088f73f59f268f3e99d5c06c9/sys_util/src/poll.rs
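The effect of the filter can be shown on plain data. This sketch is not the Poller code: the function names and the `(token, revents)` representation are assumptions made for illustration, and the event bit values are the Linux ones. With a POLLIN-only filter, a descriptor that reports only POLLHUP never reaches the caller, so the caller has no chance to drop it from the next poll; including POLLHUP in the filter surfaces its token.

```rust
// Linux values for the poll(2) event bits (assumption: Linux target).
const POLLIN: i16 = 0x001;
const POLLHUP: i16 = 0x010;

/// Filter that hides hangups: only tokens whose fd is readable are returned.
fn tokens_pollin_only(results: &[(u32, i16)]) -> Vec<u32> {
    results
        .iter()
        .filter(|&&(_, revents)| revents & POLLIN != 0)
        .map(|&(token, _)| token)
        .collect()
}

/// Filter that also returns tokens whose fd was hung up, so the caller
/// can notice the hangup and close the fd.
fn tokens_with_hangup(results: &[(u32, i16)]) -> Vec<u32> {
    results
        .iter()
        .filter(|&&(_, revents)| revents & (POLLIN | POLLHUP) != 0)
        .map(|&(token, _)| token)
        .collect()
}

fn main() {
    // Token 1 is readable, token 2 was hung up, token 3 had no events.
    let results = [(1, POLLIN), (2, POLLHUP), (3, 0)];
    println!("POLLIN only:  {:?}", tokens_pollin_only(&results));
    println!("with POLLHUP: {:?}", tokens_with_hangup(&results));
}
```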
,
Apr 5 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/f96be03cad1a133c24de3a4bc29f2df9161641c3 commit f96be03cad1a133c24de3a4bc29f2df9161641c3 Author: Zach Reizner <zachr@google.com> Date: Thu Apr 05 05:53:22 2018 devices: block: use PollContext in block device Switching to PollContext so that there is one less user of Poller, which will be removed. TEST=run any vm with a block device BUG= chromium:816692 Change-Id: I2e1301ea9d66012262f1fcb69eaeee9f7464f3b3 Reviewed-on: https://chromium-review.googlesource.com/983036 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Chirantan Ekbote <chirantan@chromium.org> [modify] https://crrev.com/f96be03cad1a133c24de3a4bc29f2df9161641c3/seccomp/aarch64/block_device.policy [modify] https://crrev.com/f96be03cad1a133c24de3a4bc29f2df9161641c3/seccomp/x86_64/block_device.policy [modify] https://crrev.com/f96be03cad1a133c24de3a4bc29f2df9161641c3/devices/src/virtio/block.rs
,
Apr 5 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/5bed0d2ffa79bc3f2542e11da5ad5cf134f73e96 commit 5bed0d2ffa79bc3f2542e11da5ad5cf134f73e96 Author: Zach Reizner <zachr@google.com> Date: Thu Apr 05 05:53:27 2018 crosvm/linux: switch to using PollContext in control loop This avoids the pitfalls of Poller, which required dynamic allocation on every loop for the dynamically added Pollables. Using PollContext also makes busy poll loops less silent. TEST=run a linux vm BUG= chromium:816692 Change-Id: If44e47bcbbd7c889399f957ad5bcca66eca57b8e Reviewed-on: https://chromium-review.googlesource.com/983038 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/5bed0d2ffa79bc3f2542e11da5ad5cf134f73e96/src/linux.rs
,
Apr 5 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/fc62c45dabfb72b5fd2e43831fd4caab40e61592 commit fc62c45dabfb72b5fd2e43831fd4caab40e61592 Author: Zach Reizner <zachr@google.com> Date: Thu Apr 05 22:20:42 2018 devices: use PollContext for all virtio deivces BUG= chromium:816692 TEST=run any VM Change-Id: I4219050fdb7947ca513f599f1ac57cde6052d397 Reviewed-on: https://chromium-review.googlesource.com/996917 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Stephen Barber <smbarber@chromium.org> [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/aarch64/rng_device.policy [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/x86_64/balloon_device.policy [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/aarch64/net_device.policy [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/x86_64/rng_device.policy [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/aarch64/vhost_net_device.policy [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/devices/src/virtio/vhost/worker.rs [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/devices/src/virtio/vhost/mod.rs [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/x86_64/vhost_net_device.policy [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/devices/src/virtio/net.rs [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/devices/src/virtio/rng.rs [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/devices/src/virtio/balloon.rs [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/aarch64/vhost_vsock_device.policy [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/aarch64/balloon_device.policy [modify] 
https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/x86_64/vhost_vsock_device.policy [modify] https://crrev.com/fc62c45dabfb72b5fd2e43831fd4caab40e61592/seccomp/x86_64/net_device.policy
,
Apr 7 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/c1b74eb8b1d123a940cabefc7be864cf33d74d00 commit c1b74eb8b1d123a940cabefc7be864cf33d74d00 Author: Zach Reizner <zachr@google.com> Date: Sat Apr 07 02:50:32 2018 sys_util: add method for copying PollEvents Making a copy of PollEvents is useful to drop the PollEvents structure which borrows from a PollContext. Even though immutably borrowing from a PollContext does not prevent any operations on a PollContext, it does prevent mutable method calls on any structure that owns PollContext. TEST=None BUG= chromium:816692 Change-Id: I9527fd5c122a703933deb973ad549b792226e4c6 Reviewed-on: https://chromium-review.googlesource.com/1000101 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/c1b74eb8b1d123a940cabefc7be864cf33d74d00/sys_util/src/poll.rs
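The borrowing problem in this commit message can be sketched with stand-in types. Everything below (`Context`, `Events`, `Device`, `copy_tokens`) is hypothetical and only mirrors the shape of the situation: an events value borrows from the context, the context is owned by a larger structure, and handling an event needs `&mut self` on that structure. Copying the events out ends the borrow before the mutable calls begin.

```rust
/// Stand-in for a poll context that owns the list of ready tokens.
struct Context {
    ready: Vec<u32>,
}

/// Stand-in for an events value that borrows from the context.
struct Events<'a> {
    tokens: &'a [u32],
}

impl Context {
    fn wait<'a>(&'a self) -> Events<'a> {
        Events { tokens: &self.ready }
    }
}

impl<'a> Events<'a> {
    /// Copies the tokens so the borrow of the context can end.
    fn copy_tokens(&self) -> Vec<u32> {
        self.tokens.to_vec()
    }
}

/// A structure that owns the context and mutates itself per event.
struct Device {
    ctx: Context,
    handled: usize,
}

impl Device {
    fn handle(&mut self, _token: u32) {
        self.handled += 1;
    }

    fn run_once(&mut self) {
        // Iterating `self.ctx.wait().tokens` directly would keep `self.ctx`
        // borrowed while `self.handle` needs `&mut self`, which the borrow
        // checker rejects. The copy is owned, so the borrow ends on this line.
        let tokens = self.ctx.wait().copy_tokens();
        for token in tokens {
            self.handle(token);
        }
    }
}

fn main() {
    let mut device = Device { ctx: Context { ready: vec![1, 2, 3] }, handled: 0 };
    device.run_once();
    println!("handled {} events", device.handled);
}
```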
,
Apr 7 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/d86e698ec800db139edee03a45140078850abfad commit d86e698ec800db139edee03a45140078850abfad Author: Zach Reizner <zachr@google.com> Date: Sat Apr 07 02:50:33 2018 devices: use nested PollContext in wayland device The wl device was the last user of the old Poller. BUG= chromium:816692 TEST=run wayland under crosvm Change-Id: I6c1c1db2774a6e783b7bd1109288328d75ad2223 Reviewed-on: https://chromium-review.googlesource.com/1000102 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/d86e698ec800db139edee03a45140078850abfad/devices/src/virtio/wl.rs [modify] https://crrev.com/d86e698ec800db139edee03a45140078850abfad/seccomp/x86_64/wl_device.policy
,
Apr 7 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/4fcd1af11ead83e11104e952a15582b6fc064d5b commit 4fcd1af11ead83e11104e952a15582b6fc064d5b Author: Zach Reizner <zachr@google.com> Date: Sat Apr 07 02:50:33 2018 sys_util: remove deprecated Poller/Pollable interface Now that there are no users of that interface, we should remove it. TEST=./build_test BUG= chromium:816692 Change-Id: Ifdbde22984f557b945e49559ba47076e99db923b Reviewed-on: https://chromium-review.googlesource.com/1000103 Commit-Ready: Zach Reizner <zachr@chromium.org> Tested-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/4fcd1af11ead83e11104e952a15582b6fc064d5b/sys_util/src/terminal.rs [modify] https://crrev.com/4fcd1af11ead83e11104e952a15582b6fc064d5b/sys_util/src/signalfd.rs [modify] https://crrev.com/4fcd1af11ead83e11104e952a15582b6fc064d5b/net_util/src/lib.rs [modify] https://crrev.com/4fcd1af11ead83e11104e952a15582b6fc064d5b/sys_util/src/eventfd.rs [modify] https://crrev.com/4fcd1af11ead83e11104e952a15582b6fc064d5b/sys_util/src/poll.rs
,
Apr 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/7a7268faf0a43c79b6a4520f5c2f35c3e0233932 commit 7a7268faf0a43c79b6a4520f5c2f35c3e0233932 Author: Sonny Rao <sonnyrao@chromium.org> Date: Thu Apr 12 01:08:32 2018 crosvm: aarch64: add epoll syscalls to seccomp policy for wayland Match the configuration for x86_64 BUG= chromium:816692 TEST=run wayland under crosvm on kevin Change-Id: If21bccddba362656fc02b213b9f30166f2c4be13 Reviewed-on: https://chromium-review.googlesource.com/1006488 Commit-Ready: Sonny Rao <sonnyrao@chromium.org> Tested-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Zach Reizner <zachr@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/7a7268faf0a43c79b6a4520f5c2f35c3e0233932/seccomp/aarch64/wl_device.policy
,
Apr 27 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/platform/crosvm/+/dafdbc01cbdd318180b6f28dbc6d05b60cf443b7 commit dafdbc01cbdd318180b6f28dbc6d05b60cf443b7 Author: Sonny Rao <sonnyrao@chromium.org> Date: Fri Apr 27 04:10:10 2018 crosvm: aarch64: fix seccomp entry for ftruncate on aarch64 Aarch64 seems to use ftruncate64 rather than ftruncate. BUG= chromium:816692 TEST=run VM on kevin using concierge Change-Id: I944f52d75fb9f5a3aaf5fe9e85708c48f249bb1a Reviewed-on: https://chromium-review.googlesource.com/1031175 Commit-Ready: Sonny Rao <sonnyrao@chromium.org> Tested-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Stephen Barber <smbarber@chromium.org> Reviewed-by: Dylan Reid <dgreid@chromium.org> [modify] https://crrev.com/dafdbc01cbdd318180b6f28dbc6d05b60cf443b7/seccomp/aarch64/block_device.policy
,
May 14 2018
<triage>zachr, could you give an update on this?</triage>
,
May 14 2018
,
May 15 2018
This should be fixed, as I haven't seen reports in a while. Re-open if somebody else observes this behavior.
Comment 1 by bugdroid1@chromium.org
, Mar 8 2018