Implement balloon device policy
Issue description
We currently over-subscribe memory when handing it out to VMs. We should dynamically take memory away from VMs and give it back via the balloon driver in response to memory conditions on the host.
,
Jul 24
,
Jul 25
,
Jul 27
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform2/+/354a7aff3e225cdd867fbc5912841210e00cf2e1

commit 354a7aff3e225cdd867fbc5912841210e00cf2e1
Author: Chirantan Ekbote <chirantan@chromium.org>
Date: Fri Jul 27 19:12:49 2018

vm_tools: concierge: Bind mount /dev/chromeos-low-mem

Bind mount in /dev/chromeos-low-mem so that crosvm can use it to manage the size of the balloon device.

BUG=chromium:866193
TEST=check that the device node is accessible inside concierge's mount namespace

Change-Id: I4cd9c942d0835b8e544fa004858fd2beffe364f7
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1152214
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/354a7aff3e225cdd867fbc5912841210e00cf2e1/vm_tools/init/vm_concierge.conf
,
Jul 27
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform/crosvm/+/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5

commit 448516e3f985dd13fb5cd16f2c9efbcf097f9fa5
Author: Chirantan Ekbote <chirantan@chromium.org>
Date: Fri Jul 27 22:29:07 2018

balloon: Implement device policy

Implement a policy for the balloon device so that it starts taking memory away from the VM when the system is under low memory conditions. There are a few pieces here:

* Change the madvise call in MemoryMapping::dont_need_range to use MADV_REMOVE instead of MADV_DONTNEED. The latter does nothing when the memory mapping is shared across multiple processes, while the former immediately gives the pages in the specified range back to the kernel. Subsequent accesses to memory in that range return zero pages.
* Change the protocol between the balloon device process and the main crosvm process. Previously, the device process expected the main process to send it increments in the amount of memory consumed by the balloon device. Now, it instead just expects the absolute value of the memory that should be consumed. To properly implement the policy the main process needs to keep track of the total memory consumed by the balloon device, so this makes it easier to handle all the policy in one place.
* Add a policy for dealing with low memory situations. When the VM starts up, we determine the maximum amount of memory that the balloon device should consume:
  * If the VM has more than 1.5GB of memory, the balloon device max is the size of the VM memory minus 1GB.
  * Otherwise, if the VM has at least 500MB, the balloon device max is 50% of the size of the VM memory.
  * Otherwise, the max is 0.

  The increment used to change the size of the balloon is defined as 1/16 of the max memory that the balloon device will consume.

When the crosvm main process detects that the system is low on memory, it immediately increases the balloon size by the increment (unless it has already reached the max). It then starts 2 timers: one to check for low memory conditions again in 1 second (+ jitter) and another to check if the system is no longer low on memory in 1 minute (+ jitter) with a subsequent interval of 30 seconds (+ jitter).

Under persistent low memory conditions the balloon device will consume the maximum memory after 16 seconds. Once there is enough available memory the balloon size will shrink back down to 0 after at most 9 minutes.

BUG=chromium:866193
TEST=manual
Start 2 VMs and write out a large file (size > system RAM) in each. Observe /sys/kernel/mm/chromeos-low_mem/available and see that the available memory steadily decreases until it goes under the low memory margin, at which point the available memory bounces back up as crosvm frees up pages.

CQ-DEPEND=CL:1152214
Change-Id: I2046729683aa081c9d7ed039d902ad11737c1d52
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1149155
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/seccomp/x86_64/balloon_device.policy
[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/src/main.rs
[add] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/sys_util/src/timerfd.rs
[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/sys_util/src/mmap.rs
[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/src/linux.rs
[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/sys_util/src/guest_memory.rs
[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/sys_util/src/lib.rs
[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/devices/src/virtio/balloon.rs
[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/Cargo.toml
[modify] https://crrev.com/448516e3f985dd13fb5cd16f2c9efbcf097f9fa5/seccomp/arm/balloon_device.policy
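
For readers following the sizing rules in the commit message above, here is a minimal sketch of how the balloon maximum and the 1/16 increment might be computed. It is not the code from the CL: the function names (`balloon_max`, `balloon_increment`, `grow`) are hypothetical, and it assumes binary units (1GB = 2^30 bytes, 1MB = 2^20 bytes), which the commit message does not specify. `grow` returns an absolute target size, in line with the protocol change described above.

```rust
// Assumption: binary units. The commit message only says "GB" and "MB".
const MB: u64 = 1 << 20;
const GB: u64 = 1 << 30;

/// Maximum number of bytes the balloon may consume for a VM with
/// `mem_size` bytes of memory, following the three rules in the commit.
fn balloon_max(mem_size: u64) -> u64 {
    if mem_size > 3 * GB / 2 {
        // More than 1.5GB: leave the guest 1GB.
        mem_size - GB
    } else if mem_size >= 500 * MB {
        // At least 500MB: the balloon may take half.
        mem_size / 2
    } else {
        // Too small to balloon at all.
        0
    }
}

/// Step size used when growing or shrinking the balloon.
fn balloon_increment(max: u64) -> u64 {
    max / 16
}

/// New absolute balloon size after one low-memory event, clamped to `max`.
fn grow(current: u64, max: u64) -> u64 {
    std::cmp::min(current.saturating_add(balloon_increment(max)), max)
}
```

Under these assumptions a 4GB VM gets a 3GB balloon maximum and a 192MB increment, so sixteen consecutive low-memory events take the balloon from 0 to its maximum.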
,
Jul 27
Merge requested for the changes in #4 and #5.
,
Jul 27
,
Jul 28
Your change meets the bar and is auto-approved for M69. Please go ahead and merge the CL to branch 3497 manually. Please contact milestone owner if you have questions. Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 30
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform2/+/df1d52d718ed8abd3e877902dcefc41c23dac410

commit df1d52d718ed8abd3e877902dcefc41c23dac410
Author: Chirantan Ekbote <chirantan@chromium.org>
Date: Mon Jul 30 22:39:51 2018

vm_tools: concierge: Bind mount /dev/chromeos-low-mem

Bind mount in /dev/chromeos-low-mem so that crosvm can use it to manage the size of the balloon device.

BUG=chromium:866193
TEST=check that the device node is accessible inside concierge's mount namespace

Change-Id: I4cd9c942d0835b8e544fa004858fd2beffe364f7
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1152214
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 354a7aff3e225cdd867fbc5912841210e00cf2e1)
Reviewed-on: https://chromium-review.googlesource.com/1155431

[modify] https://crrev.com/df1d52d718ed8abd3e877902dcefc41c23dac410/vm_tools/init/vm_concierge.conf
,
Jul 30
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform/crosvm/+/bae7078a825bb3c18ea770eb1c3135a230d0f270

commit bae7078a825bb3c18ea770eb1c3135a230d0f270
Author: Chirantan Ekbote <chirantan@chromium.org>
Date: Mon Jul 30 22:47:40 2018

balloon: Implement device policy

Implement a policy for the balloon device so that it starts taking memory away from the VM when the system is under low memory conditions. There are a few pieces here:

* Change the madvise call in MemoryMapping::dont_need_range to use MADV_REMOVE instead of MADV_DONTNEED. The latter does nothing when the memory mapping is shared across multiple processes, while the former immediately gives the pages in the specified range back to the kernel. Subsequent accesses to memory in that range return zero pages.
* Change the protocol between the balloon device process and the main crosvm process. Previously, the device process expected the main process to send it increments in the amount of memory consumed by the balloon device. Now, it instead just expects the absolute value of the memory that should be consumed. To properly implement the policy the main process needs to keep track of the total memory consumed by the balloon device, so this makes it easier to handle all the policy in one place.
* Add a policy for dealing with low memory situations. When the VM starts up, we determine the maximum amount of memory that the balloon device should consume:
  * If the VM has more than 1.5GB of memory, the balloon device max is the size of the VM memory minus 1GB.
  * Otherwise, if the VM has at least 500MB, the balloon device max is 50% of the size of the VM memory.
  * Otherwise, the max is 0.

  The increment used to change the size of the balloon is defined as 1/16 of the max memory that the balloon device will consume.

When the crosvm main process detects that the system is low on memory, it immediately increases the balloon size by the increment (unless it has already reached the max). It then starts 2 timers: one to check for low memory conditions again in 1 second (+ jitter) and another to check if the system is no longer low on memory in 1 minute (+ jitter) with a subsequent interval of 30 seconds (+ jitter).

Under persistent low memory conditions the balloon device will consume the maximum memory after 16 seconds. Once there is enough available memory the balloon size will shrink back down to 0 after at most 9 minutes.

BUG=chromium:866193
TEST=manual
Start 2 VMs and write out a large file (size > system RAM) in each. Observe /sys/kernel/mm/chromeos-low_mem/available and see that the available memory steadily decreases until it goes under the low memory margin, at which point the available memory bounces back up as crosvm frees up pages.

CQ-DEPEND=CL:1152214
Change-Id: I2046729683aa081c9d7ed039d902ad11737c1d52
Signed-off-by: Chirantan Ekbote <chirantan@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1149155
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
(cherry picked from commit 448516e3f985dd13fb5cd16f2c9efbcf097f9fa5)
Reviewed-on: https://chromium-review.googlesource.com/1155822

[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/seccomp/x86_64/balloon_device.policy
[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/src/main.rs
[add] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/sys_util/src/timerfd.rs
[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/sys_util/src/mmap.rs
[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/src/linux.rs
[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/sys_util/src/guest_memory.rs
[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/sys_util/src/lib.rs
[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/devices/src/virtio/balloon.rs
[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/Cargo.toml
[modify] https://crrev.com/bae7078a825bb3c18ea770eb1c3135a230d0f270/seccomp/aarch64/balloon_device.policy
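
The timing figures quoted in the commit message (16 seconds to fill, at most 9 minutes to empty) follow from the 1/16 increment and the timer intervals. The sketch below redoes that arithmetic with jitter ignored. The helper names are hypothetical, and the model is an assumption rather than the CL's code: it treats each inflate step as costing one interval, which gives an upper bound because the first increment is applied immediately.

```rust
/// Number of increments between an empty and a full balloon (max / 16 each).
const STEPS: u64 = 16;

/// Upper bound, in seconds, on the time to inflate from 0 to max when the
/// host stays low on memory and is rechecked every `interval` seconds.
/// Jitter is ignored; the first step is immediate, so the real time may be
/// one interval shorter.
fn inflate_secs(steps: u64, interval: u64) -> u64 {
    steps * interval
}

/// Time, in seconds, to deflate from max to 0 once memory is available:
/// the first check fires after `first` seconds, and each later check
/// after a further `interval` seconds. Jitter is ignored.
fn deflate_secs(steps: u64, first: u64, interval: u64) -> u64 {
    if steps == 0 {
        return 0;
    }
    first + (steps - 1) * interval
}
```

With the commit's intervals, `inflate_secs(STEPS, 1)` gives 16 seconds and `deflate_secs(STEPS, 60, 30)` gives 60 + 15 × 30 = 510 seconds, or 8.5 minutes. That leaves about 30 seconds of headroom for jitter under the stated 9-minute bound.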
,
Jul 30
,
Jul 30
Marking this fixed. We'll use a new bug to track improvements to the balloon policy.
Comment 1 by sonnyrao@chromium.org
, Jul 21