The hard-coded value for min_filelist_kbytes in arc-stop-sysctl.conf is incorrect for some BOARDs |
||||||||
Issue descriptionWhat steps will reproduce the problem? (1) Flash R65-10188.0.0 to eve (2) Remove /etc/init/arc-*-sysctl.conf files from the device so that these scripts won't update min_filelist_kbytes. (3) Reboot the device. (4) ssh to the device, and run localhost ~ # sysctl vm.min_filelist_kbytes What is the expected result? vm.min_filelist_kbytes = 100000 What happens instead? vm.min_filelist_kbytes = 400000 Apparently, eve uses 400000 as the default/initial value for min_filelist_kbytes. However, because arc-stop-sysctl.conf unconditionally run 'sysctl vm.min_filelist_kbytes=100000' on stop-arc-instance, eve's min_filelist_kbytes becomes 100000 once ARC++ is shut down. It's probably time to stop hard-coding '100000' in arc-stop-sysctl.conf? (Not sure if this is a serious issue though. Opt-in users always execute arc-start-sysctl.conf on sign-in (and therefore it's set to 400000.) Opt-out users would execute neither arc-start-sysctl.conf nor arc-stop-sysctl.conf (therefore it remains in 400000.) The only ways I can think of to start the user's session with 100000 is: 1) boot, sign-in as an opt-in user, opt out from ARC++, sign out, sign in again, or 2) boot, sign in as an opt-in user, sign out, sign in as an opt-out user.)
,
Dec 5 2017
Yes, this can be a problem. (On platforms without ARC++ it's not a problem because we set min_filelist_kbytes to 100,000 by default.) So what's a good way of telling at boot if the user has previously opted out?
,
Dec 5 2017
I'm not sure if we can do anything at boot because 'sysctl vm.min_filelist_kbytes=100000' happens on stop-arc-instance (not on boot.) One idea to fix the issue is to consolidate arc-start-sysctl.conf and arc-stop-sysctl.conf into one, and remember the original value as an env value in the 'start' part of the script, and use the env in the 'stop' part later. Justin: what was the reason to have two separate Upstart scripts (-start- and -stop-) for this? I vaguely remember you explained the reason somewhere, but I can't recall.
,
Dec 5 2017
Yusuke: Here's the discussion that caused us to split this into two configs: https://chrome-internal-review.googlesource.com/c/chromeos/cheets-scripts/+/355569/2/arc-sysctl.conf#18 The main reason was to avoid delaying the "started" signal for start-arc-instance.
,
Dec 5 2017
#3 can we simply put the default value back to 100M, and rely on start-arc-instance + continue-arc-boot to change the value to 400M?
,
Dec 5 2017
comment #5 That seems like the best way to make things consistent. With that, when the container is running the value is 400k, otherwise it's 100k, without corner cases. (OT: BTW, now I fully understood what you meant at https://chrome-internal-review.googlesource.com/c/chromeos/autotest-cheets/+/521358/2/client/site_tests/cheets_SysctlTest/cheets_SysctlTest.py#19 . I wasn't aware of https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/540671 and didn't know that the default value right after boot was 400k on ARC-enabled devices. My auto test still detected issue 792152 because cheets_* auto tests brings up the container twice, not once, and runs the test only after the second container start.)
,
Dec 6 2017
Justin, I am giving you first pick. Let me know if you don't have time right now and I'll do it. I think it would be good to fix this ASAP, although a min_filelist_kbytes of 400MB without ARC++ is not as harmful as 100MB *with* ARC++ (which Yusuke has fixed).
,
Dec 6 2017
I'm a little squeezed at least tomorrow and probably the next day. I'd love if you could pick this up, but let me know if you can't, and I might be able to do something next week.
,
Dec 6 2017
Yes I should probably do this anyway as it involves swap.sh. It would be useful to have an answer to my question in #2 if possible: is there a simple way of telling at boot if ARC++ will or will not automatically start for the current session? And, similarly, is there a way of telling if ARC++ is running or not? The long story: swap.sh has a notion of "default value" for various memory manager settings. (Then there are board-specific and device-specific overrides, the former for board-tuning, the latter for experimenting. The overrides are fixed values, but that won't be a problem at least for now.) The default value is currently calculated by a function which returns 400M or 100M depending on whether the device is ARC++ enabled or not. I am wondering if I can change that function to reflect the actual status (i.e. enabled on board, but disabled by user) then this would be an easy fix.
,
Dec 6 2017
comment #9
> is there a simple way of telling at boot if ARC++ will or will not automatically start for the current session?
I don't think there's a way to do so. ARC++ opt-in status, which determines if the fully functional ARC++ container will run or not after sign-in, is stored as a *per-user* (and possibly policy-managed) Chrome preference. The preference is not per-device. Since the preference is in the user's encrypted home, Chrome OS cannot read it during OS boot (also, Chrome OS cannot determine which user's preference will be used at that point anyway.)
So.. can't we always do
min_filelist_default_generator() {
echo 100000
}
in swap.sh? Even with that, arc-start-sysctl.conf will reset it to 400000 when ARC++ is enabled in the user's Chrome preference.
> And, similarly, is there a way of telling if ARC++ is running or not?
During OS boot, ARC++ container is always not running. The very first moment the "mini" container (see below) starts is right after the login screen UI is shown by Chrome.
Some more details: the fully functional ARC++ container only starts after opted-in user's sign-in. Right after the login screen is shown, something called mini container always starts without checking any Chrome preference, but it's just a form of readahead, and mini container neither starts an app nor shows UI. I think we can just ignore mini container when discussing min_filelist_kbytes.
,
Dec 6 2017
#10 yes it all makes sense, thanks---in fact I should have realized it. The problem lies in how the code is organized. I would like the various overrides to apply to the ARC++ case, mainly because they are useful when experimenting. In reality two settings should be exposed: min_filelist_with_arc and min_filelist_without_arc. But I'd prefer not to add an extra setting as we have too many already. (The settings are exposed via crosh/debugd). Even the current settings are (we hope) a temporary solution until we figure out how automate this. I'll figure something out.
,
Dec 6 2017
OK so one more question. (Same question really.)
Is there a simple, reliable way of telling if ARC++ is running?
Because I could do this (btw, I wouldn't):
in arc-start-sysctl.conf:
script
touch /var/lib/swap/arc-is-running
KB="$(/usr/share/cros/init/swap.sh get_target_value min_filelist)"
sysctl vm.min_filelist_kbytes="${KB}"
end script
in arc-stop-sysctl.conf:
script
rm /var/lib/swap/arc-is-running
... same code ...
end script
in swap.conf:
min_filelist_default_generator() {
[ -e /var/lib/arc-is-running ] && echo 400000 || echo 100000
}
but it seems strange that I would need to maintain the arc-is-running state (which can easily become stale) instead of relying on some existing state.
But in any case I think this is the functionality I want.
Thanks!
,
Dec 6 2017
I think I came up with a better idea. In min_filelist_default_generator(), you could check arc-boot-continue's and arc-setup's job status: localhost init # status arc-boot-continue arc-boot-continue start/running localhost init # status arc-setup arc-setup stop/waiting If both are 'stop/waiting', ARC++ (full) container is not running. If either of two is 'start/running', it's running.
,
Dec 6 2017
Ah I like that! Let me update the CLs.
,
Dec 8 2017
The CLs are in the CQ, but just one quick thought. Using the file in /run may have been a useful indirection. Now if we change the arc start/stop signals we have to modify the upstart scripts AND swap.sh. Had we used the file, we would only have to modify the upstart scripts. Always tough choices eh?
,
Dec 8 2017
I understand the concern, but if we add the file, there will be one more interface between Chrome OS and ARC++ that the two teams need to maintain. ARC++ team is currently trying to simplify the interface (e.g. issue 792703 ), and adding a new file is kind of opposite to that. OTOH, arc-boot-continue is an existing job, and the current plan is to keep it for the foreseeable future. This is not a strong preference, but the number of interface between Chrome OS and ARC++ was the key reason why I called comment #13 "better".
,
Dec 9 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/chromeos/cheets-scripts/+/27e948cc8d227cb77679123ba9a91f20c9610707 commit 27e948cc8d227cb77679123ba9a91f20c9610707 Author: Luigi Semenzato <semenzato@chromium.org> Date: Sat Dec 09 04:37:24 2017
,
Dec 11 2017
#16 I was thinking that the agreed signal between ARC++ and Chrome OS would be the start/stop and status for those services. The file could be maintained as an internal state for the memory manager, so yes, it would be in /run/swap rather than /run/arc. But this is good enough.
,
Dec 11 2017
Quick summary for merge request: Without this fix, boards configured with ARC++ will boot with the larger min_filelist value (400 MB) even if ARC++ is disabled. (If the user logs out and back in, the value will be correctly set to 100 MB). The main consequence of this is wasted RAM, potentially up to 300 MB, which will be reserved for "file" pages (mainly code pages). The difference is large enough to be noticeable, hence the merge request.
,
Dec 11 2017
This bug requires manual review: Request affecting a post-stable build Please contact the milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), gkihumba@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Dec 11 2017
Please note that CL 811984 https://chromium-review.googlesource.com/#/c/chromiumos/overlays/chromiumos-overlay/+/811984/ also belongs to this bug. It was submitted with the wrong CL.
,
Dec 11 2017
Sounds like important fix to add in the next M63 refresh +Grace to evaluate/confirm merges
,
Dec 12 2017
Shall we push this then? It's on ToT since Dec. 9 so you should be able to observe the behavior in the canaries. Before Dec. 9 (but after Dec. 5, see issue 791605 ) if you disable ARC++, then reboot, ARC++ remains disabled, but min_filelist_kbytes (from crosh -> swap status) remains incorrectly at 400 MB. With this fix, at boot you should see 400 MB with ARC++ enabled, and 100 MB with ARC++ disabled. For additional verification, log out and log back in, and check that the value does not change. Then switch between a user that has ARC++ enabled and one that has it disabled, and check that the value changes acacordingly. For even further verification, re-enable ARC++ for a user, and check the value. Then reboot, log in, and check the value again. It should all work. Thanks!
,
Dec 12 2017
Your change meets the bar and is auto-approved for M64. Please go ahead and merge the CL to branch 3282 manually. Please contact milestone owner if you have questions. Owners: cmasso@(Android), cmasso@(iOS), kbleicher@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Feb 12 2018
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible! If all merges have been completed, please remove any remaining Merge-Approved labels from this issue. Thanks for your time! To disable nags, add the Disable-Nags label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Feb 16 2018
This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible! If all merges have been completed, please remove any remaining Merge-Approved labels from this issue. Thanks for your time! To disable nags, add the Disable-Nags label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot |
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by yusukes@chromium.org
, Dec 5 2017