leave less garbage in /tmp on chromeos4-devserver5 (possibly other devservers too)
Issue description: On chromeos4-devserver5 there are over 58,000 files in /tmp. Are we keeping those files for a reason? Most of the /tmp entries are directories that look like the ones below (a plain "ls" takes a bit over 1 second, even when cached).
... cros-update01PfYV cros-update01st8N cros-update01zJS1 cros-update0249he cros-update02Awj5 cros-update02d3Ci ...
They go as far back as September 2, which is the date of the last reboot.
,
Oct 13 2016
Adding folks randomly.
,
Oct 13 2016
I'm going to guess that these are products of the new provision code not cleaning up after itself.
,
Oct 13 2016
I was assuming that by filing bugs under this category someone would see them and triage them, but maybe I was wrong? Should I just assign them randomly in the future?
,
Oct 14 2016
These come from provision jobs. Will create a job to delete them regularly.
,
Oct 14 2016
Is it useful to leave them around for a while, or could they be removed at the end of the task that created them?
,
Oct 14 2016
It was designed not to be deleted directly after a provision task finishes, since I thought someone might want to check these logs if they were not properly transferred to the shard/drone. However, it seems the 'collect_au_log' RPC is very stable and never fails. So, after an offline talk with Richard, I will delete it directly after the provision task is finished.
,
Oct 14 2016
Well, you have to wait to delete it until after the collect_au_log RPC is called, right?
,
Oct 14 2016
,
Oct 14 2016
R#8, right
,
Oct 14 2016
Issue 652200 has been merged into this issue.
,
Oct 14 2016
,
Oct 17 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform/dev-util/+/1bbfaba3a6b3487fd8edc000e761f155b2c0a665

commit 1bbfaba3a6b3487fd8edc000e761f155b2c0a665
Author: xixuan <xixuan@chromium.org>
Date: Fri Oct 14 00:53:22 2016

Devserver: delete execute_log file for provision.

Previously, execute_log for provision is preserved in devserver for possible future investigating. However, experience shows that they're barely checked. This CL deletes the provision execute_log after it's transferred back to shard/drone.

BUG= chromium:654953
TEST=Run repair in local autotest with local devserver, to check whether the file is transferred back and also deleted in /tmp/.

Change-Id: I62c6b1371eba5ca9b11c716ec1fcab111ce93efa
Reviewed-on: https://chromium-review.googlesource.com/398423
Commit-Ready: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/1bbfaba3a6b3487fd8edc000e761f155b2c0a665/cros_update_progress.py
[modify] https://crrev.com/1bbfaba3a6b3487fd8edc000e761f155b2c0a665/devserver.py
,
Oct 19 2016
Update: another CL has been prepared to avoid leaving garbage. Also, a script is now running to delete any of this garbage that is more than 2 days old.
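The actual cleanup script is not shown in this thread. As a minimal sketch of the idea only, the following assumes the garbage is limited to directories named `cros-update*` directly under `/tmp` and that directory mtime is an acceptable age signal; the function name `remove_stale_dirs` and its parameters are hypothetical.

```python
import os
import shutil
import time


def remove_stale_dirs(root, prefix="cros-update", max_age_days=2, now=None):
    """Remove directories under root that match prefix and are older than max_age_days.

    Returns the sorted names of the directories that were selected for removal.
    Assumption: age is judged by the directory's own mtime.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400

    # Collect candidates first so nothing is deleted while the directory
    # is still being iterated.
    stale = []
    with os.scandir(root) as entries:
        for entry in entries:
            if not entry.name.startswith(prefix):
                continue
            if not entry.is_dir(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                stale.append((entry.name, entry.path))

    for _, path in stale:
        # ignore_errors: a concurrent provision job may remove it first.
        shutil.rmtree(path, ignore_errors=True)
    return sorted(name for name, _ in stale)
```

Run from cron, something like `remove_stale_dirs("/tmp")` would leave recent directories alone so that logs from in-flight provision jobs are not removed underneath them.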
,
Oct 25 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/platform/dev-util/+/3bc974ea3c230299b95afe764c37a87e6bab071e

commit 3bc974ea3c230299b95afe764c37a87e6bab071e
Author: xixuan <xixuan@chromium.org>
Date: Wed Oct 19 00:21:43 2016

devserver: remove temp directory for storing devserver codes.

Currently, when devserver tries to transfer devserver package, it first copies the codes without some unneccesary files to a temp directory, then transfer the whole package to device. This procedure will leave a temp directory on devserver and won't be deleted after the provision succeeds or fails. This CL helps the devserver to pass the temp directory to the auto_updater, and then delete the directory after provision is finished.

BUG= chromium:654953
TEST=run repair with local autotest and devserver.

Change-Id: I4d0bd4516923a3bd41c455175ca36093e24266c1
Reviewed-on: https://chromium-review.googlesource.com/399989
Commit-Ready: Xixuan Wu <xixuan@chromium.org>
Tested-by: Xixuan Wu <xixuan@chromium.org>
Reviewed-by: Allen Li <ayatane@chromium.org>

[modify] https://crrev.com/3bc974ea3c230299b95afe764c37a87e6bab071e/cros_update_progress.py
[modify] https://crrev.com/3bc974ea3c230299b95afe764c37a87e6bab071e/cros_update.py
[modify] https://crrev.com/3bc974ea3c230299b95afe764c37a87e6bab071e/devserver.py
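The general pattern this commit describes (create a temp directory, hand it to the code doing the work, and remove it whether the work succeeds or fails) can be sketched as follows. This is not the devserver or auto_updater code; `run_with_temp_dir` and the `task` callback are hypothetical names used only for illustration.

```python
import os
import shutil
import tempfile


def run_with_temp_dir(task, prefix="cros-update"):
    """Create a temp directory, pass it to task, and always clean it up.

    task is any callable that takes the directory path (hypothetical
    stand-in for the provision step).
    """
    temp_dir = tempfile.mkdtemp(prefix=prefix)
    try:
        return task(temp_dir)
    finally:
        # Runs on both success and failure, so nothing is left in /tmp.
        shutil.rmtree(temp_dir, ignore_errors=True)
```

The `finally` clause is what covers the "succeeds or fails" case called out in the commit message: the directory is removed even if the task raises.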
,
Nov 11 2016
Please also see related bug 664360.
,
Dec 2 2016
fixed?
,
Dec 2 2016
For this particular bug, yes. For the long-term goal of making a cron job to clean up files that are left behind for miscellaneous reasons, no. We can close this for now and track the long-term goal in bug 664360.
,
Dec 20 2016
I don't think this was fixed as of Dec 7. chromeos2-devserver7 was still showing the same pattern of steady increase in the number of processes, and of disk space use. You can check on viceroy/chromeos. Unfortunately that devserver hardware failed (maybe overheated? :) so we can't check now. If it was fixed but the fix was not pushed to that devserver, feel free to close again, but it may be good to check the other devservers. Thanks!
,
Dec 20 2016
Issue 664360 has been merged into this issue.
,
Dec 20 2016
In fact, I just checked chromeos4-devserver5 and it took 2 minutes and 12 seconds to run "ls /tmp". There are 36,000 entries. The /tmp/cros-updateXXX directories are about 1.5MB each. Many of them are from October. The total size of /tmp is not that big, but the number of entries could be a problem: readdir(2) could be blocking, and directory operations (adding or removing a file) take linear time.
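For checking other devservers, entry counts can be gathered without sorting or stat-ing every entry. The sketch below is an illustration only: the helper name `count_entries_by_prefix` is hypothetical, and grouping by the first 11 characters is an assumption chosen because that is the length of "cros-update".

```python
import os
from collections import Counter


def count_entries_by_prefix(root, prefix_len=11):
    """Count directory entries under root, grouped by the start of their names.

    os.scandir yields entries as the OS returns them, so no sorting and no
    per-entry stat call is needed just to count names.
    """
    counts = Counter()
    with os.scandir(root) as entries:
        for entry in entries:
            counts[entry.name[:prefix_len]] += 1
    return counts
```

Calling `count_entries_by_prefix("/tmp").most_common(5)` would show which name prefixes dominate the directory.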
,
Jan 3 2017
https://chrome-internal-review.googlesource.com/#/c/310135/ has been clearing /tmp/ on every devserver every 12 hours, but one of the devservers still has 36,000 entries in /tmp, and most of them are from October?! I can't check any devserver due to network restrictions, but this is unexpected.
,
Jan 19 2017
,
Jan 31 2017
It turns out that chromeos4-devserver5 had an out-of-date chromeos-admin, which made puppet fail to apply the newest settings to this devserver. This server also had some wrong settings in its chromiumos repo, which blocked 'repo sync' on it, and as a result this server hasn't been updated since December. Both issues have now been fixed manually. We already have a new 'sync_and_run_puppet' cron job that updates chromeos_admin every 4 hours and ignores any local changes, so this won't be a problem any more. The other suspicious devserver, chromeos2-devserver7, is offline. Marking this bug as fixed. Feel free to reopen it if you find more devservers with garbage in /tmp/.
,
Apr 17 2017
,
May 30 2017
,
Aug 1 2017
,
Oct 14 2017
Comment 1 by semenzato@chromium.org, Oct 12 2016