autotest: reboot machine when /tmp becomes full
Issue description
A disk qual failed because /tmp became full but the machine was not rebooted.
chrome-os-partner:48483
gs://chromeos-moblab-wistron/results/00:50:b6:59:4b:ff/a32f17fa-83b4-11e6-9efc-0050b6594bff/199-moblab
10/09 19:04:23.544 WARNI| abstract_ssh:0443| trying scp, rsync failed: Command <rsync -L --timeout=1800 --rsh='/usr/bin/ssh -a -x -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o Serve>
* Command:.
rsync -L --timeout=1800 --rsh='/usr/bin/ssh -a -x -o
StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes
-o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3
-o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g
"/tmp/tmplcnVmZ" "root@192.168.231.110:"/tmp/sysinfo/autoserv-
r2GqUc/global_config.ini""
Exit status: 11
Duration: 0.145009994507
stderr:
rsync: write failed on "/tmp/sysinfo/autoserv-r2GqUc/global_config.ini": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(393) [receiver=3.1.1]
10/09 19:04:23.548 DEBUG| abstract_ssh:0446| Trying scp.
...
As part of the cleanup process, when a test fails because scp to the DUT's /tmp fails, should we try to reboot the DUT?
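For illustration, the failure in the log above could be recognized from the copy command's stderr. This is only a sketch: `is_out_of_space` is a hypothetical helper, not an existing autotest function, and it assumes the stderr text is available as a string.

```python
# Hypothetical helper, not part of autotest: detect an out-of-space
# failure from the stderr of a failed rsync/scp to the DUT.
ENOSPC_MESSAGE = "No space left on device"


def is_out_of_space(stderr):
    """Return True if stderr reports that the target filesystem is full."""
    return ENOSPC_MESSAGE in stderr
```

A cleanup step could use a check like this to decide whether a reboot is worth attempting, rather than rebooting after every copy failure.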
Oct 26 2016
I believe the issue here is /tmp filled up _on the DUT_. And, IIUC, that means the fix is to add a verifier that checks "is there space in /tmp?". When it fails, it should trigger reboot as a repair action. gwendal@ - can you confirm?
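A rough sketch of the verify-then-repair idea, under stated assumptions: the names `tmp_has_space` and `verify_or_repair` are made up for illustration and are not the autotest verifier API, the 10 MiB threshold is arbitrary, and the check here inspects the local filesystem, whereas a real verifier would have to run the check on the DUT (for example over ssh).

```python
import shutil


def tmp_has_space(path="/tmp", min_free_bytes=10 * 1024 * 1024,
                  usage_fn=shutil.disk_usage):
    """Hypothetical check: is at least min_free_bytes free at path?

    usage_fn is injectable so the check can be exercised without a
    full disk; by default it queries the local filesystem.
    """
    return usage_fn(path).free >= min_free_bytes


def verify_or_repair(check, reboot):
    """Run check(); if it fails, call reboot() once and check again.

    Returns "ok" if no repair was needed, "repaired" if the reboot
    fixed the problem, and "failed" otherwise.
    """
    if check():
        return "ok"
    reboot()
    return "repaired" if check() else "failed"
```

The reboot is attempted only once, so a DUT whose /tmp fills up again immediately is reported as failed instead of looping.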
Oct 26 2016
I note also that we already have a need for a common verifier that checks for various "out of space" conditions; see bug 596131. If we're going to address this problem with a verifier, we should make sure to fix that bug at the same time.
Comment 1 by sbasi@chromium.org, Oct 26 2016