generic_RebootTest Flake
Issue description

A number of test runs are hitting a flake in the generic reboot test. What's strange is that this is the only test experiencing it. I will track the instances here.

http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=99575830
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/99575830-chromeos-test/chromeos6-row2-rack11-host7/debug

02/03 17:04:16.230 DEBUG| abstract_ssh:0744| Restarting master ssh connection
02/03 17:04:54.694 ERROR| metrics:0429| Caught exception while flushing: No module named pyasn1.codec.ber
02/03 17:05:05.483 WARNI| base_utils:0912| run process timeout (49) fired on: /usr/bin/ssh -a -x -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos6-row2-rack11-host7 "export LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\" \"server[stack::wait_up|is_up|ssh_ping] -> ssh_run(true)\";fi; true"
02/03 17:05:07.503 DEBUG| abstract_ssh:0599| Host chromeos6-row2-rack11-host7 is still down after waiting 343 seconds
02/03 17:05:07.504 INFO | server_job:0183| ABORT ---- reboot.verify timestamp=1486170307 localtime=Feb 03 17:05:07 Host did not return from reboot
02/03 17:05:07.506 INFO | server_job:0183| END FAIL ---- reboot timestamp=1486170307 localtime=Feb 03 17:05:07 Host did not return from reboot
Traceback (most recent call last):
  File "/usr/local/autotest/server/server_job.py", line 937, in run_op
    op_func()
  File "/usr/local/autotest/server/hosts/remote.py", line 150, in reboot
    **dargs)
  File "/usr/local/autotest/server/hosts/remote.py", line 219, in wait_for_restart
    self.log_op(self.OP_REBOOT, op_func)
  File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 548, in log_op
    op_func()
  File "/usr/local/autotest/server/hosts/remote.py", line 218, in op_func
    super(RemoteHost, self).wait_for_restart(timeout=timeout, **dargs)
  File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 309, in wait_for_restart
    raise error.AutoservRebootError("Host did not return from reboot")
AutoservRebootError: Host did not return from reboot

What is interesting is this error: "Caught exception while flushing: No module named pyasn1.codec.ber".

I'll add more instances of this failure as I go through the failed CQ runs.
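For going through further failed runs, a small script could pull the relevant signals out of each debug log. The following is only a rough sketch: the line format is inferred from the excerpt above, and the names summarize, LOG_LINE, STILL_DOWN and SAMPLE are hypothetical helpers, not part of autotest.

```python
import re

# Line layout inferred from the log excerpt in this report (an assumption):
# "MM/DD HH:MM:SS.mmm LEVEL| module:lineno| message"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>\w+)\s*\|\s*(?P<module>\w+):(?P<lineno>\d+)\|\s*(?P<msg>.*)$"
)
STILL_DOWN = re.compile(
    r"Host (?P<host>\S+) is still down after waiting (?P<secs>\d+) seconds"
)
PYASN1_ERROR = "No module named pyasn1.codec.ber"
REBOOT_FAILURE = "Host did not return from reboot"


def summarize(log_text):
    """Collect the reboot-flake signals seen in one debug log (hypothetical helper)."""
    summary = {
        "host": None,
        "waited_secs": None,
        "pyasn1_error": False,
        "reboot_failed": False,
    }
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if not match:
            continue  # traceback lines and blanks do not follow the log layout
        msg = match.group("msg")
        down = STILL_DOWN.search(msg)
        if down:
            summary["host"] = down.group("host")
            summary["waited_secs"] = int(down.group("secs"))
        if PYASN1_ERROR in msg:
            summary["pyasn1_error"] = True
        if REBOOT_FAILURE in msg:
            summary["reboot_failed"] = True
    return summary


# Three lines copied from the log excerpt in this report.
SAMPLE = """\
02/03 17:04:54.694 ERROR| metrics:0429| Caught exception while flushing: No module named pyasn1.codec.ber
02/03 17:05:07.503 DEBUG| abstract_ssh:0599| Host chromeos6-row2-rack11-host7 is still down after waiting 343 seconds
02/03 17:05:07.504 INFO | server_job:0183| ABORT ---- reboot.verify timestamp=1486170307 localtime=Feb 03 17:05:07 Host did not return from reboot
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Running it over each failed run's debug log would show whether the pyasn1 error appears in every instance of the reboot failure or only in some of them.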
Mar 1 2017
This test keeps failing once in a while. Please fix it or move it to bvt-perbuild; test flakes are killing the PFQ.

This just happened here: https://uberchromegw.corp.google.com/i/chromeos/builders/daisy_skate-chrome-pfq/builds/3551/steps/HWTest%20%5Bbvt-cq%5D/logs/stdio

It has happened in other places too: https://bugs.chromium.org/p/chromium/issues/list?can=2&q=generic_RebootTest&colspec=ID+Pri+M+Stars+ReleaseBlock+Component+Status+Owner+Summary+OS+Modified&x=m&y=releaseblock&cells=ids
May 11 2017
Dec 15 2017
Another example: https://luci-milo.appspot.com/buildbot/chromeos/sentry-paladin/1794

12/15 04:47:12.283 DEBUG| abstract_ssh:0819| Restarting master ssh connection
12/15 04:47:12.284 DEBUG| ssh_multiplex:0118| Nuking ssh master_job
12/15 04:47:12.284 DEBUG| ssh_multiplex:0123| Cleaning ssh master_tempdir
12/15 04:47:12.284 INFO | ssh_multiplex:0092| Starting master ssh connection '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_mvottLssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row8-rack9-host4'
12/15 04:47:12.284 DEBUG| utils:0212| Running '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_mvottLssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos4-row8-rack9-host4'
12/15 04:47:42.441 INFO | ssh_multiplex:0107| Timed out waiting for master-ssh connection to be established.
12/15 04:48:01.525 WARNI| utils:0915| run process timeout (19) fired on: /usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_zZnkaBssh-master/socket -o Protocol=2 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -l root -p 22 chromeos4-row8-rack9-host4 "export LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\" \"server[stack::is_up|ssh_ping|run] -> ssh_run(true)\";fi; true"
12/15 04:48:03.538 DEBUG| abstract_ssh:0682| Host chromeos4-row8-rack9-host4 is still down after waiting 312 seconds
12/15 04:48:03.539 INFO | server_job:0218| ABORT ---- reboot.verify timestamp=1513342083 localtime=Dec 15 04:48:03 Host did not return from reboot
12/15 04:48:03.540 INFO | server_job:1401| Parsing lines in fast mode
12/15 04:48:03.541 INFO | server_job:0218| END FAIL ---- reboot timestamp=1513342083 localtime=Dec 15 04:48:03 Host did not return from reboot
Traceback (most recent call last):
  File "/usr/local/autotest/server/server_job.py", line 1033, in run_op
    op_func()
  File "/usr/local/autotest/server/hosts/remote.py", line 160, in reboot
    **dargs)
  File "/usr/local/autotest/server/hosts/remote.py", line 229, in wait_for_restart
    self.log_op(self.OP_REBOOT, op_func)
  File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 566, in log_op
    op_func()
  File "/usr/local/autotest/server/hosts/remote.py", line 228, in op_func
    super(RemoteHost, self).wait_for_restart(timeout=timeout, **dargs)
  File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 310, in wait_for_restart
    raise error.AutoservRebootError("Host did not return from reboot")
AutoservRebootError: Host did not return from reboot
Jul 24
Comment 1 by akes...@chromium.org
, Feb 10 2017