Kernel Daily Regression Suite: platform_SyncCrash always fails on elm & veyron_jaq
Issue description

The platform_SyncCrash test in the kernel daily regressions is broken on elm & veyron_jaq. You can see this on:
https://wmatrix.googleplex.com/platform/kernel_daily?platforms=veyron_jaq&releases=56
https://wmatrix.googleplex.com/platform/kernel_daily?platforms=elm&releases=56

The summary says:
Autotest client terminated unexpectedly: DUT is pingable, SSHable and did NOT restart un-expectedly. We probably lost connectivity during the test.
reboot command failed

This happens consistently. It needs to be fixed (or the test should be disabled for these boards).
Oct 24 2016
This is one of the log buckets:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/81886522-chromeos-test/chromeos2-row6-rack8-host19/debug/

Autoserv.DEBUG has this:

10/21 11:18:23.710 DEBUG| ssh_host:0180| Running (ssh) '/usr/local/opt/punybench/bin/testsync -f /usr/local/CrashDir/xyzzy -z 4096 -p verifier'
10/21 11:18:24.372 DEBUG| base_utils:0280| [stdout]
10/21 11:18:24.372 DEBUG| base_utils:0280| [stdout] /usr/local/opt/punybench/bin/testsync -f /usr/local/CrashDir/xyzzy -z 4096 -p verifier
10/21 11:18:24.452 DEBUG| ssh_host:0180| Running (ssh) 'rm -r /usr/local/CrashDir/'
10/21 11:18:25.119 INFO |platform_SyncCrash:0047| rm -r /usr/local/CrashDir/
10/21 11:18:25.119 DEBUG| ssh_host:0180| Running (ssh) 'mkdir -p /usr/local/CrashDir/'
10/21 11:18:25.820 INFO |platform_SyncCrash:0047| mkdir -p /usr/local/CrashDir/
10/21 11:18:25.821 INFO |platform_SyncCrash:0058| Crash: chromeos2-row6-rack8-host19
10/21 11:18:25.821 INFO |platform_SyncCrash:0061| Crash: /usr/local/opt/punybench/bin/testsync -f /usr/local/CrashDir/xyzzy -z 4096 -p mapper
10/21 11:18:25.821 DEBUG| ssh_host:0180| Running (ssh) 'cat /etc/lsb-release'
10/21 11:18:26.526 INFO | server_job:0153| START ---- reboot timestamp=1477073906 localtime=Oct 21 11:18:26
10/21 11:18:26.527 INFO | server_job:0153| GOOD ---- reboot.start timestamp=1477073906 localtime=Oct 21 11:18:26
10/21 11:18:26.527 DEBUG| ssh_host:0180| Running (ssh) 'if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi'
10/21 11:18:27.188 DEBUG| base_utils:0280| [stdout] af7157ee-3dc5-4ae4-af61-bee62ed3f1fa
10/21 11:18:27.264 DEBUG| ssh_host:0180| Running (ssh) '( /usr/local/opt/punybench/bin/testsync -f /usr/local/CrashDir/xyzzy -z 4096 -p mapper ) </dev/null >/dev/null 2>&1 & echo -n $!'
10/21 11:33:28.061 ERROR| base_utils:0280| [stderr] Write failed: Broken pipe
10/21 11:33:28.063 DEBUG| base_utils:0299| [stdout] 8994
10/21 11:33:28.065 INFO | server_job:0153| ABORT ---- reboot.start timestamp=1477074808 localtime=Oct 21 11:33:28 reboot command failed
10/21 11:33:28.069 INFO | server_job:0153| END FAIL ---- reboot timestamp=1477074808 localtime=Oct 21 11:33:28 command execution error

So the problem seems to happen between 11:18 and 11:33, after the testsync mapper command is launched over ssh. There is a platform_SyncCrash.tgz in there. I started looking at console-ramoops and /var/log/messages, but have found nothing yet.
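For anyone repeating this on other runs, here is a rough sketch of how the silent window could be located automatically instead of by eye. It assumes Autoserv.DEBUG lines start with an "MM/DD HH:MM:SS.mmm" timestamp as in the excerpt above. The helper names (parse_ts, largest_gap) and the hard-coded year are my own assumptions and not part of autotest.

```python
import re
from datetime import datetime

# Assumption: each log line starts with "MM/DD HH:MM:SS.mmm "; the year is not logged.
TS_RE = re.compile(r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def parse_ts(line, year=2016):
    """Return the line's timestamp, or None if the line has no timestamp prefix."""
    m = TS_RE.match(line)
    if m is None:
        return None
    return datetime.strptime(f"{year}/{m.group(1)}", "%Y/%m/%d %H:%M:%S.%f")


def largest_gap(lines, year=2016):
    """Return (gap_seconds, line_before, line_after) for the longest silence."""
    best = None
    prev_ts = prev_line = None
    for line in lines:
        ts = parse_ts(line, year)
        if ts is None:
            continue  # skip continuation lines without a timestamp
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_ts, prev_line = ts, line
    return best


# Three consecutive lines copied from the excerpt above.
sample = [
    "10/21 11:18:27.188 DEBUG| base_utils:0280| [stdout] af7157ee-3dc5-4ae4-af61-bee62ed3f1fa",
    "10/21 11:18:27.264 DEBUG| ssh_host:0180| Running (ssh) '( /usr/local/opt/punybench/bin/testsync -f /usr/local/CrashDir/xyzzy -z 4096 -p mapper ) </dev/null >/dev/null 2>&1 & echo -n $!'",
    "10/21 11:33:28.061 ERROR| base_utils:0280| [stderr] Write failed: Broken pipe",
]

gap, before, after = largest_gap(sample)
print(f"{gap:.3f} s of silence before: {after}")
```

On these lines the longest silence is roughly 15 minutes, between the ssh launch of the testsync mapper command and the "Write failed: Broken pipe" error. That only narrows down where to look in console-ramoops and /var/log/messages; it does not say what happened on the DUT.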
Oct 25 2016
In response to #1: It seems that someone on the kernel team should take a look at this. It could be a bad DUT, but we're the most qualified to tell. It's file-system related, so maybe Gwendal can take a look?
Comment 1 by sonnyrao@chromium.org, Oct 24 2016