cave-paladin: tests aborted with timeout and provision failed with "The host has wrong cros-version label" |
||||||
Issue descriptionBuilder: cave-paladin Build: https://chromegw.corp.google.com/i/chromeos/builders/cave-paladin/builds/529 Observation: It seems that dev_server failed to perform AU and the image rolled back to previous build, so the provision complain about cros-version. HWTest summary: 06-08-2017 [13:01:00] Start collecting test results and dump them to json. Suite job [ FAILED ] Suite job ABORT: cheets_CTS_N.7.1_r6.x86.CtsDramTestCases [ FAILED ] cheets_CTS_N.7.1_r6.x86.CtsDramTestCases ABORT: Timed out, did not run. cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases [ FAILED ] cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases ABORT: Timed out, did not run. provision_AutoUpdate.double [ FAILED ] provision_AutoUpdate.double ABORT: Timed out, did not run. security_ModuleLocking [ FAILED ] security_ModuleLocking ABORT: Timed out, did not run. login_UserPolicyKeys [ FAILED ] login_UserPolicyKeys ABORT: Timed out, did not run. security_ASLR [ FAILED ] security_ASLR ABORT: Timed out, did not run. login_OwnershipApi [ FAILED ] login_OwnershipApi ABORT: Timed out, did not run. login_RemoteOwnership [ FAILED ] login_RemoteOwnership ABORT: Timed out, did not run. security_ChromiumOSLSM [ FAILED ] security_ChromiumOSLSM ABORT: Timed out, did not run. logging_CrashSender [ FAILED ] logging_CrashSender ABORT: Timed out, did not run. desktopui_KillRestart [ FAILED ] desktopui_KillRestart ABORT: Timed out, did not run. build_RootFilesystemSize [ FAILED ] build_RootFilesystemSize ABORT: Timed out, did not run. security_AccountsBaseline [ FAILED ] security_AccountsBaseline ABORT: Timed out, did not run. security_SuidBinaries [ FAILED ] security_SuidBinaries ABORT: Timed out, did not run. security_SysLogPermissions [ FAILED ] security_SysLogPermissions ABORT: Timed out, did not run. security_RuntimeExecStack [ FAILED ] security_RuntimeExecStack ABORT: Timed out, did not run. security_OpenFDs [ FAILED ] security_OpenFDs ABORT: Timed out, did not run. security_EnableChromeTesting [ FAILED ] security_EnableChromeTesting ABORT: Timed out, did not run. security_SymlinkRestrictions [ FAILED ] security_SymlinkRestrictions ABORT: Timed out, did not run. provision [ FAILED ] provision FAIL: The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Unhandled AutoservError: No answer to ping from chromeos2-row8-rack7-host11 provision [ FAILED ] provision FAIL: The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Unhandled AutoservError: No answer to ping from chromeos2-row8-rack7-host18 Logs from https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122220560-chromeos-test/chromeos6-row2-rack18-host16/debug/: 06/08 11:37:41.284 DEBUG| dev_server:2132| Exception raised on auto_update attempt #1: Traceback (most recent call last): File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 257, in TriggerAU chromeos_AU.PostCheckCrOSUpdate() File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1241, in PostCheckCrOSUpdate 'build' % (self.update_version, self.device.hostname)) File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1021, in _VerifyBootExpectations raise RootfsUpdateError(rollback_message) RootfsUpdateError: Build cave-paladin/R61-9631.0.0-rc1 failed to boot on chromeos6-row2-rack18-host16; system rolled back to previous build 06/08 11:46:14.008 DEBUG| dev_server:2132| Exception raised on auto_update attempt #2: Traceback (most recent call last): File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 257, in TriggerAU chromeos_AU.PostCheckCrOSUpdate() File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1241, in PostCheckCrOSUpdate 'build' % (self.update_version, self.device.hostname)) File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1021, in _VerifyBootExpectations raise RootfsUpdateError(rollback_message) RootfsUpdateError: Build cave-paladin/R61-9631.0.0-rc1 failed to boot on 100.115.131.2; system rolled back to previous build 06/08 11:46:14.010 DEBUG| dev_server:2138| Please see error details in log /usr/local/autotest/results/hosts/chromeos6-row2-rack18-host16/627197-provision/20170806112713/autoupdate_logs/CrOS_update_100.115.131.2_31628.log 06/08 12:04:31.765 DEBUG| dev_server:2132| Exception raised on auto_update attempt #3: No JSON object could be decoded 06/08 12:04:31.767 DEBUG| dev_server:2138| Please see error details in log /usr/local/autotest/results/hosts/chromeos6-row2-rack18-host16/627197-provision/20170806112713/autoupdate_logs/CrOS_update_100.115.131.2_0.log
,
Jun 8 2017
I've gone ahead and contacted the Intel developers in charge of the DRM patches that were being landed in this run, and marked the first of their series as V-1. https://chromium-review.googlesource.com/421935 I've also grabbed the build and will see if it boots.
,
Jun 8 2017
,
Jun 8 2017
Indeed that build doesn't boot. But there's also no ramoops as well. I'm kicking it back to Intel to look into. Thanks.
,
Jun 8 2017
Builder: elm-paladin Build: https://chromegw.corp.google.com/i/chromeos/builders/elm-paladin/builds/3020 Observation: Elm is also on kernel v3.18, so it's likely the same cause. HWTest summary: 06-08-2017 [13:06:39] Start collecting test results and dump them to json. Suite job [ FAILED ] Suite job ABORT: cheets_CTS_N.7.1_r6.x86.CtsAccountManagerTestCases [ INFO ] cheets_CTS_N.7.1_r6.x86.CtsAccountManagerTestCases TEST_NA: Skipping: test not supported on this board/pool. cheets_CTS_N.7.1_r6.x86.CtsOpenGlPerf2TestCases [ INFO ] cheets_CTS_N.7.1_r6.x86.CtsOpenGlPerf2TestCases TEST_NA: Skipping: test not supported on this board/pool. cheets_CTS_N.7.1_r6.x86.CtsDramTestCases [ INFO ] cheets_CTS_N.7.1_r6.x86.CtsDramTestCases TEST_NA: Skipping: test not supported on this board/pool. cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases [ INFO ] cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases TEST_NA: Skipping: test not supported on this board/pool. provision [ FAILED ] provision FAIL: The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully provision [ FAILED ] provision FAIL: The host has wrong cros-version label., completed successfully
,
Jun 8 2017
Bisected the batch of DRM changes. Looks like this one caused the problem: https://chromium-review.googlesource.com/#/c/421984/
,
Jun 8 2017
,
Jun 8 2017
,
Jun 8 2017
,
Jun 9 2017
Builder: caroline-paladin Build: https://chromegw.corp.google.com/i/chromeos/builders/caroline-paladin/builds/206 Observation: The DUT reboot during test runs of cheets_StartAndroid.stress and graphics_dEQP. The output showed that there was error record of i915. This may be related to recent patches from Intel for drm/i915. Logs of graphics_dEQP failure from https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122453172-chromeos-test/chromeos2-row8-rack1-h 06/09 13:08:29.517 ERROR| logging_manager:0626| tko parser: {'builds': "{'cros-version': 'caroline-paladin/R61-9634.0.0-rc1'}", 'job_started': 1497037977, 'offload_failures_only': 'True', 'hostname': 'chromeos2-row8-rack1-host20', 'status_version': 1, 'label': 'caroline-paladin/R61-9634.0.0-rc1/bvt-cq/graphics_dEQP.gles31.info', 'parent_job_id': 122453081, 'drone': 'chromeos-server98.mtv.corp.google.com', 'user': 'chromeos-test', 'suite': 'bvt-cq', 'job_queued': 1497036221, 'experimental': 'False', 'build': 'caroline-paladin/R61-9634.0.0-rc1'} 06/09 13:08:29.520 ERROR| logging_manager:0626| tko parser: MACHINE NAME: chromeos2-row8-rack1-host20 06/09 13:08:29.553 ERROR| logging_manager:0626| tko parser: MACHINE GROUP: caroline 06/09 13:08:29.554 ERROR| logging_manager:0626| tko parser: parsing partial test ---- SERVER_JOB 06/09 13:08:29.555 ERROR| logging_manager:0626| tko parser: parsing partial test None CLIENT_JOB.0 06/09 13:08:29.556 ERROR| logging_manager:0626| tko parser: RUNNING: RUNNING 06/09 13:08:29.556 ERROR| logging_manager:0626| CLIENT_JOB.0 06/09 13:08:29.557 ERROR| logging_manager:0626| 06/09 13:08:29.557 ERROR| logging_manager:0626| tko parser: parsing partial test graphics_dEQP graphics_dEQP 06/09 13:08:29.558 ERROR| logging_manager:0626| tko parser: RUNNING: RUNNING 06/09 13:08:29.558 ERROR| logging_manager:0626| Subdir: graphics_dEQP 06/09 13:08:29.559 ERROR| logging_manager:0626| Testname: graphics_dEQP 06/09 13:08:29.559 ERROR| logging_manager:0626| 06/09 13:08:29.560 ERROR| logging_manager:0626| tko parser: update RUNNING reason: Autotest client terminated unexpectedly: DUT rebooted during the test run. 06/09 13:08:29.560 ERROR| logging_manager:0626| tko parser: parsing test graphics_dEQP graphics_dEQP 06/09 13:08:29.561 ERROR| logging_manager:0626| tko parser: ADD: FAIL 06/09 13:08:29.561 ERROR| logging_manager:0626| Subdir: graphics_dEQP 06/09 13:08:29.562 ERROR| logging_manager:0626| Testname: graphics_dEQP 06/09 13:08:29.563 ERROR| logging_manager:0626| Autotest client terminated unexpectedly: DUT rebooted during the test run. 06/09 13:08:29.563 ERROR| logging_manager:0626| tko parser: parsing test None CLIENT_JOB.0 06/09 13:08:29.564 ERROR| logging_manager:0626| tko parser: ADD: GOOD 06/09 13:08:29.564 ERROR| logging_manager:0626| Subdir: None 06/09 13:08:29.565 ERROR| logging_manager:0626| Testname: CLIENT_JOB.0 06/09 13:08:29.565 ERROR| logging_manager:0626| 06/09 13:08:29.566 ERROR| logging_manager:0626| tko parser: parsing test ---- SERVER_JOB Logs of cheets_StartAndroid.stress failure from https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122453098-chromeos-test/chromeos2-row8-rack1-host20/cheets_StartAndroid.stress/debug/: 06/09 12:35:38.446 INFO |cros_browser_backe:0123| Restarting Chrome (pid=12530) with remote port 06/09 12:35:38.447 INFO |cros_browser_backe:0130| dbus-send --system --type=method_call --dest=org.chromium.SessionManager /org/chromium/SessionManager org.chromium.SessionManagerInterface.EnableChromeTesting boolean:true array:string:"--disable-arc-opt-in-verification,--enable-net-benchmarking,--metrics-recording-only,--no-default-browser-check,--no-first-run,--enable-gpu-benchmarking,--disable-background-networking,--load-extension=/tmp/extension_LNLfD/autotest_private_ext\,/tmp/extension_AT4aK/autotest_private_ext\,/tmp/extension_gjUVf/autotest_private_ext,--disable-component-extensions-with-background-pages,--disable-default-apps,--disable-search-geolocation-disclosure,--enable-smooth-scrolling,--enable-threaded-compositing,--remote-debugging-port=59223,--start-maximized,--ash-disable-system-sounds,--allow-failed-policy-fetch-for-test,--oobe-skip-postlogin,--disable-logging-redirect,--vmodule=*/chromeos/net/*=2\,*/chromeos/login/*=2\,chrome_browser_main_posix=2,--disable-gaia-services" 06/09 12:35:38.447 DEBUG| cros_interface:0058| sh -c dbus-send --system --type=method_call --dest=org.chromium.SessionManager /org/chromium/SessionManager org.chromium.SessionManagerInterface.EnableChromeTesting boolean:true array:string:"--disable-arc-opt-in-verification,--enable-net-benchmarking,--metrics-recording-only,--no-default-browser-check,--no-first-run,--enable-gpu-benchmarking,--disable-background-networking,--load-extension=/tmp/extension_LNLfD/autotest_private_ext\,/tmp/extension_AT4aK/autotest_private_ext\,/tmp/extension_gjUVf/autotest_private_ext,--disable-component-extensions-with-background-pages,--disable-default-apps,--disable-search-geolocation-disclosure,--enable-smooth-scrolling,--enable-threaded-compositing,--remote-debugging-port=59223,--start-maximized,--ash-disable-system-sounds,--allow-failed-policy-fetch-for-test,--oobe-skip-postlogin,--disable-logging-redirect,--vmodule=*/chromeos/net/*=2\,*/chromeos/login/*=2\,chrome_browser_main_posix=2,--disable-gaia-services" 06/09 12:35:38.453 DEBUG| cros_interface:0067| > stdout=[], stderr=[] 06/09 12:35:38.474 DEBUG| cros_interface:0363| ListProcesses(<predicate>)->[202 processes] 06/09 12:35:38.475 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:38.476 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:38.577 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:38.679 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:38.781 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:38.882 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:38.984 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.086 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.188 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.290 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.393 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.495 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.598 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.713 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.838 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:39.975 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:40.127 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused 06/09 12:35:40.294 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused Logs of general test status: stderr: scp: /var/log/vmlog/vmlog.1.LATEST: No such file or directory scp: /var/log/vmlog/vmlog.1.PREVIOUS: No such file or directory 06/09 13:08:40.418 INFO | crashcollect:0071| Collecting /dev/pstore/console-ramoops... 06/09 13:08:40.420 DEBUG| ssh_host:0286| Running (ssh) 'ls -ld /dev/pstore/console-ramoops | cut -d" " -f5' 06/09 13:08:40.845 ERROR| utils:0298| [stderr] ls: cannot access /dev/pstore/console-ramoops: No such file or directory 06/09 13:08:40.893 WARNI| crashcollect:0085| Collection of /dev/pstore/console-ramoops failed: invalid literal for int() with base 10: '' 06/09 13:08:40.894 DEBUG| ssh_host:0286| Running (ssh) 'rm -rf /dev/pstore/console-ramoops' 06/09 13:08:41.331 INFO | crashcollect:0071| Collecting /sys/kernel/debug/dri/0/i915_error_state... 06/09 13:08:41.333 DEBUG| ssh_host:0286| Running (ssh) 'ls -ld /sys/kernel/debug/dri/0/i915_error_state | cut -d" " -f5' 06/09 13:08:41.755 DEBUG| utils:0298| [stdout] 0 06/09 13:08:41.799 DEBUG| ssh_host:0286| Running (ssh) 'mktemp -d' 06/09 13:08:42.267 DEBUG| ssh_host:0286| Running (ssh) 'cp -rp /sys/kernel/debug/dri/0/i915_error_state /tmp/tmp.Ue8VrMzGOm' 06/09 13:08:42.708 DEBUG| abstract_ssh:0346| get_file. source: /tmp/tmp.Ue8VrMzGOm/i915_error_state, dest: /usr/local/autotest/results/122453172-chromeos-test/chromeos2-row8-rack1-host20/crashinfo.chromeos2-row8-rack1-host20/i915_error_state, delete_dest: False,preserve_perm: False, preserve_symlinks:False 06/09 13:08:42.709 DEBUG| abstract_ssh:0357| Using Rsync. 06/09 13:08:42.710 DEBUG| utils:0203| Running 'rsync -L --timeout=1800 --rsh='/usr/bin/ssh -a -x -o ControlPath=/tmp/_autotmp_FcgQiHssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos2-row8-rack1-host20:"/tmp/tmp.Ue8VrMzGOm/i915_error_state" "/usr/local/autotest/results/122453172-chromeos-test/chromeos2-row8-rack1-host20/crashinfo.chromeos2-row8-rack1-host20/i915_error_state"'
,
Jun 9 2017
The failure I found yesterday with the i915/drm patches that were going through the CQ was related to https://chromium-review.googlesource.com/c/421984/ In that case, the OS image would immediately reboot on startup, causing a rollback. The caroline example here is different. Intel has not CQ +1'd any of their CLs again, so the problematic CL was not in this CQ run. It looks like tehre's a console_ramoops here: https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122453098-chromeos-test/chromeos2-row8-rack1-host20/sysinfo/?pli=1 There's nothing super suspicious about the ramoops. It doesn't look like a crash. It looks like the system rebooted normally at around 30 seconds, unless the ramoops isn't capturing the reboot during the failure.
,
Jun 9 2017
Grabbed the one from 122453172 as well. Looks like a normal reboot. Not sure why it wasn't expected.
,
Jun 13 2017
Quick update from Intel. https://chromium-review.googlesource.com/#/c/421984/ causes this problem. The very next CL in the series fixes it: https://chromium-review.googlesource.com/#/c/421985/ I asked Intel to update 421984 to CQ-DEPEND on the next in the series so they don't get broken up.
,
Jun 13 2017
Issue 732434 has been merged into this issue.
,
Sep 20 2017
|
||||||
►
Sign in to add a comment |
||||||
Comment 1 by dgarr...@chromium.org
, Jun 8 2017There was a rollback after provision. It seems likely that there is a problem with some of the kernel CLs being tested. 2017/06/08 12:25:24.922 DEBUG| cros_update:0260| Error happens in CrOS auto-update: RootfsUpdateError('Build cave-paladin/R61-9631.0.0-rc1 failed to boot on 100.115.130.248; system rolled back to previous build',)