Project: chromium Issues People Development process History Sign in
New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.
Starred by 3 users
Status: WontFix
Owner:
Closed: Sep 20
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment
cave-paladin: tests aborted with timeout and provision failed with "The host has wrong cros-version label"
Project Member Reported by mcchou@chromium.org, Jun 8 Back to list
Builder: cave-paladin

Build:   https://chromegw.corp.google.com/i/chromeos/builders/cave-paladin/builds/529

Observation:
It seems that dev_server failed to perform AU and the image rolled back to previous build, so the provision complain about cros-version.

HWTest summary:
06-08-2017 [13:01:00] Start collecting test results and dump them to json.
  Suite job                                    [ FAILED ]
  Suite job                                      ABORT: 
  cheets_CTS_N.7.1_r6.x86.CtsDramTestCases     [ FAILED ]
  cheets_CTS_N.7.1_r6.x86.CtsDramTestCases       ABORT: Timed out, did not run.
  cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases   [ FAILED ]
  cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases     ABORT: Timed out, did not run.
  provision_AutoUpdate.double                  [ FAILED ]
  provision_AutoUpdate.double                    ABORT: Timed out, did not run.
  security_ModuleLocking                       [ FAILED ]
  security_ModuleLocking                         ABORT: Timed out, did not run.
  login_UserPolicyKeys                         [ FAILED ]
  login_UserPolicyKeys                           ABORT: Timed out, did not run.
  security_ASLR                                [ FAILED ]
  security_ASLR                                  ABORT: Timed out, did not run.
  login_OwnershipApi                           [ FAILED ]
  login_OwnershipApi                             ABORT: Timed out, did not run.
  login_RemoteOwnership                        [ FAILED ]
  login_RemoteOwnership                          ABORT: Timed out, did not run.
  security_ChromiumOSLSM                       [ FAILED ]
  security_ChromiumOSLSM                         ABORT: Timed out, did not run.
  logging_CrashSender                          [ FAILED ]
  logging_CrashSender                            ABORT: Timed out, did not run.
  desktopui_KillRestart                        [ FAILED ]
  desktopui_KillRestart                          ABORT: Timed out, did not run.
  build_RootFilesystemSize                     [ FAILED ]
  build_RootFilesystemSize                       ABORT: Timed out, did not run.
  security_AccountsBaseline                    [ FAILED ]
  security_AccountsBaseline                      ABORT: Timed out, did not run.
  security_SuidBinaries                        [ FAILED ]
  security_SuidBinaries                          ABORT: Timed out, did not run.
  security_SysLogPermissions                   [ FAILED ]
  security_SysLogPermissions                     ABORT: Timed out, did not run.
  security_RuntimeExecStack                    [ FAILED ]
  security_RuntimeExecStack                      ABORT: Timed out, did not run.
  security_OpenFDs                             [ FAILED ]
  security_OpenFDs                               ABORT: Timed out, did not run.
  security_EnableChromeTesting                 [ FAILED ]
  security_EnableChromeTesting                   ABORT: Timed out, did not run.
  security_SymlinkRestrictions                 [ FAILED ]
  security_SymlinkRestrictions                   ABORT: Timed out, did not run.
  provision                                    [ FAILED ]
  provision                                      FAIL: The host has wrong cros-version label., completed successfully
  provision                                    [ FAILED ]
  provision                                      FAIL: Unhandled AutoservError: No answer to ping from chromeos2-row8-rack7-host11
  provision                                    [ FAILED ]
  provision                                      FAIL: The host has wrong cros-version label., completed successfully
  provision                                    [ FAILED ]
  provision                                      FAIL: Unhandled AutoservError: No answer to ping from chromeos2-row8-rack7-host18

Logs from https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122220560-chromeos-test/chromeos6-row2-rack18-host16/debug/:

06/08 11:37:41.284 DEBUG|        dev_server:2132| Exception raised on auto_update attempt #1:
 Traceback (most recent call last):
  File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 257, in TriggerAU
    chromeos_AU.PostCheckCrOSUpdate()
  File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1241, in PostCheckCrOSUpdate
    'build' % (self.update_version, self.device.hostname))
  File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1021, in _VerifyBootExpectations
    raise RootfsUpdateError(rollback_message)
RootfsUpdateError: Build cave-paladin/R61-9631.0.0-rc1 failed to boot on chromeos6-row2-rack18-host16; system rolled back to previous build

06/08 11:46:14.008 DEBUG|        dev_server:2132| Exception raised on auto_update attempt #2:
 Traceback (most recent call last):
  File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 257, in TriggerAU
    chromeos_AU.PostCheckCrOSUpdate()
  File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1241, in PostCheckCrOSUpdate
    'build' % (self.update_version, self.device.hostname))
  File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 1021, in _VerifyBootExpectations
    raise RootfsUpdateError(rollback_message)
RootfsUpdateError: Build cave-paladin/R61-9631.0.0-rc1 failed to boot on 100.115.131.2; system rolled back to previous build
06/08 11:46:14.010 DEBUG|        dev_server:2138| Please see error details in log /usr/local/autotest/results/hosts/chromeos6-row2-rack18-host16/627197-provision/20170806112713/autoupdate_logs/CrOS_update_100.115.131.2_31628.log

06/08 12:04:31.765 DEBUG|        dev_server:2132| Exception raised on auto_update attempt #3:
No JSON object could be decoded
06/08 12:04:31.767 DEBUG|        dev_server:2138| Please see error details in log /usr/local/autotest/results/hosts/chromeos6-row2-rack18-host16/627197-provision/20170806112713/autoupdate_logs/CrOS_update_100.115.131.2_0.log
 
There was a rollback after provision. It seems likely that there is a problem with some of the kernel CLs being tested.

2017/06/08 12:25:24.922 DEBUG|       cros_update:0260| Error happens in CrOS auto-update: RootfsUpdateError('Build cave-paladin/R61-9631.0.0-rc1 failed to boot on 100.115.130.248; system rolled back to previous build',)
I've gone ahead and contacted the Intel developers in charge of the DRM patches that were being landed in this run, and marked the first of their series as V-1.

https://chromium-review.googlesource.com/421935

I've also grabbed the build and will see if it boots.
Owner: bleung@chromium.org
 crbug.com/731274  was filed to improve reporting of this in the future.
Indeed that build doesn't boot. But there's also no ramoops as well. I'm kicking it back to Intel to look into. Thanks.
Builder: elm-paladin

Build:   https://chromegw.corp.google.com/i/chromeos/builders/elm-paladin/builds/3020

Observation:
Elm is also on kernel v3.18, so it's likely the same cause.

HWTest summary:
06-08-2017 [13:06:39] Start collecting test results and dump them to json.
  Suite job                                            [ FAILED ]
  Suite job                                              ABORT: 
  cheets_CTS_N.7.1_r6.x86.CtsAccountManagerTestCases   [  INFO  ]
  cheets_CTS_N.7.1_r6.x86.CtsAccountManagerTestCases     TEST_NA: Skipping:  test not supported on this board/pool.
  cheets_CTS_N.7.1_r6.x86.CtsOpenGlPerf2TestCases      [  INFO  ]
  cheets_CTS_N.7.1_r6.x86.CtsOpenGlPerf2TestCases        TEST_NA: Skipping:  test not supported on this board/pool.
  cheets_CTS_N.7.1_r6.x86.CtsDramTestCases             [  INFO  ]
  cheets_CTS_N.7.1_r6.x86.CtsDramTestCases               TEST_NA: Skipping:  test not supported on this board/pool.
  cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases           [  INFO  ]
  cheets_CTS_N.7.1_r6.x86.CtsOpenGLTestCases             TEST_NA: Skipping:  test not supported on this board/pool.
  provision                                            [ FAILED ]
  provision                                              FAIL: The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: Servo initialize timed out., The host has wrong cros-version label., completed successfully
  provision                                            [ FAILED ]
  provision                                              FAIL: The host has wrong cros-version label., completed successfully
Bisected the batch of DRM changes. Looks like this one caused the problem:
https://chromium-review.googlesource.com/#/c/421984/


Status: Assigned
Cc: marc...@chromium.org
Components: OS>Kernel>Graphics OS>Kernel>Display
Cc: brian.j....@intel.com
Builder: caroline-paladin

Build:   https://chromegw.corp.google.com/i/chromeos/builders/caroline-paladin/builds/206

Observation:
The DUT reboot during test runs of cheets_StartAndroid.stress and graphics_dEQP. The output showed that there was error record of i915. This may be related to recent patches from Intel for drm/i915.

Logs of graphics_dEQP failure from https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122453172-chromeos-test/chromeos2-row8-rack1-h
06/09 13:08:29.517 ERROR|   logging_manager:0626| tko parser: {'builds': "{'cros-version': 'caroline-paladin/R61-9634.0.0-rc1'}", 'job_started': 1497037977, 'offload_failures_only': 'True', 'hostname': 'chromeos2-row8-rack1-host20', 'status_version': 1, 'label': 'caroline-paladin/R61-9634.0.0-rc1/bvt-cq/graphics_dEQP.gles31.info', 'parent_job_id': 122453081, 'drone': 'chromeos-server98.mtv.corp.google.com', 'user': 'chromeos-test', 'suite': 'bvt-cq', 'job_queued': 1497036221, 'experimental': 'False', 'build': 'caroline-paladin/R61-9634.0.0-rc1'}
06/09 13:08:29.520 ERROR|   logging_manager:0626| tko parser: MACHINE NAME: chromeos2-row8-rack1-host20
06/09 13:08:29.553 ERROR|   logging_manager:0626| tko parser: MACHINE GROUP: caroline
06/09 13:08:29.554 ERROR|   logging_manager:0626| tko parser: parsing partial test ---- SERVER_JOB
06/09 13:08:29.555 ERROR|   logging_manager:0626| tko parser: parsing partial test None CLIENT_JOB.0
06/09 13:08:29.556 ERROR|   logging_manager:0626| tko parser: RUNNING: RUNNING
06/09 13:08:29.556 ERROR|   logging_manager:0626| CLIENT_JOB.0
06/09 13:08:29.557 ERROR|   logging_manager:0626| 
06/09 13:08:29.557 ERROR|   logging_manager:0626| tko parser: parsing partial test graphics_dEQP graphics_dEQP
06/09 13:08:29.558 ERROR|   logging_manager:0626| tko parser: RUNNING: RUNNING
06/09 13:08:29.558 ERROR|   logging_manager:0626| Subdir: graphics_dEQP
06/09 13:08:29.559 ERROR|   logging_manager:0626| Testname: graphics_dEQP
06/09 13:08:29.559 ERROR|   logging_manager:0626| 
06/09 13:08:29.560 ERROR|   logging_manager:0626| tko parser: update RUNNING reason: Autotest client terminated unexpectedly: DUT rebooted during the test run.
06/09 13:08:29.560 ERROR|   logging_manager:0626| tko parser: parsing test graphics_dEQP graphics_dEQP
06/09 13:08:29.561 ERROR|   logging_manager:0626| tko parser: ADD: FAIL
06/09 13:08:29.561 ERROR|   logging_manager:0626| Subdir: graphics_dEQP
06/09 13:08:29.562 ERROR|   logging_manager:0626| Testname: graphics_dEQP
06/09 13:08:29.563 ERROR|   logging_manager:0626| Autotest client terminated unexpectedly: DUT rebooted during the test run.
06/09 13:08:29.563 ERROR|   logging_manager:0626| tko parser: parsing test None CLIENT_JOB.0
06/09 13:08:29.564 ERROR|   logging_manager:0626| tko parser: ADD: GOOD
06/09 13:08:29.564 ERROR|   logging_manager:0626| Subdir: None
06/09 13:08:29.565 ERROR|   logging_manager:0626| Testname: CLIENT_JOB.0
06/09 13:08:29.565 ERROR|   logging_manager:0626| 
06/09 13:08:29.566 ERROR|   logging_manager:0626| tko parser: parsing test ---- SERVER_JOB

Logs of cheets_StartAndroid.stress failure from https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122453098-chromeos-test/chromeos2-row8-rack1-host20/cheets_StartAndroid.stress/debug/:
06/09 12:35:38.446 INFO |cros_browser_backe:0123| Restarting Chrome (pid=12530) with remote port
06/09 12:35:38.447 INFO |cros_browser_backe:0130| dbus-send --system --type=method_call --dest=org.chromium.SessionManager /org/chromium/SessionManager org.chromium.SessionManagerInterface.EnableChromeTesting boolean:true array:string:"--disable-arc-opt-in-verification,--enable-net-benchmarking,--metrics-recording-only,--no-default-browser-check,--no-first-run,--enable-gpu-benchmarking,--disable-background-networking,--load-extension=/tmp/extension_LNLfD/autotest_private_ext\,/tmp/extension_AT4aK/autotest_private_ext\,/tmp/extension_gjUVf/autotest_private_ext,--disable-component-extensions-with-background-pages,--disable-default-apps,--disable-search-geolocation-disclosure,--enable-smooth-scrolling,--enable-threaded-compositing,--remote-debugging-port=59223,--start-maximized,--ash-disable-system-sounds,--allow-failed-policy-fetch-for-test,--oobe-skip-postlogin,--disable-logging-redirect,--vmodule=*/chromeos/net/*=2\,*/chromeos/login/*=2\,chrome_browser_main_posix=2,--disable-gaia-services"
06/09 12:35:38.447 DEBUG|    cros_interface:0058| sh -c dbus-send --system --type=method_call --dest=org.chromium.SessionManager /org/chromium/SessionManager org.chromium.SessionManagerInterface.EnableChromeTesting boolean:true array:string:"--disable-arc-opt-in-verification,--enable-net-benchmarking,--metrics-recording-only,--no-default-browser-check,--no-first-run,--enable-gpu-benchmarking,--disable-background-networking,--load-extension=/tmp/extension_LNLfD/autotest_private_ext\,/tmp/extension_AT4aK/autotest_private_ext\,/tmp/extension_gjUVf/autotest_private_ext,--disable-component-extensions-with-background-pages,--disable-default-apps,--disable-search-geolocation-disclosure,--enable-smooth-scrolling,--enable-threaded-compositing,--remote-debugging-port=59223,--start-maximized,--ash-disable-system-sounds,--allow-failed-policy-fetch-for-test,--oobe-skip-postlogin,--disable-logging-redirect,--vmodule=*/chromeos/net/*=2\,*/chromeos/login/*=2\,chrome_browser_main_posix=2,--disable-gaia-services" 
06/09 12:35:38.453 DEBUG|    cros_interface:0067|  > stdout=[], stderr=[]
06/09 12:35:38.474 DEBUG|    cros_interface:0363| ListProcesses(<predicate>)->[202 processes]
06/09 12:35:38.475 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:38.476 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:38.577 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:38.679 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:38.781 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:38.882 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:38.984 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.086 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.188 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.290 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.393 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.495 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.598 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.713 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.838 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:39.975 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:40.127 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused
06/09 12:35:40.294 INFO |devtools_client_ba:0057| Websocket at port 59223 not yet available: [Errno 111] Connection refused

Logs of general test status:
stderr:
scp: /var/log/vmlog/vmlog.1.LATEST: No such file or directory
scp: /var/log/vmlog/vmlog.1.PREVIOUS: No such file or directory
06/09 13:08:40.418 INFO |      crashcollect:0071| Collecting /dev/pstore/console-ramoops...
06/09 13:08:40.420 DEBUG|          ssh_host:0286| Running (ssh) 'ls -ld /dev/pstore/console-ramoops | cut -d" " -f5'
06/09 13:08:40.845 ERROR|             utils:0298| [stderr] ls: cannot access /dev/pstore/console-ramoops: No such file or directory
06/09 13:08:40.893 WARNI|      crashcollect:0085| Collection of /dev/pstore/console-ramoops failed: invalid literal for int() with base 10: ''
06/09 13:08:40.894 DEBUG|          ssh_host:0286| Running (ssh) 'rm -rf /dev/pstore/console-ramoops'
06/09 13:08:41.331 INFO |      crashcollect:0071| Collecting /sys/kernel/debug/dri/0/i915_error_state...
06/09 13:08:41.333 DEBUG|          ssh_host:0286| Running (ssh) 'ls -ld /sys/kernel/debug/dri/0/i915_error_state | cut -d" " -f5'
06/09 13:08:41.755 DEBUG|             utils:0298| [stdout] 0
06/09 13:08:41.799 DEBUG|          ssh_host:0286| Running (ssh) 'mktemp -d'
06/09 13:08:42.267 DEBUG|          ssh_host:0286| Running (ssh) 'cp -rp /sys/kernel/debug/dri/0/i915_error_state /tmp/tmp.Ue8VrMzGOm'
06/09 13:08:42.708 DEBUG|      abstract_ssh:0346| get_file. source: /tmp/tmp.Ue8VrMzGOm/i915_error_state, dest: /usr/local/autotest/results/122453172-chromeos-test/chromeos2-row8-rack1-host20/crashinfo.chromeos2-row8-rack1-host20/i915_error_state, delete_dest: False,preserve_perm: False, preserve_symlinks:False
06/09 13:08:42.709 DEBUG|      abstract_ssh:0357| Using Rsync.
06/09 13:08:42.710 DEBUG|             utils:0203| Running 'rsync -L  --timeout=1800 --rsh='/usr/bin/ssh -a -x   -o ControlPath=/tmp/_autotmp_FcgQiHssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22' -az --no-o --no-g root@chromeos2-row8-rack1-host20:"/tmp/tmp.Ue8VrMzGOm/i915_error_state" "/usr/local/autotest/results/122453172-chromeos-test/chromeos2-row8-rack1-host20/crashinfo.chromeos2-row8-rack1-host20/i915_error_state"'
The failure I found yesterday with the i915/drm patches that were going through the CQ was related to https://chromium-review.googlesource.com/c/421984/

In that case, the OS image would immediately reboot on startup, causing a rollback.

The caroline example here is different. Intel has not CQ +1'd any of their CLs again, so the problematic CL was not in this CQ run.

It looks like tehre's a console_ramoops here:
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/122453098-chromeos-test/chromeos2-row8-rack1-host20/sysinfo/?pli=1


There's nothing super suspicious about the ramoops. It doesn't look like a crash. It looks like the system rebooted normally at around 30 seconds, unless the ramoops isn't capturing the reboot during the failure.
122453098-chromeos-test%2Fchromeos2-row8-rack1-host20%2Fsysinfo%2Fconsole-ramoops.txt
46.4 KB View Download
Grabbed the one from 122453172 as well. Looks like a normal reboot. Not sure why it wasn't expected.
122453172-chromeos-test%2Fchromeos2-row8-rack1-host20%2Fsysinfo%2Fconsole-ramoops.txt
46.4 KB View Download
Quick update from Intel.

https://chromium-review.googlesource.com/#/c/421984/ causes this problem. The very next CL in the series fixes it: 
https://chromium-review.googlesource.com/#/c/421985/

I asked Intel to update 421984 to CQ-DEPEND on the next in the series so they don't get broken up.
Cc: sureshraj@chromium.org ihf@chromium.org hywu@chromium.org edcourtney@chromium.org radhakri...@intel.com yixiang@chromium.org kinaba@chromium.org yusukes@chromium.org
Issue 732434 has been merged into this issue.
Status: WontFix
Sign in to add a comment