Starred by 2 users
Status: Assigned
Pri: 1
Type: Bug



auto-update failed with StatefulUpdateError
Project Member Reported by nxia@chromium.org, Nov 16
https://luci-milo.appspot.com/buildbot/chromeos/whirlwind-paladin/9913


https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/156573034-chromeos-test/chromeos4-row10-jetstream-host1/?project=chromeos-bot


StatefulUpdateError: ('failed to setgoodkernel: %s.', RunCommandError("cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmp1hNcMU/testing_rsa', 'root@chromeos4-row10-jetstream-host1', '--', '/usr/sbin/chromeos-setgoodkernel'], extra env={'LC_MESSAGES': 'C'}", <chromite.lib.cros_build_lib.CommandResult object at 0x7f5a185dfbd0>, None)), 1) cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmpfHuZpd/testing_rsa', 'root@100.115.219.23', '--', 'sync', '&&', 'sleep', '3', '&&', 'ectool', 'reboot_ec', 'cold', 'at-shutdown', '&&', 'shutdown', '-H', 'now'], extra env={'LC_MESSAGES': 'C'}, 
  Traceback (most recent call last):
    File "/usr/local/autotest/client/common_lib/test.py", line 831, in _call_test_function
      return func(*args, **dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 495, in execute
      dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 362, in _call_run_once_with_retry
      postprocess_profiled_run, args, dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 400, in _call_run_once
      self.run_once(*args, **dargs)
    File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 121, in run_once
      with_cheets=with_cheets)
    File "/usr/local/autotest/server/afe_utils.py", line 124, in machine_install_and_update_labels
      *args, **dargs)
    File "/usr/local/autotest/server/hosts/cros_host.py", line 816, in machine_install_by_devserver
      force_original=force_original)
    File "/usr/local/autotest/client/common_lib/cros/dev_server.py", line 2380, in auto_update
      error_msg % (host_name, real_error))
  DevServerException: CrOS auto-update failed for host chromeos4-row10-jetstream-host1: 0) StatefulUpdateError: ('failed to setgoodkernel: %s.', RunCommandError("cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmp1hNcMU/testing_rsa', 'root@chromeos4-row10-jetstream-host1', '--', '/usr/sbin/chromeos-setgoodkernel'], extra env={'LC_MESSAGES': 'C'}", <chromite.lib.cros_build_lib.CommandResult object at 0x7f5a185dfbd0>, None)), 1) cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmpfHuZpd/testing_rsa', 'root@100.115.219.23', '--', 'sync', '&&', 'sleep', '3', '&&', 'ectool', 'reboot_ec', 'cold', 'at-shutdown', '&&', 'shutdown', '-H', 'now'], extra env={'LC_MESSAGES': 'C'}, 
 
Cc: jintao@chromium.org
Cc: haddowk@chromium.org
Comment 3 Deleted
Owner: cywang@chromium.org
Status: Assigned
It seems chromeos-setgoodkernel does not work on some boards?
Thanks for the revert. I am going to check whether a board has the ectool command; if it does not, we will fall back to a normal reboot.
shuqianz@ has pushed the revert CL to the devservers. The devservers are good now. Leaving this bug open for cywang@.
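The fallback idea could be sketched roughly as below. This is only an illustration, not the actual devserver or chromite change: `choose_reboot_command`, `run_on_dut`, and the `PLAIN_REBOOT` string are hypothetical, and using `ectool console` as the probe is an assumption. The EC reboot string is taken from the failing command in the log above.

```python
# EC cold reboot sequence, copied from the failing ssh command in the log.
EC_REBOOT = "sync && sleep 3 && ectool reboot_ec cold at-shutdown && shutdown -H now"

# Hypothetical fallback for boards where ectool is missing or not working.
PLAIN_REBOOT = "sync && reboot"


def choose_reboot_command(run_on_dut, probe="ectool console"):
    """Pick the reboot command for a DUT.

    run_on_dut is a hypothetical callable that runs a shell command on the
    device and returns its exit status. If the probe cannot be run or exits
    non-zero, the plain reboot is used instead of the EC reboot.
    """
    try:
        ectool_ok = run_on_dut(probe) == 0
    except OSError:
        # Treat a failure to run the probe at all as "ectool unavailable".
        ectool_ok = False
    return EC_REBOOT if ectool_ok else PLAIN_REBOOT
```

Probing with a real ectool call rather than only checking that the binary exists is a deliberate choice here, since a board can ship the binary and still fail to talk to the EC.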
I will dive into the details from the metrics.
Attachment: provisioning failures.png (1.4 MB)
Cc: bccheng@chromium.org
Although we could fall back to the normal reboot flow once the ectool command fails, I tested the 'ectool console' command on 92 boards and found 27 failures. Hmm, this is flakier than the original issue we intended to address, the network dongle probing...

=====
butterfly-chromeos6-row1-rack3-host19.stderr
daisy-chromeos2-row4-rack8-host22.stderr
daisy_skate-chromeos4-row9-rack6-host13.stderr
daisy_spring-chromeos6-row2-rack16-host14.stderr
gale-chromeos6-row21-jetstream-host7.stderr
guado-chromeos1-row4-rack7-host5.stderr
guado_moblab-chromeos2-row1-rack8-host3.stderr
link-chromeos2-row10-rack5-host11.stderr
lumpy-chromeos6-row2-rack8-host5.stderr
mccloud-chromeos4-row2-rack11-host19.stderr
monroe-chromeos4-row13-rack12-host12.stderr
panther-chromeos1-row4-rack9-host6.stderr
parrot-chromeos6-row2-rack4-host3.stderr
parrot_ivb-chromeos6-row2-rack4-host11.stderr
polaris-autotest8-6cd51748.vrlab.stderr
reef-chromeos6-row3-rack12-host11.stderr
rikku-chromeos1-row4-rack5-host4.stderr
stout-chromeos6-row2-rack5-host8.stderr
stumpy-chromeos4-row2-rack9-host8.stderr
tidus-chromeos4-row3-rack2-host7.stderr
tricky-chromeos1-row3-rack5-host4.stderr
veyron_fievel-chromeos4-row8-rack12-host14.stderr
veyron_mickey-chromeos1-row4-rack9-host2.stderr
veyron_rialto-chromeos2-row1-rack10-host3.stderr
veyron_tiger-chromeos6-row4-rack8-host1.stderr
whirlwind-chromeos6-row22-jetstream-host4.stderr
zako-chromeos1-row4-rack9-host5.stderr
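
For reference, the numbers above can be tallied with a small script. This is a rough sketch: `board_of` and `failure_rate` are hypothetical helper names, and it assumes every file is named `<board>-<host>.stderr` with no hyphen inside the board name, which holds for the list above.

```python
def board_of(stderr_name):
    """Return the board part of a '<board>-<host>.stderr' file name.

    Board names in the list use underscores, not hyphens, so everything
    before the first hyphen is taken as the board.
    """
    return stderr_name.split("-", 1)[0]


def failure_rate(failed, total):
    """Fraction of boards on which the probe command failed."""
    return failed / total


# A few entries from the list above, used as sample input.
sample = [
    "daisy_skate-chromeos4-row9-rack6-host13.stderr",
    "polaris-autotest8-6cd51748.vrlab.stderr",
    "whirlwind-chromeos6-row22-jetstream-host4.stderr",
]
boards = [board_of(name) for name in sample]

# 27 failures out of 92 boards tested, as reported in the comment.
rate_percent = round(failure_rate(27, 92) * 100, 1)
```

With 27 failures out of 92 boards, this comes to roughly 29% of the boards tested.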

Attachment: problematic.ec (7.5 KB)
Cc: hungte@chromium.org