New issue
Advanced search Search tips

Issue 807322 link

Starred by 0 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

peppy-chrome-pfq:4822 failed HWTest [sanity] provision FAIL: Unhandled DevServerException

Project Member Reported by xiy...@chromium.org, Jan 30 2018

Issue description

peppy-chrome-pfq:4822 failed

Builders failed on: 
- peppy-chrome-pfq: 
  https://luci-milo.appspot.com/buildbot/chromeos/peppy-chrome-pfq/4822

Autotest instance created: cautotest-prod
01-30-2018 [03:30:43] Created suite job: http://cautotest-prod/afe/#tab_id=view_job&object_id=173341199
The suite job has another 2:29:52.036471 till timeout.
01-30-2018 [04:10:00] Suite job is finished.
01-30-2018 [04:10:00] Start collecting test results and dump them to json.
Suite job   [ PASSED ]
provision   [ FAILED ]
provision     FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row6-rack12-host5: 0) cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmp8LavtS/testing_rsa', 'root@chromeos4-row6-rack12-host5', '--', 'sh', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.rgYvGnRWwI/stateful_update', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.rgYvGnRWwI/stateful.tgz', '--stateful_change=clean'], extra env={'LC_MESSAGES': 'C'}, 1) cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmp5dVwRT/testing_rsa', 'root@100.115.197.145', '--', 'sh', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.C5GPTi5qi7/stateful_update', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.C5GPTi5qi7/stateful.tgz', '--stateful_change=clean'], extra env={'LC_MESSAGES': 'C'}, 
Suite timings:
Downloads started at 2018-01-30 03:30:32
Payload downloads ended at 2018-01-30 03:30:41
Suite started at 2018-01-30 03:30:48
Artifact downloads ended (at latest) at 2018-01-30 03:30:54
Testing started at 2018-01-30 03:32:04
Testing ended at 2018-01-30 04:06:51


 

Comment 1 by xiy...@chromium.org, Jan 30 2018

Status: WontFix (was: Untriaged)
Looks like the error is transient. It is gone now.

Comment 2 by xiy...@chromium.org, Jan 31 2018

Status: Untriaged (was: WontFix)
Re-open. Happened again.

https://luci-milo.appspot.com/buildbot/chromeos/peppy-chrome-pfq/4824
https://luci-milo.appspot.com/buildbot/chromeos/peppy-chrome-pfq/4826
https://luci-milo.appspot.com/buildbot/chromeos/peppy-chrome-pfq/4827
https://luci-milo.appspot.com/buildbot/chromeos/peppy-chrome-pfq/4828

From build #4828:

In status.log:
====
	START	provision_AutoUpdate	provision_AutoUpdate	timestamp=1517400270	localtime=Jan 31 04:04:30	
		START	----	----	timestamp=1517400281	localtime=Jan 31 04:04:41	
			GOOD	----	sysinfo.before	timestamp=1517400281	localtime=Jan 31 04:04:41	
		END GOOD	----	----	timestamp=1517400281	localtime=Jan 31 04:04:41	
		FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1517402414	localtime=Jan 31 04:40:14	Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row6-rack12-host5: 0) cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmpfqzX3K/testing_rsa', 'root@chromeos4-row6-rack12-host5', '--', 'sh', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful.tgz', '--stateful_change=clean'], extra env={'LC_MESSAGES': 'C'}, 1) cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmpO9UcE8/testing_rsa', 'root@100.115.197.145', '--', 'sh', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.Aoc4JB3hSL/stateful_update', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.Aoc4JB3hSL/stateful.tgz', '--stateful_change=clean'], extra env={'LC_MESSAGES': 'C'}, 
  Traceback (most recent call last):
    File "/usr/local/autotest/client/common_lib/test.py", line 831, in _call_test_function
      return func(*args, **dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 495, in execute
      dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 362, in _call_run_once_with_retry
      postprocess_profiled_run, args, dargs)
    File "/usr/local/autotest/client/common_lib/test.py", line 400, in _call_run_once
      self.run_once(*args, **dargs)
    File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 127, in run_once
      with_cheets=with_cheets)
    File "/usr/local/autotest/server/afe_utils.py", line 124, in machine_install_and_update_labels
      *args, **dargs)
    File "/usr/local/autotest/server/hosts/cros_host.py", line 829, in machine_install_by_devserver
      quick_provision=quick_provision)
    File "/usr/local/autotest/client/common_lib/cros/dev_server.py", line 2494, in auto_update
      error_msg % (host_name, real_error))
  DevServerException: CrOS auto-update failed for host chromeos4-row6-rack12-host5: 0) cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmpfqzX3K/testing_rsa', 'root@chromeos4-row6-rack12-host5', '--', 'sh', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful.tgz', '--stateful_change=clean'], extra env={'LC_MESSAGES': 'C'}, 1) cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmpO9UcE8/testing_rsa', 'root@100.115.197.145', '--', 'sh', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.Aoc4JB3hSL/stateful_update', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.Aoc4JB3hSL/stateful.tgz', '--stateful_change=clean'], extra env={'LC_MESSAGES': 'C'}, 
	END FAIL	provision_AutoUpdate	provision_AutoUpdate	timestamp=1517402414	localtime=Jan 31 04:40:14	

In provision_AutoUpdate debug log:
====
01/31 04:22:40.681 DEBUG|        dev_server:2436| Exception raised on auto_update attempt #1:
 Traceback (most recent call last):
  File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 345, in TriggerAU
    self._StatefulUpdate(chromeos_AU)
  File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 171, in _StatefulUpdate
    cros_updater.UpdateStateful()
  File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 796, in UpdateStateful
    raise StatefulUpdateError(error_msg % e)
StatefulUpdateError: Failed to perform stateful partition update: return code: 1; command: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpfqzX3K/testing_rsa root@chromeos4-row6-rack12-host5 -- sh /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful.tgz '--stateful_change=clean'
cmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmpfqzX3K/testing_rsa', 'root@chromeos4-row6-rack12-host5', '--', 'sh', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful.tgz', '--stateful_change=clean'], extra env={'LC_MESSAGES': 'C'}
01/31 04:22:40.681 DEBUG|        dev_server:2442| Please see error details in log /usr/local/autotest/results/hosts/chromeos4-row6-rack12-host5/765755-provision/20183101040411/autoupdate_logs/CrOS_update_chromeos4-row6-rack12-host5_120497.log

In CrOS_update_chromeos4-row6-rack12-host5_120497.log mentioned. The rootfs update seems done but stateful reset failed. The last piece in the log reads:
====
2018/01/31 04:20:28.406 DEBUG|      auto_updater:0641| Resetting stateful partition...
2018/01/31 04:20:28.407 DEBUG|    cros_build_lib:0593| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpfqzX3K/testing_rsa root@chromeos4-row6-rack12-host5 -- sh /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update '--stateful_change=reset'
Warning: Permanently added 'chromeos4-row6-rack12-host5,100.115.197.145' (ED25519) to the list of known hosts.
Warning: Permanently added 'chromeos4-row6-rack12-host5,100.115.197.145' (ED25519) to the list of known hosts.
Resetting stateful update state.
2018/01/31 04:20:28.815 INFO |      auto_updater:0789| Updating stateful partition with clobber enabled...
2018/01/31 04:20:28.816 DEBUG|    cros_build_lib:0593| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpfqzX3K/testing_rsa root@chromeos4-row6-rack12-host5 -- sh /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful.tgz '--stateful_change=clean'
Warning: Permanently added 'chromeos4-row6-rack12-host5,100.115.197.145' (ED25519) to the list of known hosts.
Warning: Permanently added 'chromeos4-row6-rack12-host5,100.115.197.145' (ED25519) to the list of known hosts.
Reading local payload /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful.tgz
Successfully retrieved update.
Missing var or dev_image in stateful payload.
2018/01/31 04:22:29.097 ERROR|      auto_updater:0793| Stateful update failed.
2018/01/31 04:22:29.097 DEBUG|      auto_updater:0641| Resetting stateful partition...
2018/01/31 04:22:29.098 DEBUG|    cros_build_lib:0593| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpfqzX3K/testing_rsa root@chromeos4-row6-rack12-host5 -- sh /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update '--stateful_change=reset'
Warning: Permanently added 'chromeos4-row6-rack12-host5,100.115.197.145' (ED25519) to the list of known hosts.
Warning: Permanently added 'chromeos4-row6-rack12-host5,100.115.197.145' (ED25519) to the list of known hosts.
Resetting stateful update state.
2018/01/31 04:22:32.190 DEBUG|    cros_build_lib:0593| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpfqzX3K/testing_rsa root@chromeos4-row6-rack12-host5 -- rm -rf /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl
2018/01/31 04:22:33.498 DEBUG|       cros_update:0351| Error happens in CrOS auto-update: StatefulUpdateError("Failed to perform stateful partition update: return code: 1; command: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpfqzX3K/testing_rsa root@chromeos4-row6-rack12-host5 -- sh /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful.tgz '--stateful_change=clean'\ncmd=['ssh', '-p', '22', '-oConnectionAttempts=4', '-oUserKnownHostsFile=/dev/null', '-oProtocol=2', '-oConnectTimeout=30', '-oServerAliveCountMax=3', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=10', '-oNumberOfPasswordPrompts=0', '-oIdentitiesOnly=yes', '-i', '/tmp/ssh-tmpfqzX3K/testing_rsa', 'root@chromeos4-row6-rack12-host5', '--', 'sh', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful_update', '/mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.tRLOOWA6Fl/stateful.tgz', '--stateful_change=clean'], extra env={'LC_MESSAGES': 'C'}",)

jkop@, could you help to triage?

Comment 3 by xiy...@chromium.org, Jan 31 2018

Cc: newcomer@chromium.org
Labels: OS-Chrome

Comment 4 by xiy...@chromium.org, Jan 31 2018

Cc: -jkop@chromium.org
Owner: jkop@chromium.org
Status: Assigned (was: Untriaged)

Comment 6 by jkop@chromium.org, Feb 5 2018

Owner: xixuan@chromium.org
Status: WontFix (was: Assigned)
Something wrong with DUT chromeos4-row6-rack12-host5 and it should be sent to repair.

However, currently this DUT is already in repair_failed state, which means it won't be used in hwtest. So 'wontfix' this bug for now.

If it happened again with another DUT but the same error, please open it again.

Comment 8 by glevin@chromium.org, Mar 13 2018

Cc: glevin@chromium.org wzang@chromium.org
Status: Available (was: WontFix)
Happened again, same device:
https://luci-milo.appspot.com/buildbot/chromeos/peppy-chrome-pfq/5141
Labels: Hotlist-Deputy
Owner: ayatane@chromium.org
Status: Assigned (was: Available)
https://b.corp.google.com/issues/74773419

Will get it repaired and see how it works.  The DUT history looks suspicious though, maybe the DUT HDD/SSD is dead, then it should fail repair.
Status: Fixed (was: Assigned)

Sign in to add a comment