Issue 653363

Starred by 1 user

Issue metadata

Status: Archived
Owner: ----
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



provision failure | RootfsUpdateError('After update and reboot, update-engine failed to call chromeos-setgoodkernel within 120 seconds',)

Project Member Reported by akes...@chromium.org, Oct 6 2016

Issue description

Build: https://uberchromegw.corp.google.com/i/chromeos/builders/cyan-paladin/builds/566

Failure log: https://storage.cloud.google.com/chromeos-autotest-results/79449667-chromeos-test/chromeos4-row6-rack9-host12/sysinfo/CrOS_update_100.115.197.86_9940.log?_ga=1.125721737.26102705.1449633781

2016/10/05 13:52:28.238 DEBUG|    cros_build_lib:0565| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpmqSgQN/testing_rsa root@100.115.197.86 -- status system-services
2016/10/05 13:52:28.555 DEBUG|    cros_build_lib:0614| (stdout):
system-services start/running

2016/10/05 13:52:28.556 DEBUG|    cros_build_lib:0616| (stderr):
Warning: Permanently added '100.115.197.86' (RSA) to the list of known hosts.

2016/10/05 13:52:28.556 DEBUG|      auto_updater:0775| System services_status: 'system-services start/running\n'
2016/10/05 13:52:28.557 DEBUG|    cros_build_lib:0565| RunCommand: ssh -p 22 '-oConnectionAttempts=4' '-oUserKnownHostsFile=/dev/null' '-oProtocol=2' '-oConnectTimeout=30' '-oServerAliveCountMax=3' '-oStrictHostKeyChecking=no' '-oServerAliveInterval=10' '-oNumberOfPasswordPrompts=0' '-oIdentitiesOnly=yes' -i /tmp/ssh-tmpmqSgQN/testing_rsa root@100.115.197.86 -- rm -rf /mnt/stateful_partition/unencrypted/preserve/cros-update/tmp.clPJnBka3u
2016/10/05 13:52:28.910 DEBUG|       cros_update:0224| Error happens in CrOS auto-update: RootfsUpdateError('After update and reboot, update-engine failed to call chromeos-setgoodkernel within 120 seconds',)



(Note: the same provision job also failed on a different AU attempt, for a slightly different reason.)
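
For readers unfamiliar with this error, the message describes a deadline-bounded wait: after the reboot, something must observe the setgoodkernel call within 120 seconds or the update is treated as failed. A minimal sketch of that pattern follows. It is not the actual auto_updater code; `wait_until`, its parameters, and the local `RootfsUpdateError` class are stand-ins invented for illustration, and the 5-second polling interval is an assumption.

```python
import time


class RootfsUpdateError(Exception):
    """Stand-in for the error named in the log; not the real class."""


def wait_until(check, timeout_s=120, interval_s=5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout_s seconds elapse.

    check is a hypothetical callable that reports whether the DUT has
    marked its kernel good. clock and sleep are injectable so the loop
    can be exercised without real delays.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return
        if clock() >= deadline:
            raise RootfsUpdateError(
                'kernel was not marked good within %d seconds' % timeout_s)
        sleep(interval_s)
```

Checking the condition before comparing against the deadline means a DUT that becomes good exactly at the deadline still passes.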
 
Cc: jrbarnette@chromium.org
Strangely, the follow-up repair job didn't seem to actually take a repair action. Am I reading that wrong?

https://storage.cloud.google.com/chromeos-autotest-results/hosts/chromeos4-row6-rack9-host12/216058-repair/20160510135454/status.log?_ga=1.67123218.26102705.1449633781

START	----	repair	timestamp=1475700902	localtime=Oct 05 13:55:02	
	GOOD	----	verify.ssh	timestamp=1475700903	localtime=Oct 05 13:55:03	
	GOOD	----	verify.brd_config	timestamp=1475700904	localtime=Oct 05 13:55:04	
	GOOD	----	verify.ser_config	timestamp=1475700904	localtime=Oct 05 13:55:04	
	GOOD	----	verify.job	timestamp=1475700904	localtime=Oct 05 13:55:04	
	GOOD	----	verify.servod	timestamp=1475700908	localtime=Oct 05 13:55:08	
	GOOD	----	verify.pwr_button	timestamp=1475700908	localtime=Oct 05 13:55:08	
	GOOD	----	verify.lid_open	timestamp=1475700909	localtime=Oct 05 13:55:09	
	GOOD	----	verify.update	timestamp=1475700915	localtime=Oct 05 13:55:15	
	GOOD	----	verify.PASS	timestamp=1475700915	localtime=Oct 05 13:55:15	
	GOOD	----	verify.ssh	timestamp=1475700924	localtime=Oct 05 13:55:24	
	GOOD	----	verify.power	timestamp=1475700925	localtime=Oct 05 13:55:25	
	GOOD	----	verify.cros	timestamp=1475700929	localtime=Oct 05 13:55:29	
	GOOD	----	verify.good_au	timestamp=1475700929	localtime=Oct 05 13:55:29	
	GOOD	----	verify.writable	timestamp=1475700930	localtime=Oct 05 13:55:30	
	GOOD	----	verify.tpm	timestamp=1475700930	localtime=Oct 05 13:55:30	
	GOOD	----	verify.python	timestamp=1475700930	localtime=Oct 05 13:55:30	
	GOOD	----	verify.PASS	timestamp=1475700930	localtime=Oct 05 13:55:30	
	INFO	----	repair	timestamp=1475700930	localtime=Oct 05 13:55:30	Can't repair label 'cros-version:cyan-paladin/R55-8866.0.0-rc1'.
	INFO	----	repair	timestamp=1475700930	localtime=Oct 05 13:55:30	Can't repair label 'pool:cq'.
	INFO	----	repair	timestamp=1475700930	localtime=Oct 05 13:55:30	Can't repair label 'arc'.
	INFO	----	repair	timestamp=1475700930	localtime=Oct 05 13:55:30	Can't repair label 'board:cyan'.
END GOOD	----	repair	timestamp=1475700930	localtime=Oct 05 13:55:30	chromeos4-row6-rack9-host12 repaired successfully
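
One way to check that reading is to tally the log entries by operation prefix. This is a minimal sketch, assuming the tab-separated layout shown in the log above, and assuming that repair actions, when they run, are recorded with a `repair.` prefix the same way verifiers use `verify.`. `summarize_status_log` is a hypothetical helper, not part of autotest, and the sample is a shortened copy of the log above.

```python
from collections import Counter


def summarize_status_log(text):
    """Count result entries in a status log by operation prefix.

    START, INFO and END lines are skipped, so only entries that report
    the outcome of a step (for example GOOD verify.ssh) are counted.
    """
    counts = Counter()
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) < 3:
            continue
        status, _subdir, operation = fields[:3]
        if status in ("START", "INFO") or status.startswith("END"):
            continue
        # "verify.ssh" -> "verify"; a repair action would yield "repair".
        counts[operation.split(".", 1)[0]] += 1
    return counts


sample = (
    "START\t----\trepair\ttimestamp=1475700902\tlocaltime=Oct 05 13:55:02\t\n"
    "\tGOOD\t----\tverify.ssh\ttimestamp=1475700903\tlocaltime=Oct 05 13:55:03\t\n"
    "\tGOOD\t----\tverify.good_au\ttimestamp=1475700929\tlocaltime=Oct 05 13:55:29\t\n"
    "\tINFO\t----\trepair\ttimestamp=1475700930\tlocaltime=Oct 05 13:55:30\t"
    "Can't repair label 'arc'.\n"
    "END GOOD\t----\trepair\ttimestamp=1475700930\tlocaltime=Oct 05 13:55:30\t"
    "chromeos4-row6-rack9-host12 repaired successfully\n"
)

counts = summarize_status_log(sample)
print("verify entries:", counts["verify"])
print("repair entries:", counts["repair"])
```

Under those assumptions, a log in which every counted entry is a `verify` step means all verifiers passed and no repair action was triggered.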


> Strangely, the follow-up repair job didn't seem to
> actually take a repair action. Am I reading that wrong?

<sigh> We don't have a verifier that will detect this
particular failure.  A plausible fix is to add a check that
"the primary partition has been marked good by setgoodkernel".
We shouldn't be testing a DUT if that isn't true.
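
A rough sketch of what such a check could look like, not an actual autotest verifier. It assumes that chromeos-setgoodkernel leaves the active kernel partition with successful=1 and tries=0, that the DUT uses the standard layout in which each kernel partition is numbered one below its rootfs, and that `rootdev` and `cgpt show` are available on the DUT. `kernel_marked_good`, `run`, and `fake_dut` are hypothetical names for illustration.

```python
import re


def kernel_marked_good(run):
    """Return True if the active kernel partition looks marked good.

    run is a hypothetical callable: it takes a shell command string,
    executes it on the DUT, and returns the command's stdout.
    """
    root_part = run("rootdev -s").strip()
    disk = run("rootdev -s -d").strip()
    match = re.search(r"(\d+)$", root_part)
    if match is None:
        raise ValueError("cannot parse partition number from %r" % root_part)
    # Assumption: KERN-A=2/ROOT-A=3 and KERN-B=4/ROOT-B=5.
    kernel_index = int(match.group(1)) - 1

    def flag(option):
        out = run("cgpt show -n -i %d %s %s" % (kernel_index, option, disk))
        return int(out.strip())

    # Assumption: a good kernel has successful=1 and no tries remaining.
    return flag("-S") == 1 and flag("-T") == 0


def fake_dut(successful, tries):
    """Build a canned run() for trying the check without a real DUT."""
    outputs = {
        "rootdev -s": "/dev/mmcblk0p5\n",
        "rootdev -s -d": "/dev/mmcblk0\n",
        "cgpt show -n -i 4 -S /dev/mmcblk0": "%d\n" % successful,
        "cgpt show -n -i 4 -T /dev/mmcblk0": "%d\n" % tries,
    }
    return lambda command: outputs[command]
```

Passing the command runner in as a parameter keeps the check independent of how the lab reaches the DUT, and lets it be exercised with `fake_dut` first.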

In any event, is this bug about the provision failure, or
the suspicious repair?  Regarding the provision failure, it
would be caused by a bug in the build just installed.
Ok, I intended this to be more about the provision failure. Any idea what component of the just-installed build would be responsible?
I looked through the CLs, and I couldn't find any obvious culprit.
Whatever it was, it had to be code that runs at boot time.
Ok, so presumably it's an existing flake in ToT. Let's keep this open in case it happens again.

Comment 6 by autumn@chromium.org, Oct 11 2016

Labels: -current-issue
Status: Unconfirmed (was: Untriaged)

Comment 7 by sheriffbot@chromium.org, Oct 12 2017

Status: Archived (was: Unconfirmed)
Issue has not been modified or commented on in the last 365 days. Please re-open or file a new bug if this is still an issue.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot