
Issue 747085

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




paygen_au_canary_delta: RootfsUpdateError: After update and reboot, update-engine failed to call chromeos-setgoodkernel within 120 seconds

Project Member Reported by dhadd...@chromium.org, Jul 20 2017

Issue description

Affecting Lumpy only

 
We are still serving 9746.0.0 on canary for Lumpy:
https://cros-goldeneye.corp.google.com/chromeos/console/listOmaha

We have not been able to uprev because AU'ing from 9746.0.0 with delta payloads fails during the update every time:
https://wmatrix.googleplex.com/platform/paygen_au_canary?platforms=lumpy

Reproduced locally too


Status: Assigned (was: Started)
Summary: paygen_au_canary_delta: RootfsUpdateError: After update and reboot, update-engine failed to call chromeos-setgoodkernel within 120 seconds (was: paygen_au_canary: RootfsUpdateError: After update and reboot, update-engine failed to call chromeos-setgoodkernel within 120 seconds)
The DELTA update goes INCREDIBLY slooooooooooow and then fails. 

https://wmatrix.googleplex.com/failures/paygen_au_canary?platforms=lumpy&builds=R61-9763.0.0

Failed to perform rootfs update: TimeoutError('Timeout occurred- waited 1800.0 seconds.',)

Which is 30 minutes. I ran into that locally too.
It looks like the update succeeded, though.
I haven't run into the error in the bug title yet, but I suspect it is also due to a really slow update.
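For reference, the failure mode above (a wait that gives up after 1800 seconds even though the update eventually completes) can be sketched as a generic poll-with-deadline loop. This is only an illustration, not the autotest code: `wait_for`, its parameters, and the `condition` callable are hypothetical names, and the error text is merely modeled on the log line quoted above.

```python
import time


def wait_for(condition, timeout_s, interval_s=1.0):
    """Poll condition() until it returns True or timeout_s seconds elapse.

    Hypothetical helper for illustration; not the real test harness.
    Raises the built-in TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            # Message modeled on the log line in this bug.
            raise TimeoutError(
                f"Timeout occurred- waited {timeout_s} seconds.")
        time.sleep(interval_s)
```

With a loop like this, an update that is slow but still making progress is reported as a failure once the deadline passes, which would be consistent with the update appearing to succeed afterwards.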
Cc: ahass...@chromium.org
During the test, when we update to the source image with a full payload, it goes really fast and we never hit the retry necessary for calling update_engine_client --status. It only happens when we apply the delta update afterwards.

+ahassani 
Cc: keta...@chromium.org dhadd...@chromium.org josa...@chromium.org
Similarly, this failure is also about an update taking too long (though the error messaging is horrible and will be updated soon):

https://wmatrix.googleplex.com/failures/paygen_au_canary?platforms=lumpy&builds=R61-9756.0.0

So it seems that there is something going on with lumpy that makes it take way longer than any other board to apply a delta update.
Owner: ahass...@chromium.org
Assigning to ahassani since he is looking into it.
Cc: kathrelk...@chromium.org
+milestone owner
Cc: grundler@chromium.org
I'm looking at the lumpy paygen failures. As you mentioned, it takes a long time for the update to finish, but it does actually finish, after more than 20 minutes. We already fixed this problem in https://chromium-review.googlesource.com/c/567360/, which makes user-initiated/forced updates faster by not using O_DSYNC. It seems like paygen is still using a version older than 9756. Is there any way to force the AU test to use a newer version?

Adding grundler@ in case he has any opinion on this, since he originally added O_DSYNC.
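To illustrate why O_DSYNC can slow writes down so much: with that flag, each write() returns only after the data has reached the storage device, rather than returning once the data is in the page cache. The sketch below is not update_engine's code (that is C++); `timed_write` and its parameters are hypothetical, and it assumes a platform where Python exposes `os.O_DSYNC` (it falls back to no extra flag otherwise).

```python
import os
import time

# os.O_DSYNC is only defined on some Unix platforms; fall back to 0
# (no extra flag) elsewhere. This fallback is an assumption of the sketch.
O_DSYNC = getattr(os, "O_DSYNC", 0)


def timed_write(path, chunk, count, extra_flags=0):
    """Write `chunk` to `path` `count` times; return elapsed seconds.

    Hypothetical helper: pass extra_flags=O_DSYNC to make every write
    wait for the data to reach the device.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
    fd = os.open(path, flags, 0o600)
    start = time.monotonic()
    try:
        for _ in range(count):
            view = memoryview(chunk)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(fd, view)
                view = view[written:]
    finally:
        os.close(fd)
    return time.monotonic() - start
```

Comparing `timed_write(path, data, n)` against `timed_write(path, data, n, O_DSYNC)` on a real disk would typically show the synchronous variant taking longer, though the size of the gap depends on the device and filesystem.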
Owner: bhthompson@chromium.org
You're right. We are still serving 9746.0.0 on canary for lumpy:
https://cros-goldeneye.corp.google.com/chromeos/console/listOmaha

This hasn't updated because the test keeps failing. Chicken and egg.

This is similar to the Bob failure
https://bugs.chromium.org/p/chromium/issues/detail?id=746583 

Bernie, can you force Bob and lumpy to new canary builds in Omaha?
Status: Fixed (was: Assigned)
This should be live now, so I'm assuming this is fixed.
Status: Verified (was: Fixed)
