autoupdate_EndToEndTest.paygen_au_dev_full flaky on multiple canary builds
Issue description

Builders failed on:

https://luci-milo.appspot.com/buildbot/chromeos/veyron_mickey-release/1466
autoupdate_EndToEndTest.paygen_au_dev_full FAIL: The update appears to have completed successfully but we found a problem while verifying the hostlog of events returned from the update. Some attributes reported for the initial update check event are not what we expected: ['version']. The expected version is (9887.0.0) but reported version was (9914.0.0). The source payload we installed was probably incorrect or corrupt. Check the full hostlog for this update in the devserver_hostlog_rootfs file in the autoupdate_logs directory.

https://luci-milo.appspot.com/buildbot/chromeos/monroe-release/2086
autoupdate_EndToEndTest.paygen_au_dev_delta FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row13-rack10-host4: RootfsUpdateError: Failed to perform rootfs update: RootfsUpdateError('Update failed with unexpected update status: UPDATE_STATUS_IDLE',)

https://uberchromegw.corp.google.com/i/chromeos/builders/celes-release/builds/1465/steps/PaygenTestCanary/logs/stdio
15:28:46: ERROR: pre-kill notification (SIGXCPU); traceback:
15:29:17: INFO: Translating result (15, 'Received signal 15; shutting down') to fail.

This might be different from Issue 730141 as the devices are not in the list on the description of Issue 730141.
,
Sep 12 2017
,
Sep 12 2017
,
Sep 12 2017
xixuan@, can you take a look? It seems provisioning failed to deploy the correct version.
,
Sep 12 2017
@david, could you check this paygen_au_dev_full failure? Something seems to be wrong with verifying the hostlog.
,
Sep 13 2017
This is causing Simple Chrome compile failures since there is not a recent LATEST file for anything > 9914.0.0 and there was a breaking GN configuration change post 9914.0.0. Given the importance placed on eve, this is blocking a lot of devs. Escalating priority on this.
,
Sep 13 2017
There are a few different issues in this bug. Regarding this failure:

FAIL: The update appears to have completed successfully but we found a problem while verifying the hostlog of events returned from the update. Some attributes reported for the initial update check event are not what we expected: ['version']. The expected version is (9887.0.0) but reported version was (9914.0.0). The source payload we installed was probably incorrect or corrupt. Check the full hostlog for this update in the devserver_hostlog_rootfs file in the autoupdate_logs directory.

This can happen when we update rootfs from X -> Y and then applying stateful fails. We retry the update to Y, but since the rootfs was already updated successfully, the retry updates from Y -> Y. We return the hostlog from this second update, whose initial update check reports Y rather than X, which is not what the test expects.

I have updated the error message for when this happens in the refactor CL: https://chromium-review.googlesource.com/#/c/chromiumos/third_party/autotest/+/654064/
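To make the X -> Y, then Y -> Y sequence concrete, here is a minimal Python sketch of it. This is a simulation only, not the autotest or devserver code: Device, update_once, update_with_retry and verify_hostlog are hypothetical names, and the hostlog is reduced to a single 'version' field. The version numbers are the ones from the error message above.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical stand-in for a DUT; tracks only the rootfs version."""
    rootfs_version: str


def update_once(device, target, stateful_ok):
    """Simulate one update attempt (assumed behaviour, not the real code).

    The hostlog records the version the device reports at the initial
    update check, i.e. the version it is running before this attempt.
    The rootfs update always succeeds here; only stateful may fail.
    """
    hostlog = {"version": device.rootfs_version}
    device.rootfs_version = target
    return hostlog, stateful_ok


def update_with_retry(device, target, stateful_results):
    """Retry until stateful succeeds; return the hostlog of the last attempt."""
    hostlog = None
    for stateful_ok in stateful_results:
        hostlog, ok = update_once(device, target, stateful_ok)
        if ok:
            break
    return hostlog


def verify_hostlog(hostlog, expected_source):
    """The test expects the initial update check to report the source version."""
    return hostlog["version"] == expected_source


# First attempt: rootfs goes 9887.0.0 -> 9914.0.0, stateful fails.
# Second attempt: the device is already on 9914.0.0, so it is Y -> Y.
device = Device("9887.0.0")
hostlog = update_with_retry(device, "9914.0.0", [False, True])
print(hostlog["version"])                   # 9914.0.0
print(verify_hostlog(hostlog, "9887.0.0"))  # False
```

In this sketch the check fails even though the device ends up on the right version, because only the hostlog from the retry is returned.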
,
Sep 13 2017
For eve, though, the most common failure reason for paygen_au_canary and paygen_au_dev lately is this one:

Unhandled DevServerException: CrOS auto-update failed for host chromeos2-row4-rack1-host10: RootfsUpdateError: After update and reboot, update-engine failed to call chromeos-setgoodkernel within 120 seconds

This can happen if an update failed or if the test failed to apply the source image. Checking the logs for eve...
,
Sep 13 2017
And this failure reason too:

Unhandled DevServerException: CrOS auto-update failed for host <DUT HOSTNAME>: SSHConnectionError: ssh: connect to host <DUT HOSTNAME> port 22: Connection timed out

Both of these failures occur because we try to apply the source image in the autoupdate test and the device doesn't come back to life after we apply stateful. So it times out either on ssh or on waiting for chromeos-setgoodkernel, whichever comes first.

So it seems like the test is behaving correctly and there has been a problem with the eve images and/or stateful lately. Not sure who to assign this to. ahassani@, do you know if there were any changes related to eve here?
,
Sep 13 2017
Eve has been flaky since Aug 11. Looks like there was a change to the eve touch firmware in the first failed build: https://crosland.corp.google.com/log/9831.0.0..9832.0.0 Could this be a culprit, jkwang@?
,
Sep 13 2017
I cannot think of any reason why the firmware update would trigger this.
,
Sep 13 2017
Look at crbug.com/761259.
,
Sep 15 2017
One failure is fixed by #7 and the other is fixed by #12, so closing this.
,
Sep 19 2017
Comment 1 by seobrien@chromium.org, Sep 12 2017
Labels: Pri-1