samus-release: key_verify failed for server_host_key
Issue description
The samus-release builder has been consistently failing. One example of a failing build: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8939220612054347072

It's a little difficult to find the actual failure, but one message I've seen consistently in the errors of the last three builds is:
key_verify failed for server_host_key

One set of potentially interesting-looking errors is here: https://00e9e64bacac34fc7dd480a8aa424978171798c51996e4765c-apidata.googleusercontent.com/download/storage/v1/b/chromeos-autotest-results/o/223526513-chromeos-test%2Fchromeos4-row12-rack6-host3%2Fdebug%2Fautoserv.ERROR?qk=AD5uMEsNr_GwWJU3Utx6zyCZcCjMQEdUyGCNmHOTLDcQZlrLc2JAogjgcqJ78QvMxecEl0DCMFGGV3vKuYwiyRK59hBCaM-r53c_UmEK_w10x7FNwYM9UW2aETUDdb-ogL6B89b319F_kmdB4i6E7HJBztsF8i7Lj0fZtlc9IOL8QTMRo5SyXXtyBY3GVAsAdUtLGkqbFSHnoJbgXe1A5ehWV4XleBF5oVfMsG26nebBzZ1ofUr9A58wCeYJDwdR4VgdLPRlMYeDpP9ub8kHW1N4bXASdWtXPKHaGVDZ6GAjm685YCJcJxshnytmFGB21nYEpuANfw0EZEH8fCPFSyzQJjcc4I8qXE7jA9g6K0Etpgp3d6kRI2Ob8NxazNKPRuKu57u3Uj_NEIz4Q19fqu7lb0DTb_vc8gywkNmgA8bKfONVJH70gYYTnME5OCGXrrElILnyGPXhGUl6PbQXeeh5Y9ZfgQV7VmfkCSpVi9_BlrqpTDYDCnjnPq39Yu6kXm64MSc9xatD9vDquNWIKUv4MPI4Y5gXySn5jjfNV6JQ-P_1tn2eFHDhMNbmG9skyXjSQ-dq_v67KOxhzfXd93u_DN_a4DDLhOK0uhjeMwe9L3ihEbyRCkyVS-WO8dcYtksg5cvR5wCgKREH1LOkpocokKHuv4J_OMq9GFReV4Gxbt-FBK8XEmUbLB74JMQi25ZkHeFglSajtNjwhgHwMYNMDO9d6Gt811A5dObv1MVUTrdyra7g63kyuRq4C7E7qzIHwRvzlisfPY_MkH6gZ9eTJ0drfLE0hqpXydIMqM4NcK77aU36yfES7yVbOAZXvkhM2sTJ_Z1c_q9y07oOa5AcHFV7wrmseQ

In my searches I only see this message produced as a result of SSH. Is it possible that the DUT SSHes out at times during the build, and it's got some sort of out-of-date set of keys for the hosts it connects to?
Aug 6
This seems likely to be a result of a stale testing_rsa, which is updated when you use it. This would be consistent with some but not all people having it work immediately. Testing locally, I don't have a file in the specified testing_rsa location, but it used another copy of the file I do have.
Aug 6
I don't see any invocation of ssh in the test being run there, though it is an autoupdate test, so I would not be surprised if there is one a couple more archaeological digs down from where I checked. I'll look into it.
Aug 6
jrbarnette@ suggests that this could be caused by a powerwash invalidating the credentials, because the DUT's identity changes.
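To make the hypothesis above concrete, here is a minimal sketch of how a changed DUT identity would look to a client that remembers host keys. It assumes a plain (unhashed) known_hosts-style record; the function names `fingerprint`, `find_host_keys`, and `host_key_status` are hypothetical helpers for illustration, not part of autotest or OpenSSH, and nothing in this thread confirms that this is the actual cause of the failure.

```python
import base64
import hashlib


def fingerprint(key_b64):
    """Return an OpenSSH-style SHA256 fingerprint for a base64 public key blob."""
    digest = hashlib.sha256(base64.b64decode(key_b64)).digest()
    # OpenSSH prints the base64 digest without '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def find_host_keys(known_hosts_text, host):
    """Yield (key_type, key_b64) for plain (unhashed) entries that name `host`."""
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Assumption: this sketch ignores @cert-authority / @revoked lines.
        if fields[0].startswith("@") or len(fields) < 3:
            continue
        names, key_type, key_b64 = fields[0], fields[1], fields[2]
        if host in names.split(","):
            yield key_type, key_b64


def host_key_status(known_hosts_text, host, key_type, presented_key_b64):
    """Classify the key a host presents as 'unknown', 'match', or 'mismatch'."""
    stored = [k for t, k in find_host_keys(known_hosts_text, host) if t == key_type]
    if not stored:
        return "unknown"
    return "match" if presented_key_b64 in stored else "mismatch"
```

If the DUT regenerated its host key after a powerwash, `host_key_status` would report 'mismatch' for the stored entry until that entry is refreshed.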
Aug 6
When Justin and I investigated, we both logged onto the same machine. The sequence was something like:
1) Justin fail
2) Justin succeed
3) Justin succeed
...a minute passes. Reboot is possible, but I don't think we rebooted.
4) Evan fail
5) Evan succeed
6) Evan succeed
7) Guenter succeed (I asked someone else to try it just to see if it was the first time for everyone).
Also, IIRC from looking at the build history, this happens at random/unpredictable points throughout the build.
Aug 7
The reboot would be affecting the machine sending the request (Justin/Evan/Guenter), not the one receiving it. However, this problem seems to be much newer than the changes that introduced powerwashing to this test: those were added in May, and this problem has only surfaced in July and later.
Aug 7
My understanding is that this blocks release of new versions for samus. Is that correct? If so, the priority bump and Chase seem appropriate.
Aug 13
Seems confined to certain DUTs or a certain time window; removing from chase queue.
Aug 28
This failure blocked at least one PFQ run: https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/suiteDetails?suiteId=231387884 Quite a few tests appear to have failed in the past due to this, but all of them seem to be from the same single DUT, "chromeos4-row12-rack6-host3": https://stainless.corp.google.com/search?view=list&first_date=2018-08-14&last_date=2018-08-28&suite=bvt&board=samus&status=GOOD&status=WARN&status=FAIL&status=ERROR&reason=AutoservRunError%3A+command+execution+error&exclude_cts=false&exclude_not_run=false&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false (click Columns => Host)
Dec 12
Comment 1 by evgreen@chromium.org, Aug 3