HWTest [provision] fails after provision_AutoUpdate with "The host has wrong cros-version label." |
|||||
Issue descriptionMany informational PFQ builders started failing in the HWTest [provision] stage with "[Test-Logs]: provision: FAIL: The host has wrong cros-version label., completed successfully", starting at these builds: http://uberchromegw/i/chromeos.chrome/builders/tricky-tot-chrome-pfq-informational/builds/7737 http://uberchromegw/i/chromeos.chrome/builders/peach_pit-tot-chrome-pfq-informational/builds/9377 http://uberchromegw/i/chromeos.chrome/builders/caroline-tot-chrome-pfq-informational/builds/791 http://uberchromegw/i/chromeos.chrome/builders/veyron_minnie-tot-chrome-pfq-informational/builds/7096 The last successful run I see was at Jan 18 22:59 and the first failure was at Jan 19 00:50. Here's a failing status: START ---- provision timestamp=1516356187 localtime=Jan 19 02:03:07 GOOD ---- verify.servo_ssh timestamp=1516356191 localtime=Jan 19 02:03:11 GOOD ---- verify.update timestamp=1516356195 localtime=Jan 19 02:03:15 GOOD ---- verify.brd_config timestamp=1516356195 localtime=Jan 19 02:03:15 GOOD ---- verify.ser_config timestamp=1516356195 localtime=Jan 19 02:03:15 GOOD ---- verify.job timestamp=1516356196 localtime=Jan 19 02:03:16 GOOD ---- verify.servod timestamp=1516356202 localtime=Jan 19 02:03:22 GOOD ---- verify.pwr_button timestamp=1516356202 localtime=Jan 19 02:03:22 GOOD ---- verify.lid_open timestamp=1516356203 localtime=Jan 19 02:03:23 GOOD ---- verify.PASS timestamp=1516356203 localtime=Jan 19 02:03:23 START provision_AutoUpdate provision_AutoUpdate timestamp=1516356203 localtime=Jan 19 02:03:23 START ---- ---- timestamp=1516356212 localtime=Jan 19 02:03:32 GOOD ---- sysinfo.before timestamp=1516356212 localtime=Jan 19 02:03:32 END GOOD ---- ---- timestamp=1516356212 localtime=Jan 19 02:03:32 GOOD provision_AutoUpdate provision_AutoUpdate timestamp=1516356609 localtime=Jan 19 02:10:09 completed successfully END GOOD provision_AutoUpdate provision_AutoUpdate timestamp=1516356609 localtime=Jan 19 02:10:09 GOOD ---- verify.ssh timestamp=1516356609 localtime=Jan 19 02:10:09 GOOD ---- verify.devmode timestamp=1516356609 localtime=Jan 19 02:10:09 GOOD ---- verify.hwid timestamp=1516356610 localtime=Jan 19 02:10:10 GOOD ---- verify.power timestamp=1516356611 localtime=Jan 19 02:10:11 GOOD ---- verify.ext4 timestamp=1516356611 localtime=Jan 19 02:10:11 GOOD ---- verify.writable timestamp=1516356611 localtime=Jan 19 02:10:11 GOOD ---- verify.tpm timestamp=1516356612 localtime=Jan 19 02:10:12 GOOD ---- verify.good_au timestamp=1516356612 localtime=Jan 19 02:10:12 GOOD ---- verify.fwstatus timestamp=1516356612 localtime=Jan 19 02:10:12 GOOD ---- verify.rwfw timestamp=1516356613 localtime=Jan 19 02:10:13 GOOD ---- verify.python timestamp=1516356613 localtime=Jan 19 02:10:13 FAIL ---- verify.cros timestamp=1516356616 localtime=Jan 19 02:10:16 The host has wrong cros-version label. END FAIL ---- provision timestamp=1516356616 localtime=Jan 19 02:10:16 INFO ---- ---- timestamp=1516356616 job_abort_reason= localtime=Jan 19 02:10:16 ---- Here's what I see at the end of the test logs for https://uberchromegw.corp.google.com/i/chromeos.chrome/builders/tricky-tot-chrome-pfq-informational/builds/7737: ... 01/19 02:04:20.319 INFO | dev_server:1494| Name: /home/chromeos-test/images/tricky-tot-chrome-pfq-informational/R65-10321.0.0-c2223837/.test_suitesDOTtarDOTbz2 Accessed: 2018-01-19 02:01:54.592407 Modified: 2018-01-19 02:01:49.784417 Size: 24964 bytes. 01/19 02:04:21.288 INFO | dev_server:0707| Getting devservers for host: chromeos4-row2-rack3-host15 01/19 02:04:25.102 INFO | dev_server:1856| Received response from devserver for cros_au call: '[true, 163027]' 01/19 02:10:04.870 INFO | afe_store:0099| adding labels: ['cros-version:tricky-tot-chrome-pfq-informational/R65-10321.0.0-c2223837'] ---- dshi@, you wrote this code (in 2015), right? Do you have any ideas about what's going wrong? Issue 540757 and issue 607601 are about the same error message.
,
Jan 19 2018
Created a revert at https://crrev.com/c/876243.
,
Jan 19 2018
Note: I'm confident that my CL caused this. And I now understand why the CQ let this through. It generates build versions differently. I don't yet understand WHY the lab breaks with the new format of string, but that's a different issue.
,
Jan 22 2018
Is it possible to add a test to the CQ that would've caught this?
,
Jan 22 2018
If we clean up/simplify how labels are generated/used then yes. As it is, no. There are way, way to many special cases. I have improved the relevant lab unittest with my proposed quick lab fix, but it will still leave things pretty fragile.
,
Jan 23 2018
Richard's "right fix" proposal (see crbug.com/800562 ) would remove many special cases, and so this would then have been caught in the CQ's hwtests. This outage is fixed, so closing and moving work back to the before mentioned bug.
,
Feb 15 2018
Reopen this issue. We have the same issue in TreeHugger presubmit. b/73367536 phobbs@ thinks this may be the same. Could you please take a look? For example: https://android-build.googleplex.com/builds/where-is-my-cl/googleplex-android-review/3609503?revision=6&revisionAttempt=1 F provision.provision 13:02.000 The host has wrong cros-version label., completed successfully, servod not running on chromeos6-row4-rack5-labstation1 port 9993 F provision.provision 14:25.000 Servo initialize timed out., The host has wrong cros-version label., completed successfully F provision.provision 14:54.000 The host has wrong cros-version label., completed successfully, servod not running on chromeos6-row2-rack24-labstation1 port 9994 https://atp.googleplex.com/test_runs/15167185
,
Feb 15 2018
My changes have been relanded, so probably. We need someone with expertise in the autotest adb_host to help resolve.
,
Feb 15 2018
Well.... this is the CL that is probably related. https://crrev.com/c/907326
,
Feb 15 2018
That CL is the cause of the difficulty, but the new arc-presubmit failures aren't this bug. I've opened bug 812467 ; we'll put the discussion there. |
|||||
►
Sign in to add a comment |
|||||
Comment 1 by derat@chromium.org
, Jan 19 2018Owner: dgarr...@chromium.org