Issue 812467

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




TreeHugger presubmit failures - "The host has wrong cros-version label."

Reported by jrbarnette@chromium.org, Feb 15 2018

Issue description

Images sent to the arc-presubmit pool for testing are failing with
messages like this:
F
provision.provision     13:02.000
 The host has wrong cros-version label., completed successfully, servod not running on chromeos6-row4-rack5-labstation1 port 9993

Initial report of this problem is on b/73367536.

 

Comment 1 by khmel@chromium.org, Feb 15 2018

Cc: lgcheng@google.com
An initial example is here:
    https://atp.googleplex.com/test_runs/15167185

Logs there show that the ARC testing runs a command like this:
    /usr/local/autotest/site_utils/run_suite.py --pool arc-presubmit --timeout_mins 60 --suite_name arc-unit-test --board kevin --build kevin-release/R66-10403.0.0 --cheets_build git_nyc-mr1-arc/cheets_arm-user/P5373039

The key is that the build requested is "kevin-release/R66-10403.0.0"

Going next to the history of a kevin DUT that actually runs these
tests, you can see stuff like this:
    $ dut-status -d 2 -f chromeos6-row1-rack24-host3 | grep provision | tail -1
        2018-02-14 16:57:22  -- http://cautotest.corp.google.com/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack24-host3/71566-provision/

That provision failure does show the symptom.  Digging into
debug/autoserv.DEBUG:
02/14 16:57:25.863 DEBUG|          autoserv:0694| autoserv command was: /usr/local/autotest/server/autoserv -p -r /usr/local/autotest/results/hosts/chromeos6-row1-rack24-host3/71566-provision/20181402165635 -m chromeos6-row1-rack24-host3 --verbose --lab True --provision --job-labels cros-version:kevin-release/R66-10403.0.0-cheetsth,cheets-version:git_nyc-mr1-arc/cheets_arm-user/P5373694

So, the provision job that actually executed asked for "cros-version:kevin-release/R66-10403.0.0-cheetsth".

Once the provision completes, and it gets to the verifier, you see
this:
02/14 17:13:13.799 DEBUG|             utils:0282| [stdout] CHROMEOS_RELEASE_BUILDER_PATH=kevin-release/R66-10403.0.0

So, the provision asked for and installed "kevin-release/R66-10403.0.0".
Unfortunately, the label we assigned didn't match the build.  Thus
the failure.
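
To make the mismatch concrete, here is a minimal sketch of an exact-match check like the one described above. The function names are hypothetical, and this is not the actual verifier code in autotest; it only assumes that the installed build is read from the CHROMEOS_RELEASE_BUILDER_PATH line shown in the log.

```python
def parse_builder_path(lsb_release_text):
    """Return the CHROMEOS_RELEASE_BUILDER_PATH value, or None if absent."""
    for line in lsb_release_text.splitlines():
        key, sep, value = line.partition('=')
        if sep and key.strip() == 'CHROMEOS_RELEASE_BUILDER_PATH':
            return value.strip()
    return None


def label_matches_build(label, builder_path):
    """Exact comparison of a cros-version label against the installed build."""
    prefix = 'cros-version:'
    if not label.startswith(prefix):
        return False
    return label[len(prefix):] == builder_path


# Values taken from the logs quoted above.
lsb_release = 'CHROMEOS_RELEASE_BUILDER_PATH=kevin-release/R66-10403.0.0\n'
installed = parse_builder_path(lsb_release)
requested = 'cros-version:kevin-release/R66-10403.0.0-cheetsth'

# The label carries a "-cheetsth" suffix that the installed build lacks,
# so an exact comparison rejects the host.
print(label_matches_build(requested, installed))
```

With an exact comparison, any suffix on the label is enough to fail the provision, even though the right build was installed.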

I don't know why the build and the label are different, but basically,
that's wrong.  That is, the change from dgarrett@ that uncovered this
is probably correct, and the bug is elsewhere.

However, given that arc-presubmit is broken, "right" and "wrong" aren't
the most useful designations.  For now, to get things going, I think we
have at least two options:
  * Revert https://crrev.com/c/907326.
  * Hack that change to make a specific exception for labels ending
    in "-cheetsth"

Long term, I think the fix is to leave the verifier check as it is
right now, and fix the code that's causing the "-cheetsth" suffix to
get added to the label.
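
The second short-term option (an exception for labels ending in "-cheetsth") could look roughly like the sketch below. The helper names are hypothetical, and this is an illustration of the idea under that assumption, not the code in server/hosts/cros_host.py.

```python
CHEETS_SUFFIX = '-cheetsth'


def strip_cheets_suffix(version):
    """Drop a trailing "-cheetsth" from a version string, if present."""
    if version.endswith(CHEETS_SUFFIX):
        return version[:-len(CHEETS_SUFFIX)]
    return version


def version_label_matches(label_version, installed_version):
    """Compare versions, tolerating the "-cheetsth" suffix on the label only."""
    return strip_cheets_suffix(label_version) == installed_version


# The presubmit label now matches the build that was actually installed.
print(version_label_matches('kevin-release/R66-10403.0.0-cheetsth',
                            'kevin-release/R66-10403.0.0'))
# A genuinely different build is still rejected.
print(version_label_matches('kevin-release/R66-10402.0.0-cheetsth',
                            'kevin-release/R66-10403.0.0'))
```

Only the one known suffix is stripped, so every other label mismatch still fails the check.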

Owner: dgarr...@chromium.org
Passing back to dgarrett@ to weigh in on which hack he
prefers to address the immediate problem.

Comment 4 by kcwu@chromium.org, Feb 15 2018

Cc: kcwu@chromium.org posciak@chromium.org
Does this qualify as P0? It's blocking a large number of people from landing changes.

Comment 6 Deleted

Comment 7 Deleted

Sorry for the spam, Monorail has been spuriously sending 500s.

Comment 9 by bugdroid1@chromium.org, Feb 16 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/147dda4fa08543a2c82d8bb43fe37c0c90168435

commit 147dda4fa08543a2c82d8bb43fe37c0c90168435
Author: Richard Barnette <jrbarnette@chromium.org>
Date: Fri Feb 16 00:31:41 2018

[autotest] Ignore "-cheetsth" in cros-version labels.

ARC pre-submit testing for a given build "X" requests a version
label named "X-cheetsth".  That breaks with the new check that
requires that the version and the version label match exactly.

This fixes the check to ignore the "-cheetsth" suffix, if it
exists.

BUG= chromium:812467 
TEST=Invoke the method from python CLI

Change-Id: Ibd3ffb80f2a260d70520174d4b6e0ec8b2d70500
Reviewed-on: https://chromium-review.googlesource.com/922446
Reviewed-by: Richard Barnette <jrbarnette@google.com>
Commit-Queue: Richard Barnette <jrbarnette@google.com>
Tested-by: Richard Barnette <jrbarnette@google.com>

[modify] https://crrev.com/147dda4fa08543a2c82d8bb43fe37c0c90168435/server/hosts/cros_host.py

We've committed a change that should stop the symptom:
    https://chromium-review.googlesource.com/#/c/chromiumos/third_party/autotest/+/922446/

That change will need to be pushed to prod in the Autotest lab;
normally, that would most likely happen tomorrow AM.

The change was pushed to the lab late yesterday afternoon, and
after resolving an unrelated problem, I believe that the change
should now be live for arc-presubmit.

So, I believe that this problem is fixed.  I'd like to hold this
open long enough so that we can sort out what to do about fixing
the underlying root cause.

Comment 12 by sbasi@chromium.org, Feb 26 2018

Cc: pprabhu@chromium.org
Re: why "-cheetsth" is appended to the build label.

Dan and I made this decision with Prathmesh because the cheets provision actually modifies the rootfs of the Chrome OS image (by swapping the Android system image). Therefore we didn't want other (regular Chrome OS) tests targeting the same base Chrome OS system image to reuse the DUT as-is, because it is technically no longer the same image that was installed.

We can revisit this if need be but that's the reasoning behind it.
> [ ... ] the cheets provision is actually modifying the rootfs [ ... ]

Modifying the root FS?  That's ... awkward.

The proper behavior is to create a builder that takes the android bits
and combines them with the Chrome OS bits at build time.  Then, the
builder name would be in the cros-version label.  Then this wouldn't
be happening.

Either that, or fully remove the label, since it's not running a well-defined build. That prevents it from being reused, of course.

Comment 15 by sbasi@chromium.org, Feb 26 2018

Note this is for TH *Presubmit* (one build per CL), so the goal was to have a low turnaround time. I guess the system could have kicked off a trybot run, but then the Chrome OS build system would receive significant new load, and the time to do a paladin build (~30-40 mins, right?) would be added to the turnaround time for the CL patch testing.


As for removing the label, that could work. We kept the label as the dynamic suite logic has tests getting assigned devices based off certain labels. But I guess they could leverage the cheets-version label (these are unique per presubmit run) instead, so that might be a viable solution.
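
A rough sketch of the label-based assignment idea, assuming a host is eligible only when it carries every label a job requires. The function name and the "board:" and "pool:" label strings are assumptions for illustration; this is not the dynamic suite scheduler code.

```python
def host_is_eligible(host_labels, required_labels):
    """A host qualifies only if it carries every label the job requires."""
    return set(required_labels) <= set(host_labels)


CROS_LABEL = 'cros-version:kevin-release/R66-10403.0.0-cheetsth'
CHEETS_LABEL = 'cheets-version:git_nyc-mr1-arc/cheets_arm-user/P5373694'

# Labels on a DUT after a cheets presubmit provision (illustrative set).
dut_labels = {'board:kevin', 'pool:arc-presubmit', CROS_LABEL, CHEETS_LABEL}

# A regular Chrome OS test asks for the unsuffixed build.
regular_job = ['board:kevin', 'cros-version:kevin-release/R66-10403.0.0']
# A presubmit test keys off the cheets-version label instead.
presubmit_job = ['board:kevin', CHEETS_LABEL]

# The suffixed label keeps regular tests off the modified image.
print(host_is_eligible(dut_labels, regular_job))
print(host_is_eligible(dut_labels, presubmit_job))

# Dropping the cros-version label entirely gives the same two answers.
dut_without_cros = dut_labels - {CROS_LABEL}
print(host_is_eligible(dut_without_cros, regular_job))
print(host_is_eligible(dut_without_cros, presubmit_job))
```

Under this model, removing the cros-version label and matching on cheets-version still keeps regular tests from reusing the DUT.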
Labels: -Pri-1 Pri-2
Owner: jrbarnette@chromium.org

Comment 17 by kcwu@chromium.org, Mar 8 2018

Cc: -kcwu@chromium.org -posciak@chromium.org
Cc: rohi...@chromium.org
Status: Fixed (was: Assigned)
Closing this in favor of bug 824581.
