
Issue 614843

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: May 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug

Blocked on:
issue 614890




lakitu_next pre-cqs running on wrong builders

Project Member Reported by adityakali@google.com, May 25 2016

Issue description

pre-cq tests are failing for lakitu_next board. 
See the attempts on https://chrome-internal-review.googlesource.com/#/c/259660/ 
Ex: https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/pre-cq/builds/30747

The VMTests fail with error:

Starting a KVM instance
Could not access KVM kernel module: No such file or directory

The trybot run for the same patch passes:
https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/incremental/builds/165

The theory is that pre-cq is running on the 'cros-standard26-c2' builder, which cannot run VMTests, whereas the trybot correctly runs on 'build275-m2' builders.
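A quick way to test this theory on a given builder is to check for the KVM device node, since QEMU reports "Could not access KVM kernel module" when it cannot open it. The following is a minimal sketch, not part of the test harness: the helper name `kvm_available` is hypothetical, and `/dev/kvm` is assumed to be the device path on these machines.

```python
import os


def kvm_available(device_path="/dev/kvm"):
    """Return True if the KVM device node exists and is readable and writable.

    device_path defaults to /dev/kvm, which is an assumption about the
    builder image; pass a different path if the node lives elsewhere.
    """
    return os.path.exists(device_path) and os.access(
        device_path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if kvm_available():
        print("KVM device is accessible; VMTests can start a KVM instance.")
    else:
        print("KVM device is missing or inaccessible on this machine.")
```

Running this on a 'cros-standard*-c2' machine and on a 'build*-m2' machine would show whether the two builder types differ in KVM access.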

Can chromeos-infra deputy please confirm if this is indeed the root cause and help us address this? This is currently blocking commits to lakitu_next. 

FYI, change https://chrome-internal-review.googlesource.com/#/c/258963/ recently enabled running pre-cq for lakitu_next board. 

 
Cc: dgarr...@chromium.org
That looks to be the case. I'm going to try to log in to the builders to check, and also find out how to specify a particular buildslave for the pre-cq.
Yes, that test build clearly ran on a GCE instance which won't work with VM Tests enabled. I'm going to try and track down how the trybot waterfall decides which builds go where.
Has the trybot waterfall been restarted since the lakitu config changes went in?

Reading through the trybot logic, it should do the right thing because it examines the build configs to decide what goes where. However, it only re-reads config changes when the waterfall is restarted.
Blockedon: 614890
I've requested the waterfall restart that I think will fix this. Blocking this bug on the restart bug.
In the Chrome Infra codebase, see:
  build/masters/master.chromiumos.tryserver/chromiumos_tryserver_util.py.

This is the important bit of code:
  precq_builders = set(
      v['_template'] or k for k, v in configs.iteritems() if v.IsPreCqBuilder())
  precq_novmtest_builders = set(
      v['_template'] or k for k, v in configs.iteritems()
      if v.IsPreCqBuilder() and not v.HasVmTests() and not v.HasHwTests())
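To make the effect of a stale config concrete, here is a minimal, self-contained sketch of the same set logic in Python 3. `BuildConfig` and the sample entries are hypothetical stand-ins, not the real chromite config classes or the real lakitu_next config; only the two set expressions mirror the snippet above.

```python
class BuildConfig(dict):
    """Hypothetical stand-in for a build config entry."""

    def IsPreCqBuilder(self):
        return bool(self.get('pre_cq'))

    def HasVmTests(self):
        return bool(self.get('vm_tests'))

    def HasHwTests(self):
        return bool(self.get('hw_tests'))


def partition_precq(configs):
    """Return (all pre-cq builders, pre-cq builders with no VM or HW tests)."""
    precq_builders = set(
        v['_template'] or k for k, v in configs.items()
        if v.IsPreCqBuilder())
    precq_novmtest_builders = set(
        v['_template'] or k for k, v in configs.items()
        if v.IsPreCqBuilder() and not v.HasVmTests() and not v.HasHwTests())
    return precq_builders, precq_novmtest_builders


# Hypothetical snapshot read at waterfall start, before VM tests were visible.
stale = {'lakitu_next-pre-cq': BuildConfig(
    _template=None, pre_cq=True, vm_tests=[], hw_tests=[])}
# Hypothetical snapshot after a restart picks up the newer config.
fresh = {'lakitu_next-pre-cq': BuildConfig(
    _template=None, pre_cq=True, vm_tests=['smoke'], hw_tests=[])}

print(partition_precq(stale)[1])  # builder is in the no-VM-test set
print(partition_precq(fresh)[1])  # builder is no longer in that set
```

Under these assumptions, a builder listed in the no-VM-test set is eligible for machines that cannot run VMs, so a config snapshot that predates the VM test setting would route the build to the wrong machine until the waterfall re-reads the config.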

Thanks for looking into this.

The lakitu changes were made in CL:344288 and CL:*258963. I don't know whether the waterfall has been restarted since then.

Any idea of ETA for this?

Comment 7 by aga...@chromium.org, May 26 2016

Components: -Infra Infra>Client>ChromeOS
Hopefully, this is now fixed. Can someone run a tryjob to confirm?
The latest pre-cq run once again got scheduled on the "cros-standard18-c2" machine: https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/pre-cq/builds/31073

Probably going to fail again?
Yep. I'm asking questions on the blocking bug. Maybe the restart hasn't really happened and has only been scheduled. Or maybe the chromite pin wasn't bumped before the restart.
The pin wasn't bumped (yes, confusing terminology). Nodir@ will bump it and restart that waterfall again. After that's done (and it may take a while), I expect this to be fixed.
Owner: dgarr...@chromium.org
Status: Fixed (was: Untriaged)
The trybot waterfall was pin bumped and restarted, and it looks like the CL above passed PreCQ testing.

I think this is fixed!
Thanks! The pre-cq seems to be fixed.
However, it seems the lakitu_next-release builder has started failing (https://uberchromegw.corp.google.com/i/chromeos/builders/lakitu_next-release), and the reason is unclear. Only the 'cbuildbot' step is marked as failed; the image build, the tests, and all other steps are successful.
Not sure if it's related to the restart.

I'm certain those issues are 100% unrelated, but looking.
Closing... please feel free to reopen if it's not fixed.
Status: Verified (was: Fixed)
