
Issue 673024

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Dec 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug-Regression




deploy_puppet failing for new devservers

Project Member Reported by jashur@chromium.org, Dec 9 2016

Issue description

I'm trying to run fabric and it returns an error. It looks like it has a problem matching the hostname.

The error I am receiving:

[android1758-infra-devserver1-2] out: Error: Could not find class chromeos::na for android1758-infra-devserver1-2.cros.corp.google.com on node android1758-infra-devserver1-2.cros.corp.google.com
[android1758-infra-devserver1-2] out: Error: Could not find class chromeos::na for android1758-infra-devserver1-2.cros.corp.google.com on node android1758-infra-devserver1-2.cros.corp.google.com
[android1758-infra-devserver1-2] out:


I used the following commands for each devserver:

jashur@jashur:~/lab/chromeos-admin/fabric$

fab -H android1758-infra-devserver1-2 -u chromeos-test deploy_puppet
fab -H android1758-infra-devserver2-2 -u chromeos-test deploy_puppet
fab -H android1758-infra-devserver3-2 -u chromeos-test deploy_puppet


I received the following full output for each:

jashur@jashur:~/lab/chromeos-admin/fabric$ fab -H android1758-infra-devserver1-2 -u chromeos-test deploy_puppet
[android1758-infra-devserver1-2] Executing task 'deploy_puppet'
[android1758-infra-devserver1-2] put: /usr/local/google/home/jashur/lab/chromeos-admin/fabric/netrc -> /root/.netrc
[android1758-infra-devserver1-2] sudo: apt-get -y update
[android1758-infra-devserver1-2] sudo: apt-get -y install puppet git-core
[android1758-infra-devserver1-2] sudo: rm -rf /root/chromeos-admin
[android1758-infra-devserver1-2] out: sudo password:
[android1758-infra-devserver1-2] out: 
[android1758-infra-devserver1-2] sudo: HOME=/root/ git clone https://chrome-internal.googlesource.com/chromeos/chromeos-admin
[android1758-infra-devserver1-2] sudo: /root/chromeos-admin/puppet/run_puppet
[android1758-infra-devserver1-2] out: sudo password:
[android1758-infra-devserver1-2] out: Notice: Scope(Class[Chromeos::Roles]): ROLE: na
[android1758-infra-devserver1-2] out: Notice: Scope(Class[Chromeos::Roles]): PROD: 
[android1758-infra-devserver1-2] out: Error: Could not find class chromeos::na for android1758-infra-devserver1-2.cros.corp.google.com on node android1758-infra-devserver1-2.cros.corp.google.com
[android1758-infra-devserver1-2] out: Error: Could not find class chromeos::na for android1758-infra-devserver1-2.cros.corp.google.com on node android1758-infra-devserver1-2.cros.corp.google.com
[android1758-infra-devserver1-2] out: 


Fatal error: sudo() received nonzero return code 1 while executing!

Requested: /root/chromeos-admin/puppet/run_puppet
Executed: sudo -S -p 'sudo password:'  /bin/bash -l -c "/root/chromeos-admin/puppet/run_puppet"

Aborting.
Disconnecting from android1758-infra-devserver1-2... done.
 
Summary: deploy_puppet failing for new devservers (was: Unable to Successfully Run Fabric)
The problem isn't fabric; the remote hosts are somehow not
set up properly for deploy_puppet.

I'm working to see why...

It looks like the rules for recognizing devservers don't
match this particular host name pattern.

From puppet/modules/facter/server_type.rb:
      when /android1758-infra-devserver.\..*/;  'devserver'

The FQDN looks like this:
    android1758-infra-devserver1-2.cros.corp.google.com
The "-2" is the source of the trouble: the rule allows only a
single character between "devserver" and the first ".".  We can
either change the hostname, or we can change the rule in puppet.
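
To illustrate the mismatch described above, here is a minimal sketch that runs the quoted rule against both host name styles. Only ORIGINAL_RULE comes from this report. The server_type helper, the 'na' fallback (inferred from the "ROLE: na" line in the log), the old-style FQDN, and RELAXED_RULE are assumptions made for illustration. RELAXED_RULE is one possible loosening of the pattern, not necessarily what the eventual CL does.

```ruby
# Pattern quoted in this report from puppet/modules/facter/server_type.rb.
ORIGINAL_RULE = /android1758-infra-devserver.\..*/

# Hypothetical relaxed pattern (an assumption, not the committed fix):
# accepts one or more digits plus an optional "-N" suffix before the first dot.
RELAXED_RULE = /android1758-infra-devserver\d+(-\d+)?\..*/

# Hypothetical helper mimicking a single `when` branch of the fact.
# The 'na' fallback is assumed from the "ROLE: na" line in the log.
def server_type(fqdn, rule)
  case fqdn
  when rule then 'devserver'
  else 'na'
  end
end

old_name = 'android1758-infra-devserver1.cros.corp.google.com'   # assumed old-style FQDN
new_name = 'android1758-infra-devserver1-2.cros.corp.google.com' # FQDN from this report

puts server_type(old_name, ORIGINAL_RULE)  # devserver
puts server_type(new_name, ORIGINAL_RULE)  # na
puts server_type(new_name, RELAXED_RULE)   # devserver
```

With the original rule, the "." after "devserver" consumes the "1", and the next character must be a literal dot, so the "-" in "1-2" makes the match fail.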

Status: Assigned (was: Untriaged)
We chose that host pattern because we created a better configuration for our devservers. We will be swapping out devserver1, devserver2, and devserver3, and eventually changing the temporary hostnames back to the originals.

We can temporarily name them devserver10, devserver11, and devserver12 to deploy. Once they are verified, we can change the hostnames back to devserver1, devserver2, and devserver3 after the swap has taken place.

Easier to change the puppet rules right now, I think.

I'm working to upload a CL right now.


Project Member

Comment 6 by bugdroid1@chromium.org, Dec 12 2016

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/6621b116e32eb3ceb31a86cf720f013b6ff4e56d

commit 6621b116e32eb3ceb31a86cf720f013b6ff4e56d
Author: Richard Barnette <jrbarnette@google.com>
Date: Fri Dec 09 22:55:07 2016

Comment 7 by jashur@chromium.org, Dec 12 2016

Not sure if it's related, but I received the following error (for all 3 devservers) after running:
fab -H android1758-infra-devserver1-2 -u chromeos-test deploy_puppet

Error:
[android1758-infra-devserver1-2] out: Notice: /Stage[main]/Lab::Devserver/Exec[Install minidump_stackwalk]/returns: subprocess.CalledProcessError: Command '('/usr/bin/python', '/home/chromeos-test/depot_tools/gclient.py', 'sync')' returned non-zero exit status 1
[android1758-infra-devserver1-2] out: Error: 
[android1758-infra-devserver1-2] out:       set -e
[android1758-infra-devserver1-2] out:       rm -rf breakpad
[android1758-infra-devserver1-2] out:       mkdir breakpad
[android1758-infra-devserver1-2] out:       cd breakpad
[android1758-infra-devserver1-2] out:       /home/chromeos-test/depot_tools/fetch breakpad
[android1758-infra-devserver1-2] out:       cd src
[android1758-infra-devserver1-2] out:       ./configure
[android1758-infra-devserver1-2] out:       make
[android1758-infra-devserver1-2] out:       make install
[android1758-infra-devserver1-2] out:       rm -rf breakpad
[android1758-infra-devserver1-2] out:      returned 1 instead of one of [0]
[android1758-infra-devserver1-2] out: Error: /Stage[main]/Lab::Devserver/Exec[Install minidump_stackwalk]/returns: change from notrun to 0 failed: 
[android1758-infra-devserver1-2] out:       set -e
[android1758-infra-devserver1-2] out:       rm -rf breakpad
[android1758-infra-devserver1-2] out:       mkdir breakpad
[android1758-infra-devserver1-2] out:       cd breakpad
[android1758-infra-devserver1-2] out:       /home/chromeos-test/depot_tools/fetch breakpad
[android1758-infra-devserver1-2] out:       cd src
[android1758-infra-devserver1-2] out:       ./configure
[android1758-infra-devserver1-2] out:       make
[android1758-infra-devserver1-2] out:       make install
[android1758-infra-devserver1-2] out:       rm -rf breakpad
[android1758-infra-devserver1-2] out:      returned 1 instead of one of [0]
[android1758-infra-devserver1-2] out: Notice: /Stage[main]/Lab::Devserver/Package[python-protobuf]/ensure: ensure changed 'purged' to 'latest'

I can paste the full output if requested.



Owner: akes...@chromium.org
The failure in c#7 probably has a different root cause from the
earlier failure, and will need a separate code fix.

Passing this to this week's deputy for follow-up.

Status: Fixed (was: Assigned)
Looks like an unrelated issue. Could you file a new bug and cc me and ayatane@?
