
Issue 765431

Starred by 1 user

Issue metadata

Status: Archived
Owner: ----
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: ----




When a shard finishes provisioning and reboots, it may show an unexpected error

Project Member Reported by xixuan@chromium.org, Sep 14 2017

Issue description

While provisioning chromeos-server132.hot.corp.google.com, after it finishes provisioning and kicks off a reboot, it loses the connection. Later, after I press 'enter', it shows errors like:

Requested: reboot
Executed: sudo -S -p 'sudo password:'  /bin/bash -l -c "reboot"

2017-09-14 14:52:13,609 ERRO| sudo() received nonzero return code -1 while executing!

Requested: reboot
Executed: sudo -S -p 'sudo password:'  /bin/bash -l -c "reboot"
2017-09-14 14:52:13,609 INFO| Printing out task report.
{
  "sub_reports": [],
  "exception": "FabricException('sudo() received nonzero return code -1 while executing!\\n\\nRequested: reboot\\nExecuted: sudo -S -p \\'sudo password:\\'  /bin/bash -l -c \"reboot\"',)",
  "is_successful": false,
  "description": "sudo() received nonzero return code -1 while executing!\n\nRequested: reboot\nExecuted: sudo -S -p 'sudo password:'  /bin/bash -l -c \"reboot\"",
  "arguments_used": {
    "host_server": "chromeos-server132.hot"
  },
  "task_name": "ShardProvisionTask"
}

Assuming server132.hot has been provisioned successfully, this error message shouldn't show up.
 
Check the /etc/resolv.conf file on that server. If you find this:

search cros.corp.google.com mtv.corp.google.com corp.google.com prod.corp.google.com prodz.corp.google.com

then the server was provisioned successfully.

If you find this:
search corp.google.com prod.google.com prodz.google.com google.com

then the reboot failure means that resolv.conf was not updated. The reboot at the end is triggered only when resolv.conf is not the one we want, because a reboot applies the customized resolv.conf file to the server.
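
The check described above can be sketched in Python. This is only an illustration, not the provisioning script's actual code: the names `search_domains` and `is_provisioned` are hypothetical, and it assumes the only criterion is that cros.corp.google.com appears on the `search` line.

```python
REQUIRED_DOMAIN = "cros.corp.google.com"  # domain the provision check looks for


def search_domains(resolv_conf_text):
    """Return the domains from the last 'search' line of a resolv.conf."""
    domains = []
    for line in resolv_conf_text.splitlines():
        if line.startswith(("#", ";")):
            continue  # comment lines start with '#' or ';' in the first column
        fields = line.split()
        if fields and fields[0] == "search":
            domains = fields[1:]  # a later 'search' line replaces earlier ones
    return domains


def is_provisioned(resolv_conf_text, required=REQUIRED_DOMAIN):
    """Hypothetical check: True if the required domain is in the search list."""
    return required in search_domains(resolv_conf_text)
```

On the server itself this could be called as `is_provisioned(open("/etc/resolv.conf").read())`.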

Allen, is there a better approach than a reboot for applying the customized resolv.conf file?

Comment 2 by xixuan@chromium.org, Sep 14 2017

The /etc/resolv.conf in chromeos-server132.hot is:

corp.google.com prod.google.com prodz.google.com google.com

So either the reboot failed or it did not complete successfully, which is why resolv.conf is unchanged? How should I fix it?
It did not reboot successfully. As a quick fix, you can log in to the server and run 'sudo reboot'.

After it is back up, double check the /etc/resolv.conf file to make sure it is updated.
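
The quick fix and the follow-up check could be combined into one step. Below is a rough sketch, assuming SSH access to the server and passwordless sudo; `run_remote`, `reboot_and_verify`, and the timeout values are hypothetical and not part of the provisioning tool. The exit code of the reboot command is ignored because the connection usually drops while the host goes down, and the kernel boot ID is compared before and after to confirm that a reboot really happened.

```python
import subprocess
import time

BOOT_ID_CMD = "cat /proc/sys/kernel/random/boot_id"  # changes on every boot


def run_remote(host, command):
    """Run a command on host over ssh; return (exit_code, stdout)."""
    proc = subprocess.run(
        ["ssh", "-o", "ConnectTimeout=10", host, command],
        stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def reboot_and_verify(host, required="cros.corp.google.com",
                      timeout=600, interval=15,
                      run=run_remote, sleep=time.sleep):
    """Reboot host, wait for a new boot ID, then check /etc/resolv.conf."""
    code, before = run(host, BOOT_ID_CMD)
    if code != 0:
        raise RuntimeError("cannot reach %s before reboot" % host)

    # The ssh session is normally cut off by the reboot, so the exit code
    # of this call is deliberately ignored.
    run(host, "sudo reboot")

    waited = 0
    while waited < timeout:
        sleep(interval)
        waited += interval
        code, after = run(host, BOOT_ID_CMD)
        if code == 0 and after.strip() != before.strip():
            break  # host is back and has actually rebooted
    else:
        raise TimeoutError("%s did not come back within %d seconds"
                           % (host, timeout))

    code, conf = run(host, "cat /etc/resolv.conf")
    if code != 0:
        raise RuntimeError("cannot read /etc/resolv.conf on %s" % host)
    # True if the required domain is listed on a 'search' line.
    return any(
        fields[:1] == ["search"] and required in fields[1:]
        for fields in (line.split() for line in conf.splitlines())
    )
```

The `run` and `sleep` parameters exist only so the function can be exercised without a real host.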

Comment 4 by xixuan@chromium.org, Sep 14 2017

The fix works~

Just to provide another data point:
While provisioning chromeos-server133.hot, it rebooted successfully but then told me that /etc/resolv.conf is missing cros.corp.google.com.

2017-09-14 15:39:34,944 ERRO| The provisioned server does not have correct DNS config, cros.corp.google.com is missing from /etc/resolv.conf file. Fix it, then the server is provisioned successfully.
2017-09-14 15:39:34,945 INFO| Printing out task report.
{
  "sub_reports": [],
  "exception": "TaskRunFailure('The provisioned server does not have correct DNS config, cros.corp.google.com is missing from /etc/resolv.conf file. Fix it, then the server is provisioned successfully.',)",
  "is_successful": false,
  "description": "The provisioned server does not have correct DNS config, cros.corp.google.com is missing from /etc/resolv.conf file. Fix it, then the server is provisioned successfully.",
  "arguments_used": {
    "host_server": "chromeos-server133.hot"
  },
  "task_name": "ShardProvisionTask"
}

But when I log in to it and run 'cat /etc/resolv.conf', cros.corp.google.com is actually there:
search cros.corp.google.com mtv.corp.google.com corp.google.com prod.corp.google.com prodz.corp.google.com
Status: Archived (was: Untriaged)
