Issue 678667

Issue metadata

Status: WontFix
Owner:
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocked on:
issue 678793



chromite not updated on some builders?

Project Member Reported by semenzato@chromium.org, Jan 5 2017

Issue description

I found a log where the chromite code is still using the flawed REBOOT_MARKER method to check for reboot. I fixed this about a month ago (issue 667541). Is it possible that this builder is still using the old code?


https://uberchromegw.corp.google.com/i/chromeos/builders/edgar-release/builds/734/steps/HWTest%20%5Bsanity%5D/logs/stdio

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/94501594-chromeos-test/chromeos4-row12-rack8-host3/debug/

autoserv.DEBUG:


01/05 05:33:42.884 DEBUG|        dev_server:1723| Current CrOS auto-update status: pre-setup rootfs update
01/05 05:33:52.932 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/get_au_status?build_name=edgar-release/R57-9153.0.0&force_update=True&pid=30450&host_name=chromeos4-row12-rack8-host3&full_update=False"''
01/05 05:33:54.123 DEBUG|        dev_server:1785| Failed to trigger auto-update process on devserver
01/05 05:33:54.123 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/handler_cleanup?pid=30450&host_name=chromeos4-row12-rack8-host3"''
01/05 05:33:55.314 DEBUG|        base_utils:0185| Running 'ssh 100.115.219.134 'curl "http://100.115.219.134:8082/collect_cros_au_log?pid=30450&host_name=chromeos4-row12-rack8-host3"''
01/05 05:33:56.543 DEBUG|        dev_server:1617| Saving auto-update logs into /usr/local/autotest/results/hosts/chromeos4-row12-rack8-host3/534169-provision/20170501052348/autoupdate_logs/CrOS_update_chromeos4-row12-rack8-host3_30450.log
01/05 05:33:56.544 DEBUG|        dev_server:1884| Exception raised on auto_update attempt #1:
 Traceback (most recent call last):
  File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 222, in TriggerAU
    self._RootfsUpdate(chromeos_AU)
  File "/home/chromeos-test/chromiumos/src/platform/dev/cros_update.py", line 149, in _RootfsUpdate
    cros_updater.PreSetupRootfsUpdate()
  File "/home/chromeos-test/chromiumos/chromite/lib/auto_updater.py", line 904, in PreSetupRootfsUpdate
    self.device.Reboot(timeout_sec=self.REBOOT_TIMEOUT)
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 817, in Reboot
    return self.GetAgent().RemoteReboot(timeout_sec=timeout_sec)
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 380, in RemoteReboot
    self.RemoteSh('touch %s && reboot' % REBOOT_MARKER)
  File "/home/chromeos-test/chromiumos/chromite/lib/remote_access.py", line 340, in RemoteSh
    raise SSHConnectionError(e.result.error)
SSHConnectionError: Warning: Permanently added 'chromeos4-row12-rack8-host3,100.115.203.51' (RSA) to the list of known hosts.
Write failed: Broken pipe
 

Comment 1 by nxia@chromium.org, Jan 5 2017

Owner: xixuan@chromium.org
Can you paste the commit# of your fix? 
xixuan@, can you take a look at this?
https://crrev.com/1bc79b267dd32dc69a0f6f4dc873de333841a50c

Also available in the linked bug at #11.

Cc: xixuan@chromium.org
Owner: nxia@chromium.org
I cannot log into any servers. According to the log https://storage.cloud.google.com/chromeos-autotest-results/94501594-chromeos-test/chromeos4-row12-rack8-host3/debug/autoserv.DEBUG?_ga=1.21001784.2018250139.1482890129, devserver 100.115.219.134 executed this failed provision. The deputy could check this devserver to see whether it has been updated. If it has not (which is likely), the deputy could update all devservers with the newest chromite.

So reassign back to deputy :)

Comment 4 by nxia@chromium.org, Jan 5 2017

According to Kevin's latest push_to_prod, the change should have been pushed. 

chromite:
git log --oneline c2e9734..c559824
c559824 Use explicit virtualenv in virtualenv_wrapper


xixuan@, could you paste the steps for checking the chromite version on a devserver? I can help with checking.
I think push-to-prod doesn't update devservers; it only updates drones and shards.

Log into the devserver and check the chromite directory. I don't remember exactly where it lives, but it should be at some obvious path like ~/chromiumos/chromite. Check chromite/lib/remote_access.py: if it doesn't match the version in https://chromium-review.googlesource.com/#/c/413632/ and still contains 'touch /tmp/awaiting_reboot && reboot', then this devserver's chromite has not been updated.
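
The manual check described above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and the second pattern is taken from the traceback in the issue description, where the old code builds the command as 'touch %s && reboot' % REBOOT_MARKER, so the literal /tmp/awaiting_reboot string may not appear on a single line of the file. Whether the fixed code removes both strings entirely is an assumption.

```python
from pathlib import Path

# Strings that indicate the old marker-file reboot check.
# The first is quoted in the comment above; the second is the form
# shown in the traceback from the issue description.
OLD_REBOOT_PATTERNS = (
    "touch /tmp/awaiting_reboot && reboot",
    "touch %s && reboot",
)


def uses_old_reboot_check(source_text):
    """Return True if the text contains any known old-style reboot command."""
    return any(pattern in source_text for pattern in OLD_REBOOT_PATTERNS)


def file_uses_old_reboot_check(path):
    """Read a remote_access.py-style file and apply the check to its contents."""
    return uses_old_reboot_check(Path(path).read_text())
```

On a devserver this might be called as file_uses_old_reboot_check(Path.home() / "chromiumos/chromite/lib/remote_access.py"), assuming the checkout really lives at that path.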

Comment 6 by nxia@chromium.org, Jan 5 2017

ok, I'll check.

Comment 7 by nxia@chromium.org, Jan 5 2017

The devserver's chromite is behind that commit. I will update the devservers.
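
One way to confirm that a checkout is behind (or includes) a given commit is to ask git whether the commit is an ancestor of HEAD. The sketch below uses git merge-base --is-ancestor, which exits 0 when the commit is an ancestor and 1 when it is not. The function name and the injectable run parameter are hypothetical; the commit hash is the one from the crrev.com link earlier in this thread, and the checkout path passed in is up to the caller.

```python
import subprocess

# Fix commit cited earlier in this issue (crrev.com link).
FIX_COMMIT = "1bc79b267dd32dc69a0f6f4dc873de333841a50c"


def checkout_contains(repo_dir, commit=FIX_COMMIT, run=subprocess.run):
    """Return True if `commit` is an ancestor of HEAD in `repo_dir`.

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without a real git checkout.
    """
    cmd = ["git", "-C", repo_dir, "merge-base", "--is-ancestor", commit, "HEAD"]
    code = run(cmd).returncode
    if code == 0:
        return True   # HEAD already includes the commit
    if code == 1:
        return False  # checkout is behind the commit
    # Any other exit code means git itself failed (for example, unknown commit).
    raise RuntimeError("git merge-base failed with exit code %d" % code)
```

A False result corresponds to the situation reported here, where the devserver's chromite predates the fix.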

Comment 8 by nxia@chromium.org, Jan 5 2017

Blockedon: 678793

Comment 9 by nxia@chromium.org, Jan 9 2017

Cc: dgarr...@chromium.org
Owner: shuqianz@chromium.org
Updated the devservers, except the following ones.


"description": "Failed to update following devservers ['100.115.24.253', '172.25.65.106', '172.27.215.248', '172.27.215.252']",


shuqianz@, please advise how to update individual devservers.

Updating individual devservers isn't supported yet. CL https://chrome-internal-review.googlesource.com/#/c/310302/ is under review. For now, what you can do is first debug why these devservers failed to update, fix that, and then kick off another devserver update, which will update all devservers again. Most of the time, a devserver fails to update because it was offline.
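
Since an offline host is named above as the usual cause, a quick reachability pass over the failed devservers can help narrow things down before kicking off another update. The following is a minimal sketch, not part of the devserver push tooling: the function names are hypothetical, and a successful TCP connection to port 22 only shows that something is listening there, not that an SSH login would succeed.

```python
import socket


def is_ssh_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable hosts.
        return False


def split_by_reachability(hosts, probe=is_ssh_reachable):
    """Partition hosts into (reachable, unreachable) lists, preserving order."""
    reachable, unreachable = [], []
    for host in hosts:
        (reachable if probe(host) else unreachable).append(host)
    return reachable, unreachable
```

Hosts that end up in the unreachable list are candidates for the offline case; hosts that are reachable but still failed need their update logs inspected.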
Cc: ayatane@chromium.org
Owner: dgarr...@chromium.org
Passing to the current deputy. '100.115.24.253', '172.27.215.248', and '172.27.215.252' are not ssh-able. '172.25.65.106' failed to update because the virtualenv was not set up properly on this server, so the virtualenv update step failed.

Here is the error log of '172.25.65.106':
[172.25.65.106] run: git stash
[172.25.65.106] out: /bin/bash: line 0: cd: /usr/local/google/chromeos/infra_virtualenv: No such file or directory
[172.25.65.106] out: 


Fatal error: run() received nonzero return code 1 while executing!

Requested: git stash
Executed: /bin/bash -l -c "cd /usr/local/google/chromeos/infra_virtualenv >/dev/null && git stash"

Aborting.
ERROR:root:Traceback (most recent call last):
  File "/usr/local/google/home/shuqianz/chromiumos/chromeos-admin/server-management-lib/server_management_lib/tasks/atomic_common.py", line 113, in decorated_func
    func(self)
  File "/usr/local/google/home/shuqianz/chromiumos/chromeos-admin/server-management-lib/server_management_lib/tasks/atomic_devserver.py", line 163, in run
    self._update_all_devservers()
  File "/usr/local/google/home/shuqianz/chromiumos/chromeos-admin/server-management-lib/server_management_lib/tasks/atomic_devserver.py", line 157, in _update_all_devservers
    'Failed to update following devservers %s' % fail_devservers)
TaskRunFailure: Failed to update following devservers ['172.25.65.106']

ERROR:root:Failed to update following devservers ['172.25.65.106']
INFO:root:Printing out task report.
{
  "sub_reports": [],
  "exception": "TaskRunFailure(\"Failed to update following devservers ['172.25.65.106']\",)",
  "is_successful": false,
  "description": "Failed to update following devservers ['172.25.65.106']",
  "arguments_used": {
    "update_devserver_list": [
      "172.25.65.106"
    ]
  },
  "task_name": "DevserverPushTask"

cc Allen to take a look at this virtualenv problem.

172.27.215.248 is the only machine from above which is still offline. I'll follow up on why.

Comment 13 by bugdroid1@chromium.org, Jan 11 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/6641de3fec1d6ce188683d8293b30737d1d043f2

commit 6641de3fec1d6ce188683d8293b30737d1d043f2
Author: Allen Li <ayatane@chromium.org>
Date: Wed Jan 11 20:59:32 2017

I filed b/34225314 to cover the machine that is still down.
Status: WontFix (was: Untriaged)
I'm going to assume that these fixes have been properly pushed by now.
