Issue 851703

Issue metadata

Status: Fixed
Owner:
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: ----



Bot update failing on chromium.gpu/Mac Release (Intel)

Project Member Reported by sheriff-...@appspot.gserviceaccount.com, Jun 11 2018

Issue description

Filed by sheriff-o-matic@appspot.gserviceaccount.com on behalf of mpearson@google.com

This bot has been failing for ~5 hours on bot_update step.

Uncaught Exception failing on chromium.gpu/Mac Release (Intel)

Builders failed on: 
- Mac Release (Intel): 
  https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29

---
===Running git checkout --force e8747d07b40eaa2b6c853653405fb709f00db05d -- ===
In directory: /b/s/w/ir/cache/builder/src
===Failed in 0.2 mins of git checkout --force e8747d07b40eaa2b6c853653405fb709f00db05d -- ===
Something failed: git checkout --force e8747d07b40eaa2b6c853653405fb709f00db05d -- failed with code 255 in /b/s/w/ir/cache/builder/src..
Traceback (most recent call last):
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 1290, in <module>
    sys.exit(main())
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 1274, in main
    checkout(options, git_slns, specs, revisions, step_text, shallow)
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 1184, in checkout
    gclient_output = ensure_checkout(**checkout_parameters)
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 893, in ensure_checkout
    git_checkouts(solutions, revisions, shallow, refs, git_cache_dir, cleanup_dir)
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 647, in git_checkouts
    cleanup_dir)
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 744, in _git_checkout
    force_solution_revision(name, url, revisions, sln_dir)
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 592, in force_solution_revision
    git('checkout', '--force', treeish, '--', cwd=cwd)
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 217, in git
    return call(*cmd, **kwargs)
  File "/b/s/w/ir/kitchen-checkout/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 198, in call
    code, outval)
__main__.SubprocessFailed: git checkout --force e8747d07b40eaa2b6c853653405fb709f00db05d -- failed with code 255 in /b/s/w/ir/cache/builder/src.
step returned non-zero exit code: 1
---

 

Comment 1 by hzl@chromium.org, Jun 11 2018

Labels: -Infra-Troopers Infra-DX

Comment 2 by hzl@chromium.org, Jun 11 2018

Using the Infra-DX label instead of Infra-Troopers, since the bot_update script is owned by devX.

Comment 3 by fdoray@google.com, Jun 14 2018

Summary: Bot update failing on chromium.gpu/Mac Release (Intel) (was: Uncaught Exception failing on chromium.gpu/Mac Release (Intel))

Comment 4 by no...@chromium.org, Jun 14 2018

Components: Infra>SDK

Comment 5 by fdoray@chromium.org, Jun 14 2018

Components: Infra
Labels: -Pri-2 Infra-Troopers Pri-1
Can a trooper look at this? The bot has not been running tests for 20 hours.

Comment 6 Deleted

Comment 7 by no...@chromium.org, Jun 14 2018

Owner: hzl@chromium.org
Status: Assigned (was: Available)
Assigning to today's CCI trooper.
Bot's checkout looks weirdly borked, so I am diving into the bot.

Paused scheduling this builder https://luci-scheduler.appspot.com/jobs/chromium/Mac%20Release%20(Intel)

Looking in git cache dir:
$ cd /b/s/c/named/git/chromium.googlesource.com-chromium-src
$ git cat-file -e 0f41e108e2c6908a4947a6bcfbb7022467544e96 || echo "not exists"
$
(so revision exists in cache)

(see also log https://logs.chromium.org/v/?s=chromium%2Fbuildbucket%2Fcr-buildbucket.appspot.com%2F8943718674024391472%2F%2B%2Fsteps%2Fbot_update%2F0%2Fstdout)
git checkout takes more than 3 minutes to check out files. I dunno why, and I don't want to wait any longer. So I ran rm -rf on the checkout and unpaused the builder; let's see how https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29/90584 goes.

Follow up for bot_update:
perhaps before running
  git checkout --force <rev>
it can first do:
  git cat-file -e <rev>
and fail immediately if the return code is not 0 (i.e. the revision is invalid).
Otherwise, continue with git checkout.
Then, if git checkout fails, since we know the revision was valid, we should consider a) wiping out the checkout and maybe even b) wiping out the cache dir.
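
A rough sketch of that follow-up, in Python since bot_update.py is Python. This is not the actual bot_update code: run_git, checkout_revision and InvalidRevision are hypothetical names made up for illustration, and wiping the cache dir (option b) is left out.

```python
import shutil
import subprocess


class InvalidRevision(Exception):
    """Raised when the requested revision is not present in the repository."""


def run_git(args, cwd):
    """Run git with the given arguments in cwd and return its exit code."""
    return subprocess.call(["git"] + list(args), cwd=cwd)


def checkout_revision(rev, checkout_dir, run=run_git, wipe=shutil.rmtree):
    """Verify that rev exists, then force-checkout it (hypothetical helper).

    Returns True if the checkout succeeded. Returns False if the checkout
    failed for a valid revision, after wiping checkout_dir so that the next
    attempt starts from a clean state.
    """
    # `git cat-file -e` exits non-zero when the object does not exist,
    # so a bad revision fails fast before any files are touched.
    if run(["cat-file", "-e", rev], checkout_dir) != 0:
        raise InvalidRevision(rev)

    if run(["checkout", "--force", rev, "--"], checkout_dir) == 0:
        return True

    # The revision is known to be valid, so the checkout itself is suspect.
    wipe(checkout_dir, ignore_errors=True)
    return False
```

The run and wipe parameters exist only so the logic can be exercised without a real repository; a caller would normally leave them at their defaults.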
If https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29/90585 fails the same way, then a trooper should swap the bot to another one from and file a ticket to labs to re-image this one (maybe after looking at system logs).
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20Release%20%28Intel%29/90585 also failed at bot_update step. Per comment #12, a trooper should "swap the bot to another one from and file a ticket to labs to re-image this one".
Cc: dpranke@chromium.org hzl@chromium.org
Components: -Infra>SDK Infra>Client>Chrome
Labels: -Sheriff-Chromium -Infra-DX -DevX-Troopers Foundation-Troopers Infra-Troopers
Owner: tandrii@chromium.org
The builder appears to have gone from mostly broken to totally broken :(.

tandrii@, can you take this over, since it looks like you might've been the one that borked it?

Comment 15 by bugdroid1@chromium.org, Jun 15 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/97f3111d9839789b7191f4ce47f7eaef31da249a

commit 97f3111d9839789b7191f4ce47f7eaef31da249a
Author: Andrii Shyshkalov <tandrii@chromium.org>
Date: Fri Jun 15 17:28:26 2018

Status: Fixed (was: Assigned)
Fix took 1 minute to write CL and 10 minutes to take effect :)
Yay, the power of restart-less changes of machines.

Follow up -- figuring out what's up with vm118-m9 is tracked in issue 853296.
Ha, I knew it wasn't me who "borked it" -- it was a bad SSD (issue 853300).

Comment 18 by bugdroid1@chromium.org, Jun 15 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infradata/config/+/619b1800ded52f0b8e4d44dbdcfd2fdef07730de

commit 619b1800ded52f0b8e4d44dbdcfd2fdef07730de
Author: Andrii Shyshkalov <tandrii@google.com>
Date: Fri Jun 15 20:10:47 2018

Labels: -Infra-Troopers
Components: -Infra>Client>Chrome Infra>Platform