
Issue 798794

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: 2018-02-01
OS: Chrome
Pri: 2
Type: Bug

Blocked on:
issue 772453




GPU Linux Ozone Builder flaking on bot_update step on GPU.FYI waterfall

Project Member Reported by jmad...@chromium.org, Jan 3 2018

Issue description

Past flaky failures:

https://ci.chromium.org/buildbot/chromium.gpu.fyi/GPU%20Linux%20Ozone%20Builder/15308
https://ci.chromium.org/buildbot/chromium.gpu.fyi/GPU%20Linux%20Ozone%20Builder/15287
https://ci.chromium.org/buildbot/chromium.gpu.fyi/GPU%20Linux%20Ozone%20Builder/15180

Failures seem to have the form:

===Running git checkout --force 60854de15bfeef95cb803aaab521f89352c07c3d -- ===
In directory: /b/c/b/GPU_Linux_Ozone_Builder/src
fatal: reference is not a tree: 60854de15bfeef95cb803aaab521f89352c07c3d
===Failed in 0.0 mins of git checkout --force 60854de15bfeef95cb803aaab521f89352c07c3d -- ===
Something failed: git checkout --force 60854de15bfeef95cb803aaab521f89352c07c3d -- failed with code 128 in /b/c/b/GPU_Linux_Ozone_Builder/src..
Ran 35.355741024 seconds past deadline. Aborting.
Traceback (most recent call last):
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 1252, in <module>
    sys.exit(main())
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 1236, in main
    checkout(options, git_slns, specs, revisions, step_text, shallow)
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 1142, in checkout
    gclient_output = ensure_checkout(**checkout_parameters)
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 841, in ensure_checkout
    cleanup_dir)
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 586, in git_checkouts
    cleanup_dir)
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 647, in _git_checkout
    force_solution_revision(name, url, revisions, sln_dir)
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 543, in force_solution_revision
    git('checkout', '--force', treeish, '--', cwd=cwd)
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 209, in git
    return call(*cmd, **kwargs)
  File "/b/rr/tmpWt9QJM/rw/checkout/scripts/slave/.recipe_deps/depot_tools/recipes/recipe_modules/bot_update/resources/bot_update.py", line 190, in call
    code, outval)
__main__.SubprocessFailed: git checkout --force 60854de15bfeef95cb803aaab521f89352c07c3d -- failed with code 128 in /b/c/b/GPU_Linux_Ozone_Builder/src.
step returned non-zero exit code: 1

Ken/Sunny/Frank, I'm not sure what's going on here. Can one of you help cc the right people?
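
For anyone triaging more of these flakes, here is a minimal sketch of how a saved bot_update log could be scanned for this failure signature. The helper name find_missing_revisions is hypothetical and is not part of bot_update.py; the pattern is simply the "fatal: reference is not a tree" line quoted above.

```python
import re

# Matches the git error line seen in the failing bot_update logs quoted above.
# Assumes full 40-character SHA-1 revisions, as in those logs.
MISSING_TREE_RE = re.compile(r"fatal: reference is not a tree: ([0-9a-f]{40})")


def find_missing_revisions(log_text):
    """Return the distinct revisions git reported as 'not a tree'.

    Revisions are returned in order of first appearance in the log.
    """
    seen = []
    for sha in MISSING_TREE_RE.findall(log_text):
        if sha not in seen:
            seen.append(sha)
    return seen
```

Running this over the logs of the three builds listed above would show whether they all failed on a revision that was missing from the local checkout at the time of the build.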
 

Comment 1 by kbr@chromium.org, Jan 3 2018

Cc: chenwilliam@chromium.org
Components: Infra>Client>Chrome
This seems to be the same issue as Issue 798765 -- at least, this build:
https://ci.chromium.org/buildbot/chromium.gpu.fyi/GPU%20Linux%20Ozone%20Builder/15308

has the same error:

Uncaught Exception: KeyError('root',)

In this case it looks like the bot attempted to check out ref 60854de15bfeef95cb803aaab521f89352c07c3d but it failed:

===Running git checkout --force 60854de15bfeef95cb803aaab521f89352c07c3d -- ===
In directory: /b/c/b/GPU_Linux_Ozone_Builder/src
fatal: reference is not a tree: 60854de15bfeef95cb803aaab521f89352c07c3d
===Failed in 0.0 mins of git checkout --force 60854de15bfeef95cb803aaab521f89352c07c3d -- ===
Something failed: git checkout --force 60854de15bfeef95cb803aaab521f89352c07c3d -- failed with code 128 in /b/c/b/GPU_Linux_Ozone_Builder/src..

This doesn't really look the same as Issue 798765. Could someone from the Infra team check whether there's a race condition in the bots' schedulers, where the bot is triggered against a git ref that can't yet be checked out?

#1: Correct, it isn't the same issue. As mentioned in 798765, that (less than helpful) error message is just an indication that bot_update failed.

This looks like it may be either failing to update the bot's cache of chromium/src or it's affected by gerrit replication latency (b/67501786 -> b/35998605).
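
If replication latency is the cause, the revision should become fetchable shortly after the build is triggered. Below is a minimal sketch of a retry guard under that assumption. The helpers commit_exists, fetch_origin and wait_for_commit are hypothetical and are not taken from bot_update.py; the only git commands used are `git cat-file -e` and `git fetch origin`, and the attempt count and delays are placeholder values.

```python
import subprocess
import time


def commit_exists(repo_dir, sha):
    """Return True if `sha` resolves to a commit object in the local repo."""
    result = subprocess.run(
        ["git", "cat-file", "-e", sha + "^{commit}"],
        cwd=repo_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def fetch_origin(repo_dir):
    """Fetch from the remote named 'origin' (assumed remote name)."""
    subprocess.run(["git", "fetch", "origin"], cwd=repo_dir, check=False)


def wait_for_commit(repo_dir, sha, attempts=5, base_delay=2.0,
                    exists=commit_exists, fetch=fetch_origin,
                    sleep=time.sleep):
    """Wait with exponential backoff until `sha` is present locally.

    Each round checks for the commit, sleeps, then re-fetches.
    Returns True once the commit is found, False if it never appears.
    """
    for attempt in range(attempts):
        if exists(repo_dir, sha):
            return True
        # Give the mirror time to catch up before fetching again.
        sleep(base_delay * (2 ** attempt))
        fetch(repo_dir)
    return exists(repo_dir, sha)
```

A caller would run wait_for_commit(sln_dir, treeish) before the `git checkout --force` step and fail with a clearer message if it returns False. This only works around the lag; it does not address the underlying replication issue.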
Cc: hinoka@chromium.org
+cc hinoka: any thoughts here?

Comment 4 by kbr@chromium.org, Jan 3 2018

Blockedon: 772453
Blocking on an earlier report.

#4: thanks, I'd forgotten about that. Curious, though -- why blocking this on that rather than duping it?

Comment 6 by kbr@chromium.org, Jan 4 2018

Only in case we were going to continue to investigate this on the Chromium side, since the other bug's closed.

I see that some progress has been made on b/35998605, but these problems are clearly still happening. Do we need to comment on that bug that issues are still being seen? Would you or another Infra team member be willing to do that and then dupe this into Issue 772453?

agable mentioned on the other bug (https://bugs.chromium.org/p/chromium/issues/detail?id=772453#c9) that there's nothing for us to continue to do. With progress being made on that bug but clearly not yet finished, I'm not sure I see much benefit to prodding it.
NextAction: 2018-02-01
Owner: jmad...@chromium.org
Status: Assigned (was: Untriaged)
If everyone agrees b/35998605 is responsible, I'm fine with letting things sit. There are a lot of people watching that bug because of general Gerrit lagginess, and the infra people are already pretty aware of the need. I'll assign this to myself to watch for the flakes to clear up when the issue is closed, so I'm setting a next action for a month from now.
The NextAction date has arrived: 2018-02-01
Status: Fixed (was: Assigned)
Don't see any flakes in the last 200 builds. Please re-open if flakiness re-occurs.
