
Issue 816733


Issue metadata

Status: Fixed
Owner:
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




gsubtreed breaks due to missing objects

Project Member Reported by aga...@chromium.org, Feb 27 2018

Issue description

gsubtreed regularly breaks due to missing objects. This manifests as:
  processing path 'components/feedback'
  processing Ref(Repo('https://chromium.googlesource.com/chromium/src'), 'refs/heads/master')
  starting with tree git2.INVALID
  processing Commit(Repo('https://chromium.googlesource.com/chromium/src'), '1eab4e9e04c773022b3e24f251fa384396d48aa5')
  found new tree '69f9b4d814630435a8bd59b4b52cc5005ea3095c'
  <process more commits, find more trees>
  error: refs/heads/master does not point to a valid object!

When you ssh into the bot and cd into the on-disk repo representing this subtree, trying to "git show HEAD", "git show master", or "git show <sha1 of ToT commit on gitiles>" all result in the same invalid object error.

However, the tree object underlying that commit object still exists and can be "git shown". This makes sense, because all the subtree repos have an objects/info/alternates file pointing at the on-disk chromium-src repo, and the tree objects are reused verbatim.

Here's where things get weird:

If you ssh into the bot and cd into the on-disk repo of a subtree which is still working, you can successfully "git show HEAD" etc. BUT! If you look in the objects/ and objects/pack/ directories, that commit object is nowhere to be found. And if you cd into the chromium-src repo, you can "git show <sha1 from subtree repo>" and it *still works*, even though the commit message is one with the "Cr-Mirrored-From" footer and the commit never existed in upstream chromium/src.

Somehow, the commits that gsubtreed synthesizes are ending up in chromium-src's object store, not in the subtree repos'.

Hypothesis: this causes the breakage noted above via the automatic GCs that happen when the chromium-src repo does a large fetch from gitiles. The GC looks at all these commit objects sitting around, says "no refs point at these", and cleans some of them up. They then cease to exist as far as the subtree repos are concerned, which breaks them.
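
To make the hypothesis concrete, here is a toy Python model of two object stores linked by an alternates entry. It is only a sketch: ObjectStore and its methods are invented for illustration, they are not git or infra.libs.git2 APIs, and the gc() here is far cruder than real git (no reachability walk, no grace period). The one property it tries to capture is that a GC in the origin repo consults only the origin's own refs.

```python
class ObjectStore:
    """Toy stand-in for a git objects/ directory (hypothetical, not a real API)."""

    def __init__(self, name, alternates=()):
        self.name = name
        self.objects = {}                   # sha -> payload
        self.refs = {}                      # ref name -> sha
        self.alternates = list(alternates)  # like objects/info/alternates

    def write(self, sha, payload):
        self.objects[sha] = payload

    def has(self, sha):
        # Lookups fall through to the alternates when the object is not local.
        return sha in self.objects or any(a.has(sha) for a in self.alternates)

    def gc(self):
        # Simplified prune: keep only objects named directly by a LOCAL ref.
        # Refs in repos that borrow from this store are never consulted.
        keep = set(self.refs.values())
        self.objects = {s: p for s, p in self.objects.items() if s in keep}


origin = ObjectStore("chromium-src")
subtree = ObjectStore("subtree", alternates=[origin])

origin.write("upstream1", "upstream commit")
origin.refs["refs/heads/master"] = "upstream1"

# The suspected bug: the synthesized commit is written to the origin store,
# while the ref that needs it lives in the subtree repo.
origin.write("synth1", "synthesized commit")
subtree.refs["refs/heads/master"] = "synth1"

visible_before_gc = subtree.has("synth1")
origin.gc()
visible_after_gc = subtree.has("synth1")
```

In this model the subtree's master ref resolves before the origin GC and dangles afterwards, which is the shape of the "does not point to a valid object" error above.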

Next steps:
1) See if gsubtreed is actively doing this. If so, why? If for no good reason, fix it.
2) See if git itself is doing this. If so, why? Can we build a minimal reproduction case?
3) If yes, can it be circumvented? Can it be fixed by updating the bot to a newer git?
 
That smells a lot like a plain old bug (and is quite plausible). Synthesized commits should go into the local subtree repos, not the main repo.

IIRC, the local subtree repo and the main repo used to be the same repo. I don't remember when or why this changed... However, I can certainly imagine this bug being introduced at that time.

Comment 2 by aga...@chromium.org, Feb 27 2018

Looks like it's because gsubtreed uses infra.libs.git2.commit.Commit.alter() to generate the new commit object from the old one. alter() automatically interns the new commit object into the associated repo, and it doesn't overwrite the repo of the new commit, so it gets intern'd in the old origin repo.

solutions:
1) don't use .alter()
2) make .alter() more flexible

Haven't looked into either yet, but intuitively (1) makes more sense since we're *generating* a new commit, not altering an existing one.

Comment 3 by aga...@chromium.org, Feb 27 2018

Owner: aga...@chromium.org
Status: Started (was: Available)
https://chromium-review.googlesource.com/c/infra/infra/+/938635 might do it?
Project Member

Comment 4 by bugdroid1@chromium.org, Feb 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/c425e6b10d4b1f76fc3e0bd4308c2af37394579a

commit c425e6b10d4b1f76fc3e0bd4308c2af37394579a
Author: Aaron Gable <agable@chromium.org>
Date: Tue Feb 27 18:24:06 2018

gsubtreed: store synthesized commits in subtree repo

The Commit.alter() method does two things: pass through its arguments
to a CommitData.alter() call, and then intern the new CommitData
object in the existing Commit's repo. But in gsubtreed, that's not
what we want to do: we want to intern the new commit in the
subtree repo, not in the origin repo. This change splits the data
alteration and the repository interning into two steps so that they
can each happen in the right place.

Bug:  816733 
Change-Id: If7f843f35e3c255f5bec629c847e7fcf7467cc76
Reviewed-on: https://chromium-review.googlesource.com/938635
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Reviewed-by: Alan Bram <flyboy@chromium.org>
Commit-Queue: Aaron Gable <agable@chromium.org>

[modify] https://crrev.com/c425e6b10d4b1f76fc3e0bd4308c2af37394579a/infra/services/gsubtreed/gsubtreed.py
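
The split described in the commit message can be pictured with a small self-contained sketch. ToyRepo, ToyCommit and ToyCommitData are hypothetical stand-ins written for this illustration; they are not the real infra.libs.git2 classes and their signatures are assumptions. The only behaviour borrowed from the description is that the one-step alter() interns the result in the commit's own repo.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyCommitData:
    """Hypothetical immutable commit payload."""
    tree: str
    message: str

    def alter(self, **changes):
        # Pure data transformation: no repository is touched.
        return replace(self, **changes)


class ToyRepo:
    """Hypothetical repo that records what was interned into it."""

    def __init__(self, name):
        self.name = name
        self.interned = []

    def intern(self, data):
        self.interned.append(data)
        return ToyCommit(self, data)


class ToyCommit:
    def __init__(self, repo, data):
        self.repo = repo
        self.data = data

    def alter(self, **changes):
        # One-step version: alter the data, then intern in THIS commit's repo.
        return self.repo.intern(self.data.alter(**changes))


origin = ToyRepo("origin")
subtree = ToyRepo("subtree")
upstream = ToyCommit(origin, ToyCommitData(tree="tree-a", message="upstream"))

# Old behaviour: the synthesized commit lands in the origin repo.
buggy = upstream.alter(message="synthesized")

# Two-step version: alter the data only, then intern where it belongs.
fixed = subtree.intern(upstream.data.alter(message="synthesized"))
```

Here buggy.repo is the origin repo and fixed.repo is the subtree repo, matching the before and after described in the commit message.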

Comment 5 by aga...@chromium.org, Feb 27 2018

While watching to see if this improves things, noticed this nice helpful error message in the logs:

[I2018-02-27T18:32:47.989841+00:00 31722 139902678640448 infra.libs.git2.repo:231] Running ('git', 'fetch')
From https://chromium.googlesource.com/a/chromium/src
   82aff9dd084a..d6db2b70703b  master     -> master
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
error: The last gc run reported the following. Please correct the root cause
and remove gc.log.
Automatic cleanup will not be performed until the file is removed.
warning: There are too many unreachable loose objects; run 'git prune' to remove them.

Yeah, that seems like a Bad Thing.

Comment 6 by aga...@chromium.org, Feb 27 2018

Once I observe this working, I plan to delete all of the subtree repos from the bot (so they get re-cloned with no missing objects) and git-gc-prune the chromium-src repo on the bot so that it doesn't have all the extra objects floating around either.

Comment 7 by aga...@chromium.org, Feb 27 2018

Confirmed, commit objects are now being created in the subtree repos and *not* in the chromium-src origin repo.

Will delete things for a clean slate momentarily.

Comment 8 by aga...@chromium.org, Feb 27 2018

Status: Fixed (was: Started)
Hopefully fixed; will reopen if we see this continuing in the near future.
Did you make sure to change the deployment ref? IIRC, gsubtreed uses the infra/deployed git ref.
Yes, see PSA sent to chrome-infra@
