Issue 5200 Restarting gerrit causes data loss
Starred by 4 users. Reported by thomasmu...@yahoo.com, Dec 27
Status: Released
Owner: ----
Closed: Dec 31

Affected Version: 2.13+

What steps will reproduce the problem?
1. Create some changes in the Gerrit UI.
2. Restart Gerrit.
3. Try accessing those changes from Gerrit's UI.

What is the expected output?
* Gerrit should restart without any data loss.

What do you see instead?
* You get data loss, i.e. you can't access your changes any more.

Please provide any additional information below.

See these threads https://groups.google.com/forum/#!topic/repo-discuss/o80gRbwziS8 and https://groups.google.com/forum/#!topic/repo-discuss/bUmnCnQ9A20

Also we have a downstream task https://phabricator.wikimedia.org/T154205
 
For us this issue is a showstopper for rolling out 2.13 further in our organization. A full offline reindex that takes 4 hours on every Gerrit restart is unacceptable. The system needs to run 24/7 because it covers all timezones.
There should be a gerrit stop command that puts Gerrit into a kind of read-only mode so that no one can upload changes; Gerrit should then index the outstanding changes and only then stop. The same should apply to the restart command. There should also be force-stop and force-restart commands to force a stop or restart.

Please note that this is a regression that was introduced somewhere after 2.11.10. No idea whether 2.12 has the same issue. Given that 2.11 does not need such a pause mode for uploading changes and syncs properly to the index, something else must be going wrong here. Further, even with the setting that forces each index write to be flushed to disk immediately (commitWithin) set to zero, data is still missing from the index, which confirms this. There is a deeper problem here.
Actually, it looks like this commit https://gerrit.googlesource.com/gerrit/+/3b553fa968afa11103784db111fd84534f572775%5E%21/#F0 changes something to do with flush.
Project Member Comment 8 by ekempin@google.com, Dec 29
Did you verify that this change is the culprit, e.g. by using git bisect?
Hi, I wasn't suggesting that was the culprit; what I was saying was that it changed fetch. It doesn't seem likely that it is the problem, as it changes fetch for the manual function.

I was only looking through the commits for something that looks like it could have changed something.
Sorry if you thought I was saying this is the commit that broke it.
Project Member Comment 11 by ekempin@google.com, Dec 29
> I was only looking through the commits looking for something that looks like it
> could have changed something.

OK, but can someone who is able to reproduce the problem use git bisect to identify the commit where it got broken? Then we don't need to guess anymore.
Hi, doing this:

git checkout v2.13.4

git bisect start HEAD v2.11.10 (known release that works)

results in:

$ git bisect start HEAD v2.11.10
Previous HEAD position was 8140f0cb74... Update git submodules
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
Bisecting: 2958 revisions left to test after this (roughly 12 steps)
[f8321a8550a38fcc5065612fdf59d42435203c21] Merge "Don't aggressively flush objects when merging"

and it is pointing at this commit: https://gerrit.googlesource.com/gerrit/+/fce05db6e433f37aa230a55451bb168525c93493
Project Member Comment 13 by ekempin@google.com, Dec 29
This is not the final result yet.
Please check a git-bisect tutorial like [1] to learn what bisect is and how to use it.

[1] https://git-scm.com/book/en/v2/Git-Tools-Debugging-with-Git#Binary-Search
Hi, thanks. What should I look for? It will land on commits, some of them unrelated, and for some I won't know whether they are related or not.

I did git checkout stable-2.13

then

git bisect start

git bisect bad

$ git bisect good v2.11.10
Bisecting: 2962 revisions left to test after this (roughly 12 steps)
[54d578d9adf526f9e733467116413e755498fb99] ReceiveCommits: Use BatchUpdate to mark changes merged on push
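To make the bisect workflow discussed above concrete, here is a minimal, self-contained sketch. It does not touch the Gerrit repository: it builds a throwaway repository in a temporary directory and lets git bisect run find the first bad commit. The tag known-good, the file state.txt and the grep check are all hypothetical stand-ins. For the real problem, the check at each revision would be something like "build, create a change, restart, see whether the change is still accessible", which this sketch does not attempt.

```shell
#!/bin/sh
set -eu

# Throwaway repository with ten commits; commit 7 introduces the "bug".
# Everything here (file names, tag name, check) is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "bisect-demo@example.com"
git config user.name "Bisect Demo"

i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -ge 7 ]; then
        echo "fail" > state.txt   # revisions 7..10 are "bad"
    else
        echo "pass" > state.txt   # revisions 1..6 are "good"
    fi
    echo "$i" > counter.txt       # guarantees every commit has a change
    git add state.txt counter.txt
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then
        git tag known-good        # stand-in for a release known to work
    fi
    i=$((i + 1))
done

# Mark HEAD as bad and the tag as good, then let git drive the search.
# The check must exit 0 for a good revision and 1..127 (except 125) for a bad one.
git bisect start HEAD known-good > /dev/null
git bisect run sh -c 'grep -q pass state.txt' > /dev/null

# After the run finishes, refs/bisect/bad points at the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
first_bad_subject=$(git log -1 --format=%s "$first_bad")
git bisect reset > /dev/null 2>&1

echo "first bad commit: $first_bad_subject"   # prints: first bad commit: commit 7
```

When the check cannot be scripted, the same search is done by hand: test the revision git checks out, run git bisect good or git bisect bad depending on the result, and repeat until git reports "is the first bad commit". The intermediate commits do not need to look related to the problem; only the test result at each step matters.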


Hi all,

Bisecting brings us to this commit:

2e382534c0b96d48e9ec6a257bb692c332e1d302 is the first bad commit
commit 2e382534c0b96d48e9ec6a257bb692c332e1d302
Author: Hugo Arès <hugo.ares@ericsson.com>
Date:   Thu Sep 22 17:43:15 2016 +0200

    Revert "Drain executor of index change requests before closing index"

    This reverts commit 804f1d9c19d600e5cd94fd554360421d415b5f4a.

    The intention was to drain and close the executor before closing the
    index but closing an index doesn't necessarily means that executor
    should be closed. After online reindexing is completed, the old index
    version is closed which closed the executor and make any subsequent
    Lucene interaction fails.

    Bug:  Issue 4618 
    Change-Id: I6ec90eb73312008714aa790308e806a0134a124e

:040000 040000 946c061c8235ab034e3b1bc91f0e0e3aed53c67e e00f0db44011d0940b6fa7719605a78235dadca4 M      gerrit-lucene
Labels: FixedIn-2.13.5
Status: Submitted
https://gerrit-review.googlesource.com/#/c/93479/
Status: Released
Project Member Comment 19 by icee...@googlemail.com, Jan 9
See https://bugs.chromium.org/p/gerrit/issues/detail?id=4919 which seems to be the same as well.