Starred by 4 users
Status: Released
Owner: ----
Closed: Dec 2016



Restarting gerrit causes data loss
Project Member Reported by thomasmu...@yahoo.com, Dec 27 2016

Affected Version: 2.13+

What steps will reproduce the problem?
1. Create some changes in the Gerrit UI.
2. Restart Gerrit.
3. Try accessing those changes from Gerrit's UI.

What is the expected output?
* I expect that gerrit should restart without any data loss.

What do you see instead?
* You get data loss, i.e. you can't access your changes any more.

Please provide any additional information below.

See these threads https://groups.google.com/forum/#!topic/repo-discuss/o80gRbwziS8 and https://groups.google.com/forum/#!topic/repo-discuss/bUmnCnQ9A20

Also we have a downstream task https://phabricator.wikimedia.org/T154205
 
For us this issue is a showstopper for rolling out 2.13 further in our organization. It is unacceptable to do a full offline reindex that takes 4 hours for each and every Gerrit restart. The system needs to be running 24/7 due to covering all timezones.
Project Member Comment 2 by thomasmu...@yahoo.com, Dec 28 2016
There should be a gerrit stop command that puts Gerrit into a kind of read-only mode so that no one can upload changes, then indexes the outstanding changes, and only then stops. The same should apply to the restart command. There should also be force-stop and force-restart commands to force a stop or restart.

Please note that this is a regression that was introduced somewhere after 2.11.10. No idea if 2.12 has the same issue. Given that 2.11 does not need such a pause mode for uploading changes and syncs to the index properly, something else must be going wrong here. Further, data is still missing from the index even with the setting that forces each index write to be flushed to disk immediately (commitWithin) set to zero, which confirms this. There is a deeper problem here.
Project Member Comment 4 by thomasmu...@yahoo.com, Dec 28 2016
I am not sure if this https://github.com/gerrit-review/gerrit/commit/27ec2bad32b4cc81e3a7d0bdf2b502ab5930fa6b is some kind of fix for it?
Project Member Comment 6 by thomasmu...@yahoo.com, Dec 28 2016
Project Member Comment 7 by thomasmu...@yahoo.com, Dec 28 2016
Actually it looks like this https://gerrit.googlesource.com/gerrit/+/3b553fa968afa11103784db111fd84534f572775%5E%21/#F0 changes something to do with flush.
Project Member Comment 8 by ekempin@google.com, Dec 29 2016
Did you verify that this change is the culprit, e.g. by using git bisect?
Project Member Comment 9 by thomasmu...@yahoo.com, Dec 29 2016
Hi, I wasn't suggesting that was the culprit; what I was saying was that it changed fetch. It doesn't seem likely that it is the problem, as it changes fetch for the manual function.

I was only looking through the commits looking for something that looks like it could have changed something.
Project Member Comment 10 by thomasmu...@yahoo.com, Dec 29 2016
Sorry if you thought I was saying this is the commit that broke it.
Project Member Comment 11 by ekempin@google.com, Dec 29 2016
> I was only looking through the commits looking for something that looks like it could have changed something.

OK, but can someone who is able to reproduce the problem use git bisect to identify the commit where it got broken? Then we don't need to guess anymore.
Project Member Comment 12 by thomasmu...@yahoo.com, Dec 29 2016
Hi, doing this:

git checkout v2.13.4

git bisect start HEAD v2.11.10 (known release that works)

results in

$ git bisect start HEAD v2.11.10
Previous HEAD position was 8140f0cb74... Update git submodules
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
Bisecting: 2958 revisions left to test after this (roughly 12 steps)
[f8321a8550a38fcc5065612fdf59d42435203c21] Merge "Don't aggressively flush objects when merging"

and it is pointing at this commit: https://gerrit.googlesource.com/gerrit/+/fce05db6e433f37aa230a55451bb168525c93493
Project Member Comment 13 by ekempin@google.com, Dec 29 2016
This is not the final result yet.
Please check a git-bisect tutorial like [1] to learn what bisect is and how to use it.

[1] https://git-scm.com/book/en/v2/Git-Tools-Debugging-with-Git#Binary-Search
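
For anyone following along, here is a minimal sketch of a complete bisect loop, run against a throwaway repository rather than a real Gerrit checkout. Everything in it is an assumption made up for illustration: the scratch repo, the `known-good` tag, the `status.txt` marker file and the `grep` check. In the real case the check at each step would be the reproduction from the report (build Gerrit at that commit, create a change, restart, see whether the change is still accessible), and each step would be marked by hand with `git bisect good` or `git bisect bad`.

```shell
#!/bin/sh
# Sketch only: a scratch repo with a fake regression, not a Gerrit checkout.
set -eu

orig=$PWD
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "bisect-demo@example.com"   # hypothetical identity
git config user.name "Bisect Demo"

# Create ten commits; from commit 7 onwards status.txt says "fail".
i=1
while [ "$i" -le 10 ]; do
  if [ "$i" -ge 7 ]; then echo fail > status.txt; else echo pass > status.txt; fi
  echo "$i" > counter.txt
  git add status.txt counter.txt
  git commit -q -m "commit $i"
  if [ "$i" -eq 1 ]; then git tag known-good; fi
  i=$((i + 1))
done

# Same form as in this thread: bad revision first, then the known good one.
git bisect start HEAD known-good >/dev/null

# Let git drive the search: exit status 0 means good, 1 means bad.
git bisect run grep -q pass status.txt >/dev/null

# refs/bisect/bad now points at the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad: $first_bad"

git bisect reset >/dev/null 2>&1
cd "$orig"
rm -rf "$repo"
```

The first `Bisecting: N revisions left to test` line is only the starting point of the search. The result is the line that reads `<sha> is the first bad commit`, which appears once every intermediate commit has been marked.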
Project Member Comment 14 by thomasmu...@yahoo.com, Dec 29 2016
Hi, thanks. What should I look for? It will find some commits that are unrelated, and others where I won't know whether they are related or not.

I did git checkout stable-2.13

then

git bisect start

git bisect bad

$ git bisect good v2.11.10
Bisecting: 2962 revisions left to test after this (roughly 12 steps)
[54d578d9adf526f9e733467116413e755498fb99] ReceiveCommits: Use BatchUpdate to mark changes merged on push


Hi all,

Bisecting brings us to this commit:

2e382534c0b96d48e9ec6a257bb692c332e1d302 is the first bad commit
commit 2e382534c0b96d48e9ec6a257bb692c332e1d302
Author: Hugo Arès <hugo.ares@ericsson.com>
Date:   Thu Sep 22 17:43:15 2016 +0200

    Revert "Drain executor of index change requests before closing index"

    This reverts commit 804f1d9c19d600e5cd94fd554360421d415b5f4a.

    The intention was to drain and close the executor before closing the
    index but closing an index doesn't necessarily means that executor
    should be closed. After online reindexing is completed, the old index
    version is closed which closed the executor and make any subsequent
    Lucene interaction fails.

    Bug:  Issue 4618 
    Change-Id: I6ec90eb73312008714aa790308e806a0134a124e

:040000 040000 946c061c8235ab034e3b1bc91f0e0e3aed53c67e e00f0db44011d0940b6fa7719605a78235dadca4 M      gerrit-lucene
Project Member Comment 16 by thomasmu...@yahoo.com, Dec 29 2016
Ah, thank you so much. Reverting https://github.com/gerrit-review/gerrit/commit/2e382534c0b96d48e9ec6a257bb692c332e1d302 fixed it for me.
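
A minimal sketch of what reverting a single commit from the middle of the history looks like, again on a throwaway repository. The repo, file names and commit messages are hypothetical. On a real checkout the equivalent step would be `git revert` with the SHA reported by bisect, followed by a rebuild.

```shell
#!/bin/sh
# Sketch only: revert one bad commit while keeping later, unrelated work.
set -eu

orig=$PWD
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "revert-demo@example.com"   # hypothetical identity
git config user.name "Revert Demo"

echo pass > status.txt
git add status.txt
git commit -q -m "working state"

echo fail > status.txt
git commit -q -am "regression"
bad=$(git rev-parse HEAD)

echo note > notes.txt
git add notes.txt
git commit -q -m "unrelated later work"

# Create a new commit that undoes only the bad one; --no-edit skips the editor.
git revert --no-edit "$bad" >/dev/null

result=$(cat status.txt)
echo "status after revert: $result"

cd "$orig"
rm -rf "$repo"
```

The later commit is left untouched, so only the behaviour introduced by the bad commit is rolled back.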
Labels: FixedIn-2.13.5
Status: Submitted
https://gerrit-review.googlesource.com/#/c/93479/
Status: Released
Project Member Comment 19 by icee...@googlemail.com, Jan 9 2017
See https://bugs.chromium.org/p/gerrit/issues/detail?id=4919, which seems to be the same issue.
Comment 20 Deleted
Comment 21 Deleted