
Issue 715306

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocking:
issue 729691
issue 753076




use a 3-way push for automated_deploy, to avoid the problem of lost git hash ranges from failed deployments

Project Member Reported by akes...@chromium.org, Apr 25 2017

Issue description

Today, automated deploy does the following:

1) git push prod-next prod (+ print resulting range?)
2) run the deploy script, which deploys |prod|, takes a while, and affects a bunch of servers
3) re-print the output from 1


If #2 fails in any way, which it often does, we lose the output from 1. Big hassle.


Improvement
1) git push prod-next prod-pushing
2) deploy prod-pushing to everything
3) git push prod-pushing prod -> print this commit range.

This way we never lose the pushed range if we fail in #2.
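
The improved flow above can be sketched as follows. This is only an illustration under assumptions, not the actual automated_deploy code: the function names `run_git` and `three_way_push` and the `deploy` callback are hypothetical, and the remote is assumed to be called `origin`. The branch names come from the description.

```python
import subprocess


def run_git(args, cwd):
    """Run a git command in cwd and return its stdout (raises on failure)."""
    return subprocess.run(
        ["git"] + list(args), cwd=cwd, check=True,
        capture_output=True, text=True,
    ).stdout.strip()


def three_way_push(repo_dir, deploy, git=run_git):
    """Stage prod-next on prod-pushing, deploy, then promote to prod.

    `deploy` is a hypothetical callback that deploys the given revision.
    Returns the "old..new" commit range that was promoted.
    """
    git(["fetch", "origin"], repo_dir)
    old_prod = git(["rev-parse", "origin/prod"], repo_dir)
    new_head = git(["rev-parse", "origin/prod-next"], repo_dir)

    # Step 1: record the candidate on the server before deploying anything.
    git(["push", "origin", new_head + ":refs/heads/prod-pushing"], repo_dir)

    # Step 2: deploy. If this raises, prod is untouched, so the range
    # prod..prod-pushing can still be read back from the remote.
    deploy(new_head)

    # Step 3: promote only after a successful deploy.
    git(["push", "origin", new_head + ":refs/heads/prod"], repo_dir)
    return "%s..%s" % (old_prod, new_head)
```

Because `prod` only moves in step 3, a failure in step 2 leaves the unfinished range visible on the server as the difference between `prod` and `prod-pushing`.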
 
Labels: -current-issue
Another easy way to fix this issue is to write the push logs to a template file and log the path to that file at the end of the deploy, whether it fails or succeeds.
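
A minimal sketch of this file-based alternative, assuming the push log is already available as a string. The helper name `deploy_with_saved_push_log`, the `deploy` callback, and the `report` parameter are hypothetical and only serve the illustration.

```python
import tempfile


def deploy_with_saved_push_log(push_log, deploy, report=print):
    """Save push_log to a file, run deploy(), and always report the path."""
    with tempfile.NamedTemporaryFile(
            mode="w", prefix="push_log_", suffix=".txt", delete=False) as f:
        f.write(push_log)
        log_path = f.name
    try:
        deploy()
    finally:
        # Runs whether deploy() returned normally or raised.
        report("Push log saved to %s" % log_path)
    return log_path
```

The file lives only on the machine that ran the deploy, so it does not help anyone who retries from a different machine.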

The big con of the approach mentioned above is that it requires handling 3*N branches during push, where N is the number of git repos. If we add devserver update to push-to-prod later, N will be 3, and we will have 9 branches.

I prefer the first approach, since even with the second approach, the developer still needs to manually run several commands to get the pushed commits. E.g.,
cd ~/autotest
git log xxx...bbb


> I prefer the first approach

Which one is the first approach?
Save only the push logs to a template file
> even with the second approach, the developer still needs to manually run several commands

Why? The deploy script can run the git log command at the end, right after doing the `git push prod-pushing prod` to determine pushed range. The likelihood of failure then becomes very low, and the flow is fully automated.
Hmm, the workflow is
1. push prod to prod-next (in automated_deploy script)
2. Run deploy
3. log pushed commits using push prod-pushing...prod
4. push prod-pushing to prod (in deploy_server script)

Yeah, this is doable, but this needs to create new branches every time we want to add new repos to push-to-prod, and to change the related code each time. I think saving the pushed commits to a file can save all the effort here and also fix the issue. Is there any concern about this easy approach?



> but this needs to create new branches every time we want to add new repos to push-to-prod

Creating a branch is easy, takes a few clicks.

> I think saving the pushed commits to a file can save all the effort here and also fix the issue. Is there any concern about this easy approach?

That's not really different from the current approach of saving a log. What if I rerun the script without checking the log because I don't know about it? What if I get halfway through but give up, and then tomorrow a different deputy tries to deploy and doesn't have my log?
> That's not really different from the current approach of
> saving a log. What if I rerun the script without checking
> the log because I don't know about it? What if I get halfway
> through but give up, and then tomorrow a different deputy
> tries to deploy and doesn't have my log?

+1.  I would very much like this process to be more robust to
various sorts of failures.  That means we need some way to roll
back/retry if the deployment script fails.

Another example of this behaving badly. automated_deploy bailed out because there were no new chromite changes (but there were new autotest changes). If I just re-run the script now, it has already pushed autotest and will claim no new autotest changes.

$ ./automated_deploy.py | tee ../deploy.log
INFO:root:Cloning git repo https://chromium.googlesource.com/chromiumos/third_party/autotest
DEBUG:root:Running '/usr/bin/git --git-dir=/tmp/autotest/.git clone https://chromium.googlesource.com/chromiumos/third_party/autotest /tmp/autotest -b prod'
INFO:root:
INFO:root:Cloning git repo https://chromium.googlesource.com/chromiumos/chromite
DEBUG:root:Running '/usr/bin/git --git-dir=/tmp/chromite/.git clone https://chromium.googlesource.com/chromiumos/chromite /tmp/chromite -b prod'
INFO:root:
Traceback (most recent call last):
  File "./automated_deploy.py", line 194, in <module>
Cloning autotest prod branch under /tmp/autotest
Running command: rm -rf /tmp/autotest
Successfully cloned autotest prod branch
Updating autotest prod branch.
Running command: git rebase origin/prod-next prod
First, rewinding head to replay your work on top of it...
Fast-forwarded prod to origin/prod-next.
Running command: git push origin prod
remote: Processing changes: done            
To https://chromium.googlesource.com/chromiumos/third_party/autotest
   d2359125e..c48be08d9  prod -> prod
Successfully pushed autotest prod branch!

Getting pushed CLs for autotest repo.
Running command: git log --oneline d2359125e..c48be08d9|grep autotest
6970014cb autotest: clean tree during deploy_server_local
5f45b2873 autotest: silently fail PIL import in chameleon.py
Successfully got pushed CLs for autotest repo!

Cloning chromite prod branch under /tmp/chromite
Running command: rm -rf /tmp/chromite
Successfully cloned chromite prod branch
Updating chromite prod branch.
Running command: git rebase origin/prod-next prod
Current branch prod is up to date.
Running command: git push origin prod
Everything up-to-date
Successfully pushed chromite prod branch!

Deploy fails with error:
Fail to get pushed commits for repo chromite from git push log: Everything up-to-date

Push log:

autotest:
git log --oneline d2359125e..c48be08d9|grep autotest
6970014cb autotest: clean tree during deploy_server_local
5f45b2873 autotest: silently fail PIL import in chameleon.py



    sys.exit(main(sys.argv))
  File "./automated_deploy.py", line 173, in main
    repo, repo_dir, hash_to_rebase)
  File "./automated_deploy.py", line 118, in update_prod_branch
    (repo, result))
__main__.AutoDeployException: Fail to get pushed commits for repo chromite from git push log: Everything up-to-date
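
The failure above comes from treating "Everything up-to-date" as an unparseable push log. A hedged sketch of a parser that treats that message as "nothing was pushed" instead of an error is shown below. The function name `parse_pushed_range` and the regular expression are assumptions for illustration and are not taken from automated_deploy.py.

```python
import re

# Matches a fast-forward line of `git push` output such as
# "   d2359125e..c48be08d9  prod -> prod".
_RANGE_RE = re.compile(
    r"^\s*([0-9a-f]{7,40})\.\.([0-9a-f]{7,40})\s+\S+\s+->\s+\S+",
    re.MULTILINE)


def parse_pushed_range(push_output):
    """Return (old, new) hashes from push output, or None if nothing moved."""
    if "Everything up-to-date" in push_output:
        return None
    match = _RANGE_RE.search(push_output)
    if match is None:
        raise ValueError("Unrecognized git push output: %r" % push_output)
    return match.group(1), match.group(2)
```

With this shape, a caller can skip the `git log` step for a repo that returns `None` and continue with the remaining repos.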

Project Member

Comment 10 by bugdroid1@chromium.org, May 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/673519bedaed088c078d9dab5b30e0fc3b7b55e9

commit 673519bedaed088c078d9dab5b30e0fc3b7b55e9
Author: Shuqian Zhao <shuqianz@chromium.org>
Date: Mon May 08 19:41:03 2017

autotest: Skip updating the repo when it is already up-to-date

Currently, the server deployment will fail when either autotest or
chromite repo is already up-to-date. Add a check before updating the
repo to see whether it is already up-to-date. If yes, skip it.

BUG= chromium:715306 
TEST=unittest

Change-Id: I68ca235cabb961663d6fce95dd10c43e23520f8b
Reviewed-on: https://chromium-review.googlesource.com/497593
Commit-Ready: Shuqian Zhao <shuqianz@chromium.org>
Tested-by: Shuqian Zhao <shuqianz@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/673519bedaed088c078d9dab5b30e0fc3b7b55e9/site_utils/automated_deploy.py
[modify] https://crrev.com/673519bedaed088c078d9dab5b30e0fc3b7b55e9/site_utils/automated_deploy_unittest.py
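
For illustration only, one way such an up-to-date check could look is sketched here. This is not the code from the commit above: the helper names `rev_parse`, `is_up_to_date`, and `update_stale_repos` are hypothetical, and comparing the local `prod` with `origin/prod-next` is an assumption based on the rebase command shown in the log.

```python
import subprocess


def rev_parse(repo_dir, ref):
    """Return the commit hash that ref points to in repo_dir."""
    return subprocess.run(
        ["git", "rev-parse", ref], cwd=repo_dir, check=True,
        capture_output=True, text=True,
    ).stdout.strip()


def is_up_to_date(repo_dir, resolve=rev_parse):
    """True if prod already points at the same commit as origin/prod-next."""
    return resolve(repo_dir, "prod") == resolve(repo_dir, "origin/prod-next")


def update_stale_repos(repo_dirs, update, resolve=rev_parse):
    """Call update(path) only for repos that have something new to push.

    repo_dirs maps a repo name to its checkout path. Returns a dict of
    repo name -> whatever update() returned, omitting skipped repos.
    """
    results = {}
    for name, path in repo_dirs.items():
        if is_up_to_date(path, resolve):
            continue  # Nothing to push for this repo; do not fail.
        results[name] = update(path)
    return results
```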

Blocking: 729691
Isn't the reflog a much simpler way to view the change to the ref?
Yeah, but we still need branches to point to the different place we are in during pushing, so that the reflog can be clear.
Re #12: the reflog on what machine? The problem is that there is currently no single source of truth for what should be considered the (previous, in-progress, finished) push hash, because no single machine can know this.
What do you mean by "different place"? If I understand correctly, we just want to remember the previous value of the "prod" branch, right?
Yes, as Aviv commented, we need branches to record the commits, since no single machine can know this.
Are you hypothesizing multiple developer desktops would be deploying, and only the first would have the correct reflog entries?
Yes. Deputy A tries to push. Push fails in the middle and deputy gives up. Next day, deputy B tries to push.

This happens often.
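
If the in-progress state lived in a server-side branch, deputy B could detect deputy A's unfinished push from any machine. The sketch below assumes the `prod-pushing` branch proposed in the issue description and parses the text printed by `git ls-remote origin refs/heads/prod refs/heads/prod-pushing`. The function name `pending_range` is hypothetical.

```python
def pending_range(ls_remote_output):
    """Return "prod..prod-pushing" hashes if a push is unfinished, else None.

    ls_remote_output is the text printed by `git ls-remote`, one
    "<hash><TAB><refname>" entry per line.
    """
    refs = {}
    for line in ls_remote_output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split()
        refs[ref] = sha
    prod = refs.get("refs/heads/prod")
    pushing = refs.get("refs/heads/prod-pushing")
    if prod is None or pushing is None or prod == pushing:
        return None
    return "%s..%s" % (prod, pushing)
```

A non-`None` result means an earlier deploy staged commits that never reached `prod`, so the range is still recoverable without anyone's local log or reflog.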
Blocking: 753076
Status: Fixed (was: Assigned)
This was fixed a while ago.
