
How can I push and automatically update working copy of my non-bare remote git repository?

Tags:

git

See also this related question, which has the most current answer: https://stackoverflow.com/a/28262104/7918.

Before git people say "this is unsafe", let me explain why this would work in my case. I have a repository foo on machines A and B. A is my local machine; B is on a supercomputing grid. The project is normally developed on A (with some testing), then pushed to B for more testing and to submit jobs to the grid. Both repositories have working directories. The repositories on both A and B are private and tied to my user account.

For now my git workflow looks like this:

  1. Commit on A
  2. Push on A
  3. Fetch on B
  4. Merge on B

I'd rather omit steps 3 and 4. That is, I'd like the remote non-bare repository to be updated on push if the changes from A can be merged automatically and safely (that is, when the merge is a fast-forward and the working directory on B is clean).

The first solution would be to drop the repository on B and just use rsync to sync the code, but this is undesirable because I sometimes make changes directly on B and don't want them to be overwritten easily.

The second solution would be to install this patch, which even got merged into msysgit (but not git proper). This patch adds an updateInstead value to the receive.denyCurrentBranch git option. This is viable, but I'd rather not patch git on many machines.
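For reference, a minimal sketch of how that option would be switched on for the repository on B. It assumes a git build that understands the updateInstead value (the patched build mentioned above or, as far as I know, mainline git from version 2.3 onward). The function name and the path argument are illustrative only.

```shell
# Hypothetical helper: enable "update the work tree on push" for one repository.
# $1: path to the non-bare repository on B (an assumption, not from the question).
enable_update_instead() {
    git -C "$1" config receive.denyCurrentBranch updateInstead
}
```

It would be run once on B, for example as `enable_update_instead ~/foo`.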

The third solution (taken from here) would involve three git repositories: A, B', and B, where B' is bare and uses hooks to sync A and B. This is really straightforward, but I suspect that having three repositories would make the whole system more fragile.

The last solution would be preferred: use hooks on repository B to automatically merge pushed changes into the working copy. Everyone says this is straightforward, but my knowledge of git internals is too weak to put together something sensible. I did some work on this, but nothing that actually works.

What I'm looking for:

  • A hook-based solution that automatically merges pushed changes into the remote non-bare repo on push, and that does not require me to recompile git or create additional repositories.
jb. asked Aug 08 '14 16:08


2 Answers

The difficult part is not having machine B update, but rather defining precisely how you want B to update. For instance:

  • What if on machine B, the branch that is checked out is test3, and I push to its branch test2? Should machine B's working copy be changed at all?
  • What if on machine B, the branch that is checked out is deploy, but someone (perhaps even me) on B is actively editing work-tree files, and I push to B's deploy? Should it wipe out what I'm doing right now?
  • What if on machine B, the branch that is checked out is deploy, but I've made some changes there and checked them in, and now I make changes on A and force-push to deploy and there would be a merge conflict if this were a real merge? (In fact, merge does not really apply on push, as I'll describe in a bit more detail below.)

These questions rarely have a single right answer. That's why git push has the receive.denyCurrentBranch option in the first place: if the answer to the first question above is assumed to be no (it usually is no), then only updating the currently-checked-out branch raises the remaining questions. If we deny the ability to do that, why then, all those questions vanish and we don't have to think hard about the answers! :-)

There's a simple way to sidestep all of this, which is to have a bare repository on the receiving machine, and then what you might call a "bare work tree" (no .git directory in it) somewhere else on that same machine. That way there's no direct notion of "current branch" in the first place (although it sneaks back in through the back door, as it were).

There's a fundamental asymmetry here in git, in that when you git fetch from a remote, you get commits and other objects from them, stuff those in your repository, and then update your remote branches. After git fetch origin you may have a new origin/master, but you do not have a new master. This gives you a stopping point, an intermediate step, during which you get to pause, rest a bit, look at what's just come in, and decide whether and how to rebase or merge the changes.

When you git push c0ffee3:master to a remote, however, you send your commits and other objects over and they (the remote) stuff the objects into their repository, and then they update their branch master to your commit (which is also now their commit) whose ID is c0ffee3. There's no pause for evaluation; there's no chance to rebase or merge; you've replaced their master with your c0ffee3. For that matter, your c0ffee3 does not have to be your master at all. Any suitable repository object—that's any commit ID or any annotated tag ID—is sufficient if you force-push (provided there's no fancy remote hook to deny you).
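To make the asymmetry concrete, here is a rough sketch of the two sides as shell functions. The remote name origin and the function names are assumptions for illustration; c0ffee3 is the placeholder commit ID used above.

```shell
# Fetch side: three separate steps, with a pause between receiving and integrating.
pull_with_pause() {
    git fetch origin || return 1             # only origin/master moves here
    git log --oneline master..origin/master  # look at what just came in
    git merge --ff-only origin/master        # integrate, if it is a fast-forward
}

# Push side: one step; the remote's master is set to the pushed commit right away.
# $1: commit ID to publish as the remote's master.
push_commit_to_master() {
    git push origin "$1:master"
}
```

On the fetch side, any of the three steps can be skipped or replaced (with a rebase, say); on the push side there is nothing to skip.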

All that said, though, let's go back to the "bare work tree" idea. Here, on machine B—let's stop calling this "the remote" now, and just say "here on B"—we'll have a bare repository so that we can take incoming pushes regardless of what git may think is the "current branch".

Next, we'll answer the "what if" questions with this: whenever we receive anything new for some branch(es), we'll completely blow away whatever we had before, no matter what we're in the middle of doing with it, and replace it with new stuff based on what we now believe to be in that branch or those branches.

(Is that really the right answer? What if we're in the middle of compiling or testing? Well, we claimed it was the right answer; onward.)

What we'll do here on B, then, is set up our --bare repository with a hook—this can be the post-update hook or the post-receive hook—that runs after some branch(es) is/are updated. Both "post" hooks are run just once per receive (basically once for each push), and given a list of all updates. The post-update hook gets all updated ref-names as arguments, while the post-receive hook gets all updated refs, including both old and new SHA-1s, on stdin.

(The complexity here is that in one push, I can update more than one branch and/or tag. For instance, with git push c0ffee3:master badf00d:refs/tags/new-tag, I can tell you to update your master branch to make it point to commit c0ffee3, and to create a tag pointing to object badf00d. Here, your post-update hook would get refs/heads/master refs/tags/new-tag, while your post-receive hook would be able to read two lines, roughly oldsha1 c0ffee3 refs/heads/master and then 0000000 badf00d refs/tags/new-tag, from stdin. These would all be full 40-character SHA-1s of course.)
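As an illustration of the stdin format just described, the following sketch reads "old new ref" lines and reports whether each ref was created, deleted, or updated. It assumes 40-character SHA-1 IDs, where an all-zero ID stands for "no such ref"; the function name is made up for this example.

```shell
# 40 zeros: the ID git uses for "ref did not exist" / "ref is being deleted".
zero=$(printf '%040d' 0)

# Read post-receive style lines from stdin and classify each update.
classify_updates() {
    while read -r oldsha newsha ref; do
        if [ "$oldsha" = "$zero" ]; then
            echo "created $ref"
        elif [ "$newsha" = "$zero" ]; then
            echo "deleted $ref"
        else
            echo "updated $ref"
        fi
    done
}
```

Inside a post-receive hook this would simply be called as `classify_updates`, since the hook's stdin already carries the list.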

Because we've decided that we'll just blow away the "bare work tree", all we have to do in this hook is find out if an interesting branch has been updated. Let's say we care specifically (and only) about a branch named develop, i.e., the ref-name refs/heads/develop. Then in a post-receive hook written as a shell script, our stdin scan loop might look like this:

do_update=false
while read -r oldsha newsha ref; do
    [ "$ref" = refs/heads/develop ] && do_update=true
done

In a post-update hook, we would just check arguments:

do_update=false
for ref do
    [ "$ref" = refs/heads/develop ] && do_update=true
done

Either way, if we see that the interesting branch has changed, we now want to do the blow-away-and-rebuild step:

blow_away_and_rebuild()
{
    local target_dir=$1 branch=$2

    rm -rf "$target_dir"
    mkdir "$target_dir"
    # check out the branch (not the directory) into the bare work tree
    GIT_WORK_TREE=$target_dir git checkout -f "$branch"
}

if $do_update; then
    blow_away_and_rebuild /home/me/develop.dir develop
fi
exit 0 # succeed, regardless of actual success

Note that the git checkout step above populates the (removed and re-created) "bare work tree", but also has the side effect of setting the "current branch" (and fussing with git's index). This is how "current branch" manages to sneak in even though we have a nominally bare repository. We often don't need the rm -rf step, but if you have two different branches you'll "deploy" in this fashion, it sidesteps the "single current branch = single index" model git uses, which otherwise may leave old files behind.
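If two branches are deployed this way, the hook has to decide which directory each updated ref maps to. A minimal sketch of that mapping follows; /home/me/deploy.dir and the deploy branch are hypothetical additions here, and both function names are invented for the example.

```shell
# Map a ref name to its deployment directory; print nothing for refs we ignore.
target_for_ref() {
    case "$1" in
        refs/heads/develop) echo /home/me/develop.dir ;;
        refs/heads/deploy)  echo /home/me/deploy.dir ;;   # hypothetical second branch
    esac
}

# Read post-receive style lines from stdin; print "<directory> <branch>"
# for every updated ref that has a deployment directory.
plan_deploys() {
    while read -r oldsha newsha ref; do
        dir=$(target_for_ref "$ref")
        if [ -n "$dir" ]; then
            echo "$dir ${ref#refs/heads/}"
        fi
    done
}
```

Each output line holds the two arguments that the blow-away-and-rebuild step expects, so the hook could loop over them and rebuild one directory per branch.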

The other trick here is that since /home/me/develop.dir has no .git directory within it (hence "bare work tree"), I won't be fooled into going into it, checking out a branch, and starting to edit there. Of course I can still be fooled into going into it and starting to work there, but at least I won't blame git if suddenly all my work gets rm -rf-ed. :-)

torek answered Sep 28 '22 08:09


You will need to push to a different branch on the server, because you can't update the currently checked-out branch; then you can use a post-update hook to merge it.

Put this in a file named .git/hooks/post-update (it needs to be executable) on the server:

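#!/bin/bash
# push hooks run inside .git with GIT_DIR set; switch to the work tree first
cd .. && unset GIT_DIR
for ref in "$@"; do
    if [ "$ref" = "refs/heads/automerge" ]; then
        git stash
        git merge --ff-only automerge
    fi
done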

git stash will save a snapshot of the current working tree, in case there are local uncommitted modifications, so that you can recover them later if needed (with git stash apply or git stash pop). When you run

git push origin yourbranch:automerge

on the client, the working copy on the server will be updated (only if it is a fast-forward) to the pushed version. You can configure push so that the default pushed branch is automerge.
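One way to set that default, sketched under the assumption that the server is the remote named origin: store a push refspec in remote.origin.push, which git consults when git push is given no refspec. The helper name is made up for this example.

```shell
# Make a plain `git push origin` send a local branch to the server's "automerge".
# $1: local branch name; run inside the client repository.
set_default_push() {
    git config remote.origin.push "refs/heads/$1:refs/heads/automerge"
}
```

After `set_default_push yourbranch`, running `git push origin` should behave like the explicit `git push origin yourbranch:automerge`.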

To check whether you have uncommitted local modifications, you can do:

if [ -z "$(git status --porcelain)" ]; then
    : # status is clean
fi
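Putting the two pieces together, here is a rough sketch of a post-update hook that skips the stash and merges only when the work tree is clean. The function names are invented for this example. It also assumes, as far as I understand the githooks documentation, that push hooks run inside the .git directory with GIT_DIR set, hence the cd and unset in run_hook.

```shell
#!/bin/sh
# True when `git status --porcelain` prints nothing, i.e. nothing is uncommitted.
worktree_is_clean() {
    [ -z "$(git status --porcelain)" ]
}

# $@: the ref names git passes to post-update.
# Fast-forward to "automerge" only if the work tree is clean.
automerge_if_clean() {
    for ref in "$@"; do
        if [ "$ref" = "refs/heads/automerge" ]; then
            if worktree_is_clean; then
                git merge --ff-only automerge
            else
                echo "work tree is dirty; not merging" >&2
                return 1
            fi
        fi
    done
}

# Entry point for the hook: move from .git to the work tree, then merge.
run_hook() {
    unset GIT_DIR
    cd .. || return 1
    automerge_if_clean "$@"
}
```

The last line of the actual hook file would be `run_hook "$@"`. A refused merge leaves the pushed commits on automerge, so they can still be merged by hand later.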
pqnet answered Sep 28 '22 07:09