
Git backups: can I copy a git bare repository while it's being pushed to?

Tags: git

At our firm, we're experimenting with moving from svn to git. We want to make this simple for teams, while not burdening the sysadmins too much.

We've found a way to do this by making a bare repository on a (Windows) network drive that each team has, and pushing to and pulling from that. Authentication is handled through file access permissions, so there's no need to set up https and the whole auth stuff. Great! (And we can access the drive remotely via VPN, so it's really almost as good as an https or git+ssh solution.)

Better yet, we'll even get backups for free, because the network share is already being backed up. However, this backup runs rather unpredictably (the backup lasts several hours, so it might continue into the next working day).

Thus, it is possible that the drive is being backed up while a developer is pushing to the repository. With SVN, this could cause problems, which is why svn hotcopy exists.

Does the same risk exist with git? Can I copy a bare repository somewhere while someone is pushing to it? Naturally, it's all right if the push-being-done cannot be restored. It's also fine if some work has to be done to restore a backup that was made while it was pushed to (i.e. by removing the half-done push residue data). But if the entire bare repository becomes broken and unusable, then that's a problem.

I've done some experiments and couldn't see problems, but this does not mean that there can't be any.

Edit: I accepted a 'do it the right way' answer, because that's what I intend doing in the long run. For now, however, for us a simple solution has been to git clone the entire bare repository (onto the same drive) about an hour before the automated backup kicks in. The automated backup may incorrectly copy the "real" repository if it has been in use at that point, but it will not have trouble with the recently cloned copy. We know when the backup starts, just not when it ends, so that's good enough for us.
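The clone-before-backup workaround described in the edit could be scripted roughly as follows. This is only a sketch under assumptions: `snapshot_repo` is a hypothetical helper name, the paths in the usage comment are placeholders, and `--mirror` plus `--no-hardlinks` are choices made here (the question only says "git clone"), picked so that all refs are copied and the snapshot does not share object files with the live repository.

```shell
#!/bin/sh
# Sketch: refresh a snapshot clone of a bare repository ahead of the backup window.
# snapshot_repo is a hypothetical helper, not something from the question.
set -eu

snapshot_repo() {
    src=$1                  # the live bare repository
    dest=$2                 # where the snapshot should end up
    tmp="${dest}.tmp.$$"    # clone here first, so a failed clone leaves the old snapshot alone

    # --mirror copies every ref; --no-hardlinks forces real copies of the object files
    git clone --quiet --mirror --no-hardlinks "$src" "$tmp"

    # Sanity-check the fresh clone before it replaces the previous snapshot
    git -C "$tmp" fsck --no-dangling >/dev/null

    rm -rf "$dest"
    mv "$tmp" "$dest"
}

# Example (placeholder paths), e.g. from a scheduled job an hour before the backup:
#   snapshot_repo /mnt/team/project.git /mnt/team/project-snapshot.git
```

Cloning into a temporary directory and renaming at the end means the backup should only ever see a complete snapshot, apart from the short gap between the `rm` and the `mv`.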

asked Jun 22 '12 10:06 by skrebbel




1 Answer

It may be worth changing your backup policy so that it skips the Git repository itself and backs up a Git bundle instead. From Git's Little Bundle of Joy:

The bundle command will package up everything that would normally be pushed over the wire with a git push command into a binary file that you can email or sneakernet around, then unbundle into another repository.

This approach is also discussed in Backup of github repo and Backup a Local Git Repository.

A quick test on a local repo shows that the following command creates a single file containing everything one would typically want in a full repo backup:

$ git bundle create ../my.bundle --all

Creating a clone from the bundle file is simply:

$ git clone my.bundle my-repo

Using git ls-remote my.bundle shows that all tags and branches are in the bundle.
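The bundle steps above can be combined into one small script that creates the bundle, checks it, and lists its refs. This is a sketch, not a definitive backup job: `make_bundle` is a hypothetical helper name, and the `git bundle verify` step is an addition that the answer does not mention.

```shell
#!/bin/sh
# Sketch: create a bundle of all refs, verify it, then list what it contains.
# make_bundle is a hypothetical helper name.
set -eu

make_bundle() {
    repo=$1      # repository to back up (bare or non-bare)
    bundle=$2    # output file; use an absolute path, because git runs inside $repo

    # Pack every ref and the objects they reach into a single file
    git -C "$repo" bundle create "$bundle" --all

    # Check that the bundle is well formed (report goes to stderr here)
    git -C "$repo" bundle verify "$bundle" >&2

    # Print the branches and tags stored in the bundle
    git -C "$repo" ls-remote "$bundle"
}

# Example (placeholder paths):
#   make_bundle /mnt/team/project.git /mnt/backup/project.bundle
```

Running the verification right after creation gives an early warning if the bundle file is damaged, before the backup picks it up.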

However, the bundle file probably does not include everything in the repository directory, such as configuration, hooks, grafts and alternates. To cover those, I would take the backup a few steps further and back up both the bundle file and the Git repository minus its objects, refs and logs directories (the contents of the objects and refs directories are already in the bundle, so they are not needed). If it turns out that the bundle does contain these files, you only need to back up the bundle.
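One way to capture the repository minus its objects, refs and logs directories is a tar archive with excludes. The sketch below assumes a bare repository (so those directories sit at the top level) and a tar that supports `--exclude`, as GNU tar and bsdtar do; `backup_repo_metadata` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: archive config, hooks, etc. of a bare repository, leaving out the
# directories whose contents the bundle is expected to cover.
# backup_repo_metadata is a hypothetical helper name.
set -eu

backup_repo_metadata() {
    repo=$1       # path to the bare repository
    archive=$2    # output .tgz file, stored outside $repo

    # -C makes member names relative to the repository root ("./config", "./hooks/...")
    tar -czf "$archive" \
        --exclude=./objects --exclude=./refs --exclude=./logs \
        -C "$repo" .
}

# Example (placeholder paths):
#   backup_repo_metadata /mnt/team/project.git /mnt/backup/project-meta.tgz
```

To restore, one would clone from the bundle and then copy the archived files over the new repository as needed.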

answered Sep 18 '22 08:09 by Dan Cruz