How do you organise multiple git repositories, so that all of them are backed up together?

Question

With SVN, I had a single big repository that I kept on a server and checked out on a few machines. This was a pretty good backup system, and allowed me to easily work on any of the machines. I could check out a specific project, commit, and it updated the 'master' project, or I could check out the entire thing.

Now, I have a bunch of git repositories for various projects, several of which are on GitHub. I also have the SVN repository I mentioned, imported via the git-svn command.

Basically, I like having all my code (not just projects, but random snippets and scripts, some things like my CV, articles I've written, websites I've made and so on) in one big repository I can easily clone onto remote machines, or onto memory sticks/hard drives as backup.

The problem is that it's a private repository, and git doesn't allow checking out a specific folder (one that I could push to GitHub as a separate project, while having the changes appear in both the master repo and the sub-repos).

I could use the git submodule system, but it doesn't act how I want it to (submodules are pointers to other repositories, and don't really contain the actual code, so it's useless for backup).

Currently I have a folder of git repos (for example, ~/code_projects/proj1/.git/ and ~/code_projects/proj2/.git/). After making changes to proj1 I do git push github, then I copy the files into ~/Documents/code/python/projects/proj1/ and do a single commit (instead of the numerous ones in the individual repos). Then I do git push backupdrive1, git push mymemorystick, etc.

So, the question: How do you organise your personal code and projects with git repositories, and keep them synced and backed up?

Damien Diederen · Accepted Answer

I would strongly advise against putting unrelated data in a given Git repository. The overhead of creating new repositories is quite low, and that is a feature that makes it possible to keep different lineages completely separate.

Fighting that idea means ending up with unnecessarily tangled history, which makes administration more difficult and, more importantly, makes "archeology" tools less useful because of the resulting dilution. Also, as you mentioned, Git assumes that the "unit of cloning" is the repository, and practically has to do so because of its distributed nature.

One solution is to keep every project/package/etc. as its own bare repository (i.e., without working tree) under a blessed hierarchy, like:

/repos/a.git
/repos/b.git
/repos/c.git

Once a few conventions have been established, it becomes trivial to apply administrative operations (backup, packing, web publishing) to the complete hierarchy, which serves a role not entirely dissimilar to "monolithic" SVN repositories. Working with these repositories also becomes somewhat similar to SVN workflows, with the addition that one can use local commits and branches:

svn checkout   --> git clone
svn update     --> git pull
svn commit     --> git push
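As a rough sketch of what an administrative operation over the complete hierarchy might look like, the loop below packs every bare repository and writes a single-file backup of each with git bundle. It runs in a throwaway sandbox: the directory names (repos, backup) and the sample repositories a.git and b.git are assumptions that stand in for /repos, not part of the setup described above.

```shell
set -e

# Sandbox standing in for the blessed hierarchy; in real use REPOS would be /repos.
SANDBOX=$(mktemp -d)
REPOS="$SANDBOX/repos"
BACKUP="$SANDBOX/backup"      # hypothetical backup target (e.g. a mounted drive)
mkdir -p "$REPOS" "$BACKUP"

# Create two sample bare repositories with one commit each (demonstration only).
for name in a b; do
    git init -q "$SANDBOX/work-$name"
    (
        cd "$SANDBOX/work-$name"
        echo "$name" > README
        git add README
        git -c user.name=demo -c user.email=demo@example.invalid \
            commit -q -m "initial commit"
    )
    git clone -q --bare "$SANDBOX/work-$name" "$REPOS/$name.git"
done

# The administrative loop itself: pack each repository, bundle all of its
# refs into one file, then check that the bundle is readable.
for repo in "$REPOS"/*.git; do
    name=$(basename "$repo" .git)
    git -C "$repo" gc --quiet
    git -C "$repo" bundle create "$BACKUP/$name.bundle" --all
    git -C "$repo" bundle verify "$BACKUP/$name.bundle"
done
```

Because every repository follows the same naming convention, the loop needs no per-project configuration; new repositories are picked up automatically.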

You can have multiple remotes in each working clone, for the ease of synchronizing between the multiple parties:

$ cd ~/dev
$ git clone /repos/foo.git       # or the one from github, ...
$ cd foo
$ git remote add github ...
$ git remote add memorystick ...
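The remote URLs above are left elided. For a local backup target such as a memory stick, one possible setup is a bare repository on the mounted drive, registered by its path. The sketch below uses temporary directories in place of ~/dev and the mount point; the paths and the sample file are assumptions for illustration only.

```shell
set -e

# Temporary directories stand in for ~/dev and the memory stick's mount point.
SANDBOX=$(mktemp -d)
STICK="$SANDBOX/stick"        # hypothetical mount point, e.g. /media/usb
WORK="$SANDBOX/dev/foo"       # hypothetical working clone
mkdir -p "$STICK" "$WORK"

# A working repository with one commit (demonstration only).
git init -q "$WORK"
cd "$WORK"
echo "hello" > notes.txt
git add notes.txt
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "first commit"

# Create an empty bare repository on the "stick" and register it as a remote.
git init -q --bare "$STICK/foo.git"
git remote add memorystick "$STICK/foo.git"

# Push the current branch; the history is copied, not just the latest files.
git push -q memorystick HEAD
```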

You can then fetch/pull from each of the "sources", work and commit locally, and then push ("backup") to each of these remotes when you are ready with something like (note how that pushes the same commits and history to each of the remotes!):

$ for remote in origin github memorystick; do git push $remote; done
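If the set of remotes differs between clones, a variation is to ask Git for the configured remotes instead of hard-coding the list. A minimal sketch, run in a sandbox with two hypothetical remotes (backup1 and backup2) that are local bare repositories:

```shell
set -e

SANDBOX=$(mktemp -d)

# A working repository with one commit (demonstration only).
git init -q "$SANDBOX/foo"
cd "$SANDBOX/foo"
echo "data" > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "first commit"

# Two hypothetical backup targets, created as empty bare repositories.
for name in backup1 backup2; do
    git init -q --bare "$SANDBOX/$name.git"
    git remote add "$name" "$SANDBOX/$name.git"
done

# `git remote` prints one configured remote name per line.
for remote in $(git remote); do
    git push -q "$remote" HEAD
done
```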

The easiest way to turn an existing working repository ~/dev/foo into such a bare repository is probably:

$ cd ~/dev
$ git clone --bare foo /repos/foo.git
$ mv foo foo.old
$ git clone /repos/foo.git

which is mostly equivalent to an svn import, but does not throw the existing, "local" history away.
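To check the claim about history on a small case, the conversion can be replayed in a sandbox and the commit counts compared before and after. The directory layout below mirrors ~/dev and /repos, but the temporary paths and the two sample commits are assumptions for illustration.

```shell
set -e

SANDBOX=$(mktemp -d)
mkdir -p "$SANDBOX/dev" "$SANDBOX/repos"

# An existing working repository with two commits (demonstration only).
git init -q "$SANDBOX/dev/foo"
cd "$SANDBOX/dev/foo"
for n in 1 2; do
    echo "$n" > file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "commit $n"
done

# Same conversion steps as above, with the sandbox paths substituted.
cd "$SANDBOX/dev"
git clone -q --bare foo "$SANDBOX/repos/foo.git"
mv foo foo.old
git clone -q "$SANDBOX/repos/foo.git"

# Both counts should match if the history survived the round trip.
OLD_COUNT=$(git -C foo.old rev-list --count HEAD)
NEW_COUNT=$(git -C foo rev-list --count HEAD)
echo "old: $OLD_COUNT, new: $NEW_COUNT"
```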

Note: submodules are a mechanism to include shared related lineages, so I indeed wouldn't consider them an appropriate tool for the problem you are trying to solve.

imz -- Ivan Zakharyaschev · Answer

I want to add to Damien's answer where he recommends:

$ for remote in origin github memorystick; do git push $remote; done

You can set up a special remote that pushes to all the individual real remotes with one command; I found it at http://marc.info/?l=git&m=116231242118202&w=2:

So for "git push" (where it makes sense to push the same branches multiple times), you can actually do what I do:
.git/config contains:

[remote "all"]
    url = master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
    url = login.osdl.org:linux-2.6.git

and now git push all master will push the "master" branch to both of those remote repositories.
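The same kind of configuration can also be produced from the command line rather than by editing .git/config by hand: git remote add writes the first url entry and git remote set-url --add appends another. A sketch in a sandbox, with two local bare repositories (hypothetical names one.git and two.git) in place of the kernel.org and osdl.org URLs:

```shell
set -e

SANDBOX=$(mktemp -d)

# Two hypothetical push targets, created as empty bare repositories.
git init -q --bare "$SANDBOX/one.git"
git init -q --bare "$SANDBOX/two.git"

# A working repository with one commit (demonstration only).
git init -q "$SANDBOX/work"
cd "$SANDBOX/work"
echo "x" > x.txt
git add x.txt
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "first commit"

# The first URL is set by `remote add`; more are appended with `set-url --add`.
git remote add all "$SANDBOX/one.git"
git remote set-url --add all "$SANDBOX/two.git"

# Show the resulting url entries of the "all" remote, one per line.
git config --get-all remote.all.url

# One push command now updates both repositories.
git push -q all HEAD
```

Note that only pushing fans out to every URL; fetching from such a remote uses the first url entry.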

You can also save yourself typing the URLs twice by using the construction:

[url "<actual url base>"]
    insteadOf = <other url base>
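To make the placeholder form concrete, here is a hedged sketch that defines a short prefix and checks how Git expands it. The prefix backup:, the remote name stick and the sandbox path are all made up for illustration; git remote get-url reports the URL after insteadOf rewriting has been applied.

```shell
set -e

SANDBOX=$(mktemp -d)
git init -q "$SANDBOX/work"
cd "$SANDBOX/work"

# Map the made-up prefix "backup:" to a real base path (repository-local config).
git config url."$SANDBOX/repos/".insteadOf "backup:"

# Remotes can now be written with the short prefix.
git remote add stick "backup:foo.git"

# Prints the expanded URL, i.e. the base path followed by foo.git.
git remote get-url stick
```

Using git config --global instead would make the prefix available in every repository of the current user.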

Tags: git, backup

dbr
