
Local shallow git cloning with hard-links

Tags: git, clone

On my local filesystem, I want to clone only the head of a git repo (A) into a new git repo (B), so that no history comes along. But I also want the benefit of hard links for the files now in B, to save space. Is there a way to do this? Do the hard links even help once repo A changes?

Thanks!

Jeff asked Sep 19 '12 18:09




1 Answer

It appears to be impossible to do a local shallow clone that also hard-links the object database, at least as of git 1.7.12. git clone --depth 1 --single-branch explicitly warns that --depth is ignored in local clones and tells you to use a file:// URL instead. So you will need to choose between hard links and shallow cloning.
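A minimal sketch of how one might observe both behaviours. The directory names A, B and C, the "demo" identity and the file contents are illustrative assumptions, and the check assumes everything lives on one filesystem that supports hard links (otherwise git silently falls back to copying).

```shell
#!/bin/sh
set -e
work=$(mktemp -d)              # scratch directory (hypothetical layout)
cd "$work"

# Build a tiny source repo A with two commits, stored as loose objects.
git init -q A
cd A
echo one > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "first"
echo two >> file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -a -m "second"
cd ..

# A plain path (no file://) selects the local transport, which hard-links
# the files under .git/objects when the filesystem allows it.
git clone -q A B

# Count object files in B that have more than one link.
linked=$(find B/.git/objects -type f -links +1 | wc -l | tr -d ' ')
echo "hard-linked object files in B: $linked"

# Asking for --depth on a plain path: git warns and ignores the option,
# so C ends up as a full (non-shallow) clone.
git clone -q --depth 1 A C
if [ -f C/.git/shallow ]; then
    echo "C is shallow"
else
    echo "C is a full clone"
fi
```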

The hard links do keep working when the source repository changes, at least for a time, because git writes new objects to new files and never modifies existing object files. However, it does occasionally repack the object database for efficiency, and I don't see how the hard links could be preserved then.
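A rough way to check this, sketched under the assumption that git gc is what triggers the repack; the directories A and B, the count_linked helper and the "demo" identity are made up for the illustration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)              # scratch directory (hypothetical layout)
cd "$work"

git init -q A
cd A
echo one > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "first"
cd ..
git clone -q A B               # local clone, objects hard-linked where possible

# Hypothetical helper: number of object files in a repo with link count > 1.
count_linked() {
    find "$1/.git/objects" -type f -links +1 | wc -l | tr -d ' '
}

before=$(count_linked B)

# A new commit in A only adds new object files; existing ones are untouched.
cd A
echo two >> file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -a -m "second"
cd ..
after_commit=$(count_linked B)

# Repacking A moves its loose objects into a pack and deletes the loose
# files in A, so B's copies stop being shared (B itself stays intact).
cd A
git gc --quiet
cd ..
after_gc=$(count_linked B)

echo "shared object files in B: before=$before after_commit=$after_commit after_gc=$after_gc"
```

If the reasoning above holds, the count should stay the same after the extra commit and fall once A has been repacked, at which point B holds its own full copy of those objects.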

If you choose shallow cloning, you can create the clone with git clone --single-branch --depth 1 file://old_repo_dir (a file:// URL takes the absolute path of the repository). I find it annoying that, at least in git 1.7.12, --depth 1 means one item of history beyond the tip, so you get not only the latest commit but also its parent (or parents, if it is a merge). The parent keeps the commit message from the original repository, but that message lies, because in the shallow clone the commit in effect creates the entire tree.
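A small sketch for checking how many commits a --depth 1 clone actually contains with the git version at hand. The repository names old_repo and shallow_copy, the three dummy commits and the "demo" identity are assumptions for the example.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)              # mktemp returns an absolute path
cd "$work"

# Source repo with three commits, so a shallow clone has history to cut off.
git init -q old_repo
cd old_repo
for n in 1 2 3; do
    echo "$n" >> file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $n"
done
cd ..

# The file:// URL forces the regular transport, so --depth is honoured
# (and no hard links are created).
git clone -q --single-branch --depth 1 "file://$work/old_repo" shallow_copy

commits=$(git -C shallow_copy rev-list --count HEAD)
echo "commits visible in the shallow clone: $commits"
```

The number printed depends on the git version, which is why the sketch prints it instead of assuming a value.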

I prefer to start off with a single commit, with a commit message of my choosing, that creates the initial tree. This is done by first creating a new branch without history in the old repo and then pulling that branch into the new, empty repo. I tested this on a huge repo with a 664MB object database containing 673k objects (the Emacs bzr repository converted to git). After the new repo received the pull, it had a 36MB object database with 3477 objects, so the excess content was apparently pruned. Here are the exact steps:

# at the old repo:
git checkout --orphan tmp-snapshot
git commit -m "Initial commit."

# at the new repo location:
git init
git pull OLD_REPO_DIR tmp-snapshot:master

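# back at the old repo:
git checkout master          # or whichever branch you were on; git refuses to delete the checked-out branch
git branch -D tmp-snapshot   # no longer serves a purpose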

Now the master branch of the new repo contains a single commit with a tree identical to the tree of the old repo, and without any history.
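To convince yourself of this, the whole procedure can be replayed on a throwaway repository, roughly as sketched here. The names old_repo and new_repo, the dummy commits, the "demo" identity and the start_branch variable are assumptions made for the illustration; the script compares tree hashes and counts commits instead of assuming a branch name in the new repo.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)              # scratch directory (hypothetical layout)
cd "$work"

# Throwaway "old" repo with some history.
git init -q old_repo
cd old_repo
for n in 1 2 3; do
    echo "$n" >> file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $n"
done
start_branch=$(git symbolic-ref --short HEAD)   # branch to return to later
old_tree=$(git rev-parse 'HEAD^{tree}')

# History-free branch holding the current tree as a single root commit.
git checkout -q --orphan tmp-snapshot
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Initial commit."
cd ..

# New repo: pull only the snapshot branch.
git init -q new_repo
cd new_repo
git pull -q "$work/old_repo" tmp-snapshot:master
new_tree=$(git rev-parse 'HEAD^{tree}')
new_commits=$(git rev-list --count HEAD)
cd ..

# Clean up the temporary branch in the old repo.
cd old_repo
git checkout -q "$start_branch"
git branch -q -D tmp-snapshot
cd ..

echo "commits in new repo: $new_commits"
echo "old tree: $old_tree"
echo "new tree: $new_tree"
```

Matching tree hashes mean the new repo's content is identical to the old repo's tip, and the commit count shows how much history came along.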

user4815162342 answered Oct 03 '22 01:10
