
subrepo, hg clone and symlinks

I'm quite new to Mercurial. I've read a lot on this topic, but I've been unable to find a clear answer.

The Mercurial guide says: "For efficiency, hardlinks are used for cloning whenever the source and destination are on the same filesystem (note this applies only to the repository data, not to the working directory)."

The Repository wiki page says: "All of the files and directories that coexist with the .hg directory in the repository root are said to live in the working directory".

Now, to "link" a subrepo in a main repo I do:

hg init main
cd main
echo subrepo = ../subrepo > .hgsub
hg clone ../subrepo subrepo           # (1)
hg add
hg ci -m "initial rev of the main repo"

Does the definition above mean that I'm actually creating a copy of subrepo when I perform (1)? Or am I just creating a symlink to ../subrepo? According to the output of ls, it is an actual copy, but that sounds strange to me. If someone could shed a bit of light on this subject, I'd appreciate it.

user533784 asked Dec 08 '10 17:12


2 Answers

First of all, I'm not an expert on that part of Mercurial, but here's what I've understood.

No, you didn't create a link to the whole directory. Instead, files were hardlinked inside it.

This means that the clone gets its own, separate directory structure on disk, but because the files were just cloned and are therefore identical, they are created as hard links back to the originals.

When you start manipulating the repository through your add or commit (ci) commands, Mercurial breaks the hard links and creates a separate file for each affected file, on demand.
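As a rough illustration of what "breaking" a hard link means, here is a minimal shell sketch. It does not use Mercurial and is not Mercurial's actual code; it only shows the general copy-then-rename technique on a made-up file x.i in a hypothetical scratch directory.

```shell
# Hypothetical scratch directory; A and B stand in for two clones.
tmp=$(mktemp -d)
mkdir "$tmp/A" "$tmp/B"
echo "rev 0" > "$tmp/A/x.i"
ln "$tmp/A/x.i" "$tmp/B/x.i"        # both names now refer to one file

# Break the link before writing: copy the data to a new file,
# then rename the copy over the linked name in B.
cp "$tmp/B/x.i" "$tmp/B/x.i.tmp"
mv "$tmp/B/x.i.tmp" "$tmp/B/x.i"
echo "rev 1" >> "$tmp/B/x.i"        # this write only affects B's copy

original=$(cat "$tmp/A/x.i")
if [ "$tmp/A/x.i" -ef "$tmp/B/x.i" ]; then linked=yes; else linked=no; fi
echo "A still contains: $original (still linked: $linked)"
```

After the rename, the file in A is untouched and the two names no longer refer to the same file, which is why a change in one clone never leaks into the other.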

Now, this is purely a technical detail; you don't need to know or care about it. If it makes things easier, just think of a clone as a complete copy of the original repository, separate files and all. The hard link part is just there to save disk space for the things that are the same.

A typical project has many files, a typical changeset only changes a few of them, and a typical reason to clone is that you're going to make a fixed set of changes. Hard links therefore make sense: many of the files in the repository directories will stay 100% identical to their originals for the lifetime of the repository.

For those that aren't, all of that is silently handled by Mercurial for you.

Lasse V. Karlsen answered Sep 23 '22 06:09


Let us start by looking at what happens when you clone, leaving subrepositories aside for the moment. When you do

$ hg clone A B

then Mercurial will make hard links for the files inside A/.hg/store/data. So if a file called x is tracked, then after the clone you will see that

A/.hg/store/data/x.i

and

B/.hg/store/data/x.i

are hard linked -- this means that the two filenames really refer to the same file. As Lasse points out, this is smart since you might never commit a change to x in the clone, and so there is no reason to make two different x.i files for the A and B clones. Another advantage is that it is much faster to make a hard link than to copy a file, especially if x.i is very large: creating the hard link is a constant-time operation.
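To see what "hard linked" means at the filesystem level, here is a minimal shell sketch that does not involve Mercurial at all. The scratch directory and the file contents are made up for illustration; only the names A, B and x.i mirror the example above.

```shell
# Hypothetical scratch directory; nothing here touches a real repository.
tmp=$(mktemp -d)
mkdir "$tmp/A" "$tmp/B"
echo "revlog data" > "$tmp/A/x.i"

# Give the same file a second name instead of copying its contents.
ln "$tmp/A/x.i" "$tmp/B/x.i"

# -ef is true when both names refer to the same file (same device and inode).
if [ "$tmp/A/x.i" -ef "$tmp/B/x.i" ]; then
  same=yes
else
  same=no
fi
echo "same file: $same"
```

The same `-ef` test could be pointed at A/.hg/store/data/x.i and B/.hg/store/data/x.i after a local clone on one filesystem to check whether the two store files are still linked.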

In your example above you are adding a subrepository subrepo to the main repository. A subrepository consists of two things:

  1. the subrepository itself. This is what you create when you do

    $ hg clone ../subrepo
    
  2. the subrepository metadata. This is what you store in the .hgsub file. You must tell Mercurial where you want the subrepository and where Mercurial can clone it from.
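The two pieces of information in the .hgsub entry can be made visible with a small shell sketch. It only writes and reads a text file in a hypothetical scratch directory; the variable names sub_path and sub_source are made up for illustration, and no Mercurial command is run.

```shell
# Hypothetical scratch directory standing in for the main repository.
tmp=$(mktemp -d)

# Same entry as in the question: <path inside main> = <where to clone from>
echo "subrepo = ../subrepo" > "$tmp/.hgsub"

# Split the entry at the "=" to show the two pieces of information.
sub_path=$(sed 's/ *=.*//' "$tmp/.hgsub")
sub_source=$(sed 's/.*= *//' "$tmp/.hgsub")

echo "path in working directory: $sub_path"
echo "clone source:              $sub_source"
```

The left-hand side is where the subrepository lives inside the main repository, and the right-hand side is where it can be cloned from; neither side creates any kind of filesystem link.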

You ask whether you copied or symlinked the repository: you certainly copied (cloned) it, as you have also confirmed with ls. Afterwards you added some metadata to Mercurial that tells it where it can expect to find the subrepository. This has nothing to do with a symbolic link in the normal filesystem sense; it is just some metadata for Mercurial.

Martin Geisler answered Sep 24 '22 06:09