
Backing up a mercurial repository while preserving timestamps

Is there a way to back up a mercurial repository while preserving the files' timestamps?

Right now, I'm using hg clone to copy the repository to a staging directory, and the backup program picks up the files from there. I'm not pointing the backup program directly at the repository because I don't want it to be changing (from commits) while the backup is happening.

The problem is that hg clone changes all the files' timestamps to the current time, so the backup program (which I cannot change) thinks everything has been modified.

Jim Hunziker asked Nov 24 '09 19:11


3 Answers

Plan A: When the source and destination directories reside on the same file system, hg clone -U simply hardlinks the repository's files instead of copying them, so their timestamps do not change. This approach is fast and safe, because a hardlink is broken lazily, the first time the file is written to.
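To see why hardlinking leaves timestamps alone, here is a minimal sketch that does not involve Mercurial at all. It creates a file, gives it an old modification time, hardlinks it, and checks that both names report the same inode and mtime. The file names and the helper hardlink_keeps_mtime are made up for illustration, and it assumes a file system that supports hardlinks.

```python
import os
import tempfile


def hardlink_keeps_mtime(directory):
    """Create a file with an old mtime, hardlink it, and compare both names."""
    original = os.path.join(directory, "a.txt.i")        # hypothetical file name
    linked = os.path.join(directory, "a.txt.i.link")     # hypothetical file name
    with open(original, "wb") as f:
        f.write(b"some repository data")

    # Give the file an arbitrary old modification time (late 2009).
    old_time = 1259176072
    os.utime(original, (old_time, old_time))

    # A hardlink is a second name for the same inode, so no data is copied
    # and the modification time is shared.
    os.link(original, linked)

    a, b = os.stat(original), os.stat(linked)
    return a.st_ino == b.st_ino and a.st_mtime == b.st_mtime == old_time


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(hardlink_keeps_mtime(d))
```

A plain copy, by contrast, creates a new inode whose mtime is the time of the copy unless the tool explicitly restores it.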

If you need to, you can clone on the same file system first, and then rsync this new clone over to another file system.
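The two-step variant could be scripted roughly like this. It is only a sketch: the paths are placeholders, the helper names staging_commands and run_all are my own, and it assumes hg and rsync are on the PATH and that the staging directory does not exist yet (hg clone refuses to clone into a non-empty directory).

```python
import subprocess


def staging_commands(repo, staging, backup_target):
    """Return the commands for: hardlink clone, then rsync to another file system."""
    return [
        # -U (--noupdate) skips the working copy; on the same file system
        # the store files are hardlinked, as described above.
        ["hg", "clone", "-U", repo, staging],
        # -a (archive mode) makes rsync preserve modification times.
        # The trailing slash copies the contents of staging into the target.
        ["rsync", "-a", staging.rstrip("/") + "/", backup_target],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths, adjust to your setup.
    run_all(staging_commands("/srv/repo", "/srv/staging", "/mnt/backup/repo"))
```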

Plan B: It's usually safe to use rsync or some other file-based synchronization tool. Mercurial doesn't store anything magical on disk, just plain files.

There is a race condition if you commit to the repository while rsync is running, but I think the risk is negligible: if you restore from an inconsistent backup, "hg rollback" should be able to clean up the inconsistency. Note, however, that rollback cannot recover if multiple separate transactions (such as multiple "push" or "commit" commands) ran within the rsync window, or if you ran destructive operations that tamper with history (such as rebase, hg strip, and some MQ commands).

intgr answered Oct 05 '22 16:10


I suggest using hg pull instead of hg clone. You keep a mirror of the repository on your server and update it periodically with hg pull, then let your backup program back up that mirror. A pull transfers only the newest history, so the only files under .hg/store/data that change are the ones actually affected by the pull.
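A periodic job for this could look something like the following sketch. The helper names mirror_command and update_mirror and the example paths are hypothetical. It assumes hg is on the PATH, clones without a working copy the first time, and pulls on every later run.

```python
import os
import subprocess


def mirror_command(source, mirror):
    """Pick the hg command that brings the mirror up to date."""
    if os.path.isdir(os.path.join(mirror, ".hg")):
        # Mirror already exists: fetch only the new changesets.
        return ["hg", "pull", "-R", mirror, source]
    # First run: create the mirror without a working copy.
    return ["hg", "clone", "--noupdate", source, mirror]


def update_mirror(source, mirror, runner=subprocess.run):
    """Run the chosen command and fail loudly if hg reports an error."""
    runner(mirror_command(source, mirror), check=True)


if __name__ == "__main__":
    # Placeholder locations, adjust to your setup.
    update_mirror("/srv/repo", "/srv/backup-mirror")
```

Run it from cron (or similar) shortly before the backup program starts, so the backup sees a mirror that is not being written to.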

I tested this by making a small repo with two files, a.txt and b.txt. I then cloned the repository "to the server" using hg clone --noupdate. That ensures there is no working copy on the server; it only needs the history found in .hg.

The timestamps looked like this after the clone:

% ll --time-style=full .hg/store/data
total 8.0K
-rw-r--r-- 1 mg mg 76 2009-11-25 20:07:52.000000000 +0100 a.txt.i
-rw-r--r-- 1 mg mg 69 2009-11-25 20:07:52.000000000 +0100 b.txt.i

As you noted, they are all identical since the files were all just created by the clone operation. I then changed the original repository (the one on the client) and made a commit. After pulling the changeset I got these timestamps:

% ll --time-style=full .hg/store/data
total 8.0K
-rw-r--r-- 1 mg mg 159 2009-11-25 20:08:47.000000000 +0100 a.txt.i
-rw-r--r-- 1 mg mg  69 2009-11-25 20:07:52.000000000 +0100 b.txt.i

Notice how the timestamp for a.txt.i has been updated (I only touched a.txt in my commit) while the timestamp for b.txt.i has been left alone.
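If you want to repeat this check on your own mirror without reading directory listings by eye, a small sketch like this records the modification times before and after a pull and reports which files differ. The function names are made up for illustration, and the directory to scan (for example the mirror's .hg/store/data) is passed in by the caller.

```python
import os


def snapshot_mtimes(root):
    """Map each file under root (by relative path) to its mtime in nanoseconds."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            result[os.path.relpath(path, root)] = os.stat(path).st_mtime_ns
    return result


def changed_since(before, after):
    """Return the sorted paths that are new or whose mtime differs."""
    return sorted(p for p, t in after.items() if before.get(p) != t)
```

Usage would be: take a snapshot, run hg pull, take a second snapshot, and pass both to changed_since. In the experiment above, only a.txt.i would be expected in the result.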

If your backup software is smart, it will even notice that Mercurial has only appended data to a.txt.i. This means that the new a.txt.i file is identical to the old a.txt.i file up to a certain point, so the backup program only needs to copy the final part of the file. Rsync is an example of a program that will notice this.
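To make the "identical up to a certain point" idea concrete, here is a toy sketch of the check an append-aware tool could make. This is not how rsync works internally (rsync uses a more general block-matching scheme); it only illustrates the append-only property, and both function names are invented.

```python
def is_appended_version(old_data: bytes, new_data: bytes) -> bool:
    """True if new_data is old_data with zero or more bytes appended."""
    return new_data.startswith(old_data)


def appended_tail(old_data: bytes, new_data: bytes) -> bytes:
    """Return only the bytes that were added at the end."""
    if not is_appended_version(old_data, new_data):
        raise ValueError("file was rewritten, not appended to")
    return new_data[len(old_data):]
```

With the sizes from the listings above (76 bytes before the pull, 159 after), only the 83-byte tail would need to be transferred.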

Martin Geisler answered Oct 05 '22 17:10


Here's an hg extension that might help: TimestampExtension.

djc answered Oct 05 '22 18:10