Is it okay to "abuse" Mercurial's rename functionality to track movement of code blocks?

Sometimes I find that I've got a file that has over time grown to contain more classes/functions/whatevers than I like. It's time to refactor! I usually find in this case that my one file becomes several: itself plus several other files, each containing distinct segments of the file.

Unfortunately, just creating these new files "breaks" history a bit -- it's hard to tell that those functions originally came from another file. It's even worse if there were any other changes made to the code during the refactoring.

One of my coworkers found that he could "abuse" the rename functionality by doing something like this:

hg rename --after original_file new_file_1
hg rename --after original_file new_file_2
hg rename --after original_file new_file_3
hg add original_file

The result is that each of the new files looks like a rename with the rest of the file removed, and the original file looks like it lost the removed blocks. So far, this seems ideal. However, I'm concerned that these multiple renames are going to cause some confused merges down the line.

Is there anything wrong with this "multiple renames" approach?

Chris Phillips asked Mar 27 '12 17:03



2 Answers

You should make sure you know what hg copy really means before doing this.

In short, copying a file from original_file to new_file_1 adds a link that Mercurial will use in future merges if and only if it cannot find new_file_1 in the common ancestor. This will typically only be the case in the first merge after you make the copy.

A graph might illustrate this better:

old --- edit old --- edit in old copied to new --- edit old --- merge
   \                /                             /
    copy old new --/------- edit new ------------/

We start with a changeset where you have the file old. You then edit old on one branch and copy old to new in another. In the first merge the edit to old is copied into new. In the second merge there's no special treatment for new since new is found in the common ancestor (the copy old new changeset).

What this means for your case is that there is a big difference in future merges depending on when people see the copy old new. If you can get everybody to use

old --- copy old new

as their starting point, then things are fine. But if someone has branched off from the old changeset and edited old in that branch, then they'll get merge conflicts when they try to merge with the copy old new changeset.

More precisely, they get merge conflicts if they've edited any part of the old file that was not copied into the new file. The merge conflicts alert you to the fact that there was a change in old that needs to be copied into new. However, if what you really did was

hg copy old new1
hg copy old new2
hg copy old new3

then you'll get irrelevant merge conflicts in two of the three new files.

If you had just deleted the old file and added three new files, then you would still get a merge conflict here: you'll be asked

remove changed old which local deleted
use (c)hanged version or leave (d)eleted?

Whether you prefer to see that prompt or see the merge tool start up is up to you, but now you know the consequences of hg copy (or hg rename --after, which is really the same thing).

Martin Geisler answered Nov 14 '22 12:11



It is easier to use hg copy for that:

hg copy original_file new_file_1
hg copy original_file new_file_2
hg copy original_file new_file_3

Now all 3 have the original history. But, yes, either way this is perfectly okay and commonly done.

Ry4an Brase answered Nov 14 '22 12:11
