Sometimes I find that I've got a file that has over time grown to contain more classes/functions/whatevers than I like. It's time to refactor! I usually find in this case that my one file becomes several: itself plus several other files, each containing distinct segments of the file.
Unfortunately, just creating these new files "breaks" history a bit -- it's hard to tell that those functions originally came from another file. It's even worse if there were any other changes made to the code during the refactoring.
One of my coworkers found that he could "abuse" the rename functionality by doing something like this:
hg rename --after original_file new_file_1
hg rename --after original_file new_file_2
hg rename --after original_file new_file_3
hg add original_file
The result is that each of the new files looks like a rename with the rest of the file removed, and the original file looks like it lost the removed blocks. So far, this seems ideal. However, I'm concerned that these multiple renames are going to cause some confused merges down the line.
Is there anything wrong with this "multiple renames" approach?
You should make sure you know what hg copy really means before doing this.
In short, copying a file from original_file to new_file_1 adds a link that Mercurial will use in future merges if and only if it cannot find new_file_1 in the common ancestor. This will typically only be the case in the first merge after you make the copy.
A graph might illustrate this better:
old --- edit old --- edit in old copied to new --- edit old --- merge
   \                /                                           /
    copy old new --/------- edit new --------------------------/
We start with a changeset where you have the file old. You then edit old on one branch and copy old to new in another. In the first merge, the edit to old is copied into new. In the second merge, there is no special treatment for new, since new is found in the common ancestor (the copy old new changeset).
What this means for your case is that there is a big difference in future merges depending on when people see the copy old new changeset. If you can get everybody to use
old --- copy old new
as their starting point, then things are fine. But if someone has branched off from the old changeset and actually edited old in that branch, then they'll get merge conflicts when they try to merge with the copy old new changeset.
More precisely, they get merge conflicts if they've edited any part of the old file that was not copied into the new file. The merge conflicts alert you to the fact that there was a change in old that needs to be copied into new. However, if what you actually did was
hg copy old new1
hg copy old new2
hg copy old new3
then you'll get irrelevant merge conflicts in two of the three new files.
If you had just deleted the old file and added three new files, then you would still get a merge conflict here: you'll be asked
remove changed old which local deleted
use (c)hanged version or leave (d)eleted?
Whether you prefer to see that prompt or see the merge tool start up is up to you, but now you know the consequences of hg copy (or hg rename --after, it's really the same thing).
It is easier to use hg copy for that:
hg copy original_file new_file_1
hg copy original_file new_file_2
hg copy original_file new_file_3
Now all 3 have the original history. But, yes, either way this is perfectly okay and commonly done.