How to git move directory from one branch to another while preserving history?
I want to move in the same repo.
You can't (quite) get what you want—but if you try sometimes, you might just find, you get what you need. 😀
All you have to do is rename the directory and commit. You may first need to extract the directory from the other commit, if it's not already in the commit at the tip of the current branch. That is, you might need an initial:
git checkout otherbranch -- path/to/directory
git commit # optional, but see below
and then in any case, run:
git mv path/to/directory new/path/to/dir
(git mv does not create missing parent directories, so run mkdir -p new/path/to first if that path does not exist yet) and then git commit the result. That doesn't do what you want, but it might do what you need, especially if you make that first commit that doesn't have the renaming, so that you have two adjacent commits, one with the old names, and one with the new ones.
You may, instead, want to merge the branches, commit the merge, and only then rename and commit again. Whether you want this, and why, requires the long explanation.
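It's important to understand two things here: Git stores snapshots of files, not changes, and Git has no per-file history, only commits, which are the history.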
People often object to the idea that Git stores snapshots, because git log -p shows patches, i.e., changes. You view commit 0a36ca1, say, and you see some change to README.md. Then git log goes on to commit 0a36ca1's parent commit 922bf37, say, and you see another change to README.md and/or some other files, and so on. Doesn't that mean 0a36ca1 just stores the changes to README.md? And the answer is: no, 0a36ca1 stores a full copy of README.md and all the other files. Git showed changes by inspecting both 922bf37 (the parent of 0a36ca1, i.e., the commit that comes just before 0a36ca1) and 0a36ca1. Both commits have copies of every file. Git compared the two commits' files. All of the files in those two commits matched except for README.md. Git then compared the two README.md versions to see what changed, and showed you what changed in that file.
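One way to convince yourself of this is to build a throwaway repository and list what each commit contains. The following is only a sketch: the file names, contents, and identity settings are invented for illustration, and it assumes git and mktemp are available.

```shell
set -e
# Throwaway repository; names and contents are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"

# First commit: two files.
printf 'v1\n' > README.md
printf 'int main(void) { return 0; }\n' > main.c
git add README.md main.c
git commit -q -m "first"

# Second commit: only README.md changes.
printf 'v2\n' > README.md
git commit -q -am "second"

# Each commit still lists every file: both are full snapshots.
files_parent=$(git ls-tree -r --name-only HEAD~1)
files_child=$(git ls-tree -r --name-only HEAD)
echo "$files_parent"
echo "$files_child"

# The "change" is computed on demand by comparing the two snapshots.
changed=$(git diff --name-only HEAD~1 HEAD)
echo "changed: $changed"
```

Both listings name README.md and main.c, even though only README.md differs between the two commits.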
The git show command is similar, except that you typically give it the hash ID of one commit, and git show prints the commit's metadata (who made it, when, and why: the log message) and then compares the snapshot in the parent to the snapshot in that commit. The files that differ are the ones you see.
When you ask for history by running git log or git log master, Git starts with the current commit (for plain git log), or the last commit in master, and shows you that commit a la git show (see footnote 2). It then moves to that commit's parent and shows that one. This repeats until Git runs out of parents, or you get tired of paging through git log output. Given a nice simple linear chain of commits, like:
A <-B <-C <-D <-E <-F <-G <-- master
(the single uppercase letters stand in here for the big ugly hash IDs that Git really uses), Git starts by showing you G (as found by the name master), then moving to F and showing you F, then moving to E and showing E, and so on. Commit A is the very first commit in the repository: it has no parent, and there's no backwards arrow coming out of A to let Git move left, so git show shows it as having every file created from scratch. That means git log -p shows it the same way. And of course, with no parent, there's no arrow to follow backwards.
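The backwards walk and the special treatment of the root commit can be observed in a scratch repository. This is a sketch under assumptions: the three commits named A, B, and C and the file file.txt are made up, standing in for the chain drawn above.

```shell
set -e
# Throwaway repository; commit names and file.txt are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"

# Build a three-commit chain A <- B <- C.
for name in A B C; do
  echo "$name" > file.txt
  git add file.txt
  git commit -q -m "$name"
done

# git log starts at the tip and follows parent links backwards.
walk=$(git log --format=%s)
echo "$walk"

# The root commit records no parent line at all ...
root=$(git rev-list --max-parents=0 HEAD)
parents=$(git cat-file -p "$root" | grep -c '^parent ' || true)
echo "parents of root: $parents"

# ... so git show reports its files as newly added (status A).
root_status=$(git show --format= --name-status "$root")
echo "$root_status"
```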
1. Technically, directories turn into internal tree objects, but Git won't store an empty directory for the simple reason that you can't get a directory into Git's index, and Git doesn't build commits from the work-tree, but rather from the index. It's easier to think of Git as just storing files, since that's the end effect.
2. This assumes you're using git log -p, of course. There are several important differences between git log and git show: first, git log does this backwards walk; second, git log defaults to not showing a patch; and third, git show shows merge commits in a different default manner: git log -p defaults to saying to itself: ugh, a merge commit, that's too hard: I'll just print the log message and move on, without showing a diff at all. The git show default here is to show a combined diff, which is a reduced form of diff against multiple parents.
git log can show a subset of history
You can, instead of just running git log or git log master, run:
git log master -- path/to/file.ext
and see what appears to be the history of path/to/file.ext. What git log is doing here is walking commit history as usual, but then not showing some of the commits. That is, given our simple linear chain above, git log starts with commit G. It compares (the snapshots of) F and G to see which files changed. If those files do include path/to/file.ext, git log shows commit G. Then it moves back to commit F, even if it showed nothing at all.
In other words, instead of just showing you all the commits it walks, git log can show you selected commits from the walk. The result is that it seems like Git has file history, but it doesn't: it's just synthesizing a subset history from the real history, working as it goes.
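The filtering can be seen by comparing how many commits exist with how many a path-limited git log prints. This is a minimal sketch; the repository, the file unrelated.txt, and the commit messages are invented for the demonstration.

```shell
set -e
# Throwaway repository; unrelated.txt and the messages are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"
mkdir -p path/to

echo one > path/to/file.ext
git add . && git commit -q -m "add file.ext"

echo other > unrelated.txt
git add . && git commit -q -m "add unrelated.txt"

echo two >> path/to/file.ext
git commit -q -am "edit file.ext"

# Every commit is still part of the real history ...
all=$(git rev-list --count HEAD)
# ... but only those that changed the path are printed.
shown=$(git log --format=%s -- path/to/file.ext)
echo "commits in history: $all"
echo "$shown"
```

The middle commit is walked like any other; it is simply not displayed, because path/to/file.ext is identical in it and in its parent.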
This is important because when Git is doing this synthetic file-history creation, git log is modifying the commit walk. The git log documentation calls this History Simplification, and it's complicated. There are a half dozen or so git log options to control how history simplification will be performed. This means that the "file history" that you see with git log depends on what options you pass to git log, as well as what the actual commit history is.
(Read and study the History Simplification section, at least someday, because there is a lot to it. I've been using Git for a long time, and like to think I know a lot about it, but even then I have to refer back to the documentation for this. In particular, the notion of "TREESAME" (which applies after subtracting away unwanted tree components from each commit) and the question of which commits are followed at merges are especially tricky.)
With --follow, git log will try to detect renames
As Git is doing this commit-by-commit, backwards traversal of a chain of commits, the diff from parent to child may show that some file is renamed. A file named README may have been renamed to one named README.md, for instance. A simple:
git log master -- README.md
will show you how README.md evolved over time (backwards), but stop when README.md was named README, because it's looking for README.md and commits from here on back don't have README.md.
When you add --follow to git log, it will follow that one file (it only works with one file!) across the rename, simply by changing which file it's looking for. Having detected that at, say, the commit-D-to-E boundary, the file that is now README.md was called README in commit D, git log stops looking for README.md and starts, with D, looking for changes to a file named README. It's really that simple.
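The difference between the two invocations shows up clearly in a scratch repository with one rename in the middle of the history. This is a sketch only; the repository and the four commit messages are made up for the purpose.

```shell
set -e
# Throwaway repository; contents and messages are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example" && git config user.email "example@example.com"

printf 'line 1\nline 2\nline 3\n' > README
git add README && git commit -q -m "add README"

echo 'line 4' >> README
git commit -q -am "edit README"

# A pure rename, committed on its own.
git mv README README.md
git commit -q -m "rename README to README.md"

echo 'line 5' >> README.md
git commit -q -am "edit README.md"

# Without --follow the listing stops at the rename commit.
plain=$(git log --format=%s -- README.md)
# With --follow, git log switches to the old name and keeps going.
followed=$(git log --follow --format=%s -- README.md)
echo "$plain"
echo "---"
echo "$followed"
```

The plain listing ends at the rename commit, while the --follow listing continues back through the two commits made under the old name.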
--follow is too simple for your use case
The problem here is that --follow is that simple, which is too simple. So it won't do what you want, for two reasons:
First, you're talking about copying files across some fairly large gap:
...--F--G--H   <-- master
  \
   N--O--P   <-- branch
If your directory-full-of-files is in commit H on master, and you're just now copying it to a new commit you'll make on branch that comes after commit P, well, there's no backwards link from P to H. That's why I proposed that you commit the files without renaming them, then rename them and commit again. The result will be:
...--F--G--H   <-- master
  \
   N--O--P--Q--R   <-- branch
where commit R has the files renamed, and Q has them not-renamed, just copied from H. In the commit log message for Q, you can state that the entire directory has been copied from branch master at a time when it pointed to commit H (use H's real hash ID here; run git rev-parse master to see what hash ID master specifies right now). Then you rename the directory and commit again to make them show up as renames, whenever Git walks from commit R back to commit Q.
Second, the git log --follow option only works on one file. That is, given a commit that is or descends from R, and therefore has the new directory name, you must run:
git log --follow [<commit-hash>] [--] new/path/to/dir/file.ext
which will eventually work its way to commit R, show new/path/to/dir/file.ext (because it is renamed in commit R as compared to commit Q), then move back to commit Q and start looking for path/to/directory/file.ext.
From this single detected rename, plus the log messages in Q and R, you (a smart human rather than a dumb Git program that just obeys really simple rules) can conclude that, aha, all of those files came from commit H.
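The copy-then-rename sequence, and what --follow reports afterwards, can be tried end to end in a scratch repository. This is a sketch under assumptions: the commit messages reuse the letters H, Q, and R from the diagrams, while base.txt, the file contents, and the identity settings are invented. The mkdir -p is there because git mv does not create the parent of the destination.

```shell
set -e
# Throwaway repository; base.txt and the contents are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used here
git config user.name "Example" && git config user.email "example@example.com"

echo 'unrelated base content' > base.txt
git add base.txt && git commit -q -m "base"
git branch branch

# Commit H on master adds the directory.
mkdir -p path/to/directory
printf 'alpha\nbeta\ngamma\n' > path/to/directory/file.ext
git add path && git commit -q -m "H: add directory"
h=$(git rev-parse master)

# Commit Q on branch: copy the directory from H, not renamed.
git checkout -q branch
git checkout "$h" -- path/to/directory
git commit -q -m "Q: copy path/to/directory from master at $h"

# Commit R: nothing but the rename.
mkdir -p new/path/to
git mv path/to/directory new/path/to/dir
git commit -q -m "R: rename path/to/directory to new/path/to/dir"

followed=$(git log --follow --format=%s -- new/path/to/dir/file.ext)
echo "$followed"
```

Only R and Q are listed. Commit H is not reachable from branch, so the hash recorded in Q's message is the one remaining pointer to where the files came from.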
This is where you may want a real merge. Instead of just copying the files from H, you can literally make commit Q as a merge commit, connecting the history from commit Q back to both commits: P and H. That is, suppose you end up with:
...--F--G--H   <-- master
  \         \
   N--O--P----Q--R   <-- branch
Now when Git walks through commit history, it goes: R, Q, H-and-P, G-and-O, F-and-N, and so on. That is, git log walks through the actual history, one commit at a time, with a kind of complicated method of tracking through the fork in history where commits H and P merge to form commit Q.
The drawback to doing this merge is kind of obvious: it's a merge. It will by default bring in all the changes since some common ancestor: whatever commit comes before N and before F, where branch and master eventually lead back to a shared commit, a commit that's on both branches. You don't necessarily have to commit those changes, or even any changes: you can make commit Q's snapshot match commit P's, except of course for the new directory that you want.
(There are multiple ways to make this merge. How to achieve it is another StackOverflow question entirely, one that's already well answered. See "(Git Merging) When to use 'ours' strategy, 'ours' option and 'theirs' option?", and also VonC's answer to a different question here. There are many options here but you probably would want to start with git merge -s ours --no-commit, if you want -s ours at all, and then the extraction of the files with git checkout <commit> -- <path>, and only then making commit Q as a merge.)
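As one possible reading of that recipe, and not the only way to do it, the sequence might look like the sketch below in a scratch repository. The commit messages reuse H, P, and Q from the diagram; base.txt, branch.txt, master_only.txt, and the identity settings are invented for the demonstration.

```shell
set -e
# Throwaway repository; all file names besides path/to/directory are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used here
git config user.name "Example" && git config user.email "example@example.com"

echo 'unrelated base content' > base.txt
git add base.txt && git commit -q -m "base"
git branch branch

# Commit H on master: the directory plus an unrelated master-only file.
mkdir -p path/to/directory
printf 'alpha\nbeta\ngamma\n' > path/to/directory/file.ext
echo 'master only' > master_only.txt
git add . && git commit -q -m "H"

# Commit P on branch.
git checkout -q branch
echo 'branch work' > branch.txt
git add branch.txt && git commit -q -m "P"

# Start a merge that keeps P's snapshot, but stop before committing.
git merge -q -s ours --no-commit master
# Take only the directory from master, then record merge commit Q.
git checkout master -- path/to/directory
git commit -q -m "Q: merge master, taking only path/to/directory"

parents=$(git cat-file -p HEAD | grep -c '^parent ')
files=$(git ls-tree -r --name-only HEAD)
echo "parents: $parents"
echo "$files"
```

Commit Q ends up with two parents, so the history of H is reachable from branch, yet its snapshot contains only P's files plus the extracted directory; master_only.txt is left out.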
The advantage to the merge is that it ties the histories together, so that git log can walk from merge Q back to commit H, which is the source of actual history for the (pre-renaming) files. The disadvantage is that it ties the histories together, so that from then on, Git believes that the correct result of merging H with P is Q, even if you later change your mind about that.
If the merge isn't what you want, the commit(s) plus log message(s) may be what you need.