
Split subdirectory from a Git repository and keep history of all files that are in the subdir _now_

Maybe there is already a solution out there, but other questions/answers seem to address slightly different issues (or I don't fully understand them).

My intention is to detach a subdirectory of a Git repository and make it an independent repository, while keeping the history intact, but only the history of the subdirectory. This question first seemed to do the trick, but then I noticed a flaw in it:

git filter-branch --subdirectory-filter only preserves commits that touch the given subdirectory. This means it drops commits that affected files which are in that subdir now but were moved there from other locations.

I noticed this because the first commit of my 'cleaned up' repository was 'Move everything to subdir X'. This means that my files had been at another location before, but the commits from that time weren't preserved.
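The effect can be reproduced in a throwaway repository. The following is only a sketch: the file name notes.txt, the temp directory and the commit messages are made up for illustration. It compares a path-limited log (roughly what --subdirectory-filter sees) with git log --follow, which tracks a single file across renames.

```shell
#!/bin/sh
# Sketch: reproduce the situation in a throwaway repo (all names are hypothetical).
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# The file starts its life in the repo root ...
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "v2" > notes.txt
git commit -q -am "Edit notes"

# ... and is only later moved into the subdirectory.
mkdir X
git mv notes.txt X/notes.txt
git commit -q -m "Move everything to subdir X"

# Path-limited log: only commits that touched the path X/ itself.
by_path=$(git log --oneline -- X | wc -l | tr -d ' ')
# --follow tracks one file across renames, so the older commits show up too.
by_follow=$(git log --oneline --follow -- X/notes.txt | wc -l | tr -d ' ')

echo "commits seen via path X:   $by_path"
echo "commits seen via --follow: $by_follow"
```

In this demo the path-limited count is smaller than the --follow count, which is exactly the history that goes missing after the split.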

So what I'd need is a command (or sequence of commands) that:

  • removes all commits in the repository
  • except commits that contain files that
    • are in the given subdirectory now or
    • are prior versions of these files in other locations.

Possibly some of these commits also contain files that don't match these conditions. If these files could be pruned completely from the repository, that would be a nice add-on.


Edit:

The solution linked above pulls the subdir content into the root directory of the new repository. As @Amber pointed out, this would cause trouble with files that had already lived in the root dir. So what I would like to achieve is:

Original dir structure:

\Old-Repo
    \.git
    \ABC
    |- dir content
    \DEF
    |- dir content
    \GHI
    |- dir content

The dir structure of the detached repository should be:

\New-Repo-DEF
    \.git
    \DEF
    |- dir content

and not:

\New-Repo-DEF
    \.git
    content of old DEF subdirectory

Then I would afterwards move the content from the DEF subdir to the root dir with a regular commit.
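One possible way to get this layout (DEF kept as a subdirectory rather than hoisted to the root) is to remove the other directories from every commit with an index filter instead of using --subdirectory-filter. This is a minimal sketch, not a tested recipe for real repositories: it assumes the unwanted directories are known by name (ABC and GHI, as in the example above) and runs on a throwaway demo repo with made-up files.

```shell
#!/bin/sh
# Sketch: keep DEF/ in place and drop ABC/ and GHI/ from all of history.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silences the notice newer Git versions print

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical content mirroring the ABC/DEF/GHI layout from the question.
mkdir ABC DEF GHI
echo a > ABC/a.txt
echo d > DEF/d.txt
echo g > GHI/g.txt
git add .
git commit -q -m "Add ABC, DEF, GHI"
echo a2 > ABC/a.txt
git commit -q -am "Change ABC only"
echo d2 > DEF/d.txt
git commit -q -am "Change DEF only"

# Remove the unwanted directories from the index of every commit;
# --prune-empty drops commits that end up changing nothing.
git filter-branch --prune-empty \
    --index-filter 'git rm -r -q --cached --ignore-unmatch ABC GHI' HEAD

git ls-tree -r --name-only HEAD    # lists the files that remain
git log --format=%s                # lists the commits that remain
```

Note that this only drops paths by name. It does not by itself recover the history of files that lived elsewhere before being moved into DEF.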

uli_1973 asked Apr 07 '13 22:04




1 Answer

Depending on how complicated the history is, it may be feasible to rewrite it and move the files with git filter-branch --tree-filter (as described here), before extracting the subdirectory with --subdirectory-filter.

In other words, if git log -- somedir shows "Move files XYZ to somedir" as the oldest commit for the somedir directory, you could run git filter-branch --tree-filter 'insert a fairly foolproof script here that moves files XYZ to somedir' HEAD. This rewrites every commit so that the files appear to have always lived in somedir, which straightens out the directory structure before you extract the subrepository.

I did this a few days ago on a fairly small repository (~150 commits, linear history) and it worked, but I don't think it would scale without some serious automation.
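The two-step approach described above could look roughly like this on a toy repository. It is a sketch under assumptions: the file name notes.txt, the directory somedir and the tiny move script are placeholders, and a real history would need a more careful tree filter.

```shell
#!/bin/sh
# Sketch: move files into somedir/ in every old commit, then split somedir/ out.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # silences the notice newer Git versions print

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical history: the file lives in the root first, then gets moved.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "v2" > notes.txt
git commit -q -am "Edit notes"
mkdir somedir
git mv notes.txt somedir/notes.txt
git commit -q -m "Move files to somedir"

# Step 1: check out each commit and, if the file is still at its old
# location, move it into somedir/ before the commit is re-recorded.
git filter-branch --tree-filter '
    if [ -f notes.txt ]; then
        mkdir -p somedir
        mv notes.txt somedir/notes.txt
    fi
' HEAD

# Step 2: keep only the history of somedir/ (its content becomes the root).
# -f is needed because step 1 already left a backup under refs/original/.
git filter-branch -f --subdirectory-filter somedir HEAD

git log --format=%s    # the pre-move commits should now be part of the history
```

The tree filter checks out every commit, so it is slow on large repositories, which fits the scaling concern mentioned above.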

Jan Warchoł answered Oct 11 '22 16:10