
git splitting repository by subfolder and retain all old branches

I have a git repo with 2 directories and multiple branches. I want to split the directories into separate repos and keep all branches in each of them.

    `-- Big-repo
        |-- dir1
        `-- dir2

    Branches: branch1, branch2, branch3 ...

What I want

I want to split dir1 and dir2 as two separate repos and retain branches branch1, branch2 ... in both repositories.

    dir1 Branches: branch1, branch2, branch3 ...
    dir2 Branches: branch1, branch2, branch3 ...

What I tried:

I am able to split them into 2 repos using

    git subtree split -P dir1 -b dir1-only
    git subtree split -P dir2 -b dir2-only

But this does not create any of the other branches after the separation.

To get all branches:

    # in Big-repo
    git checkout branch1
    git subtree split -P dir1 -b dir1-branch1

    git checkout branch2
    git subtree split -P dir1 -b dir1-branch2

And then push these branches to the newly created repo.

This involves a lot of manual effort, and I suspect there is a quicker way to achieve this.

Any ideas?

Sridhar asked Dec 24 '13 08:12




1 Answer

Short answer

git filter-branch offers exactly the functionality you want. With the --subdirectory-filter option you can create a new set of commits in which the contents of subDirectory are at the root of the repository.

git filter-branch --prune-empty --subdirectory-filter subDirectory -- --branches 
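To see the effect of that command before touching a real repository, here is a minimal sketch that builds a throwaway repo and runs it there. The repo layout, file names and commit messages are made up for the demo; only the filter-branch line comes from the answer. FILTER_BRANCH_SQUELCH_WARNING is set on the assumption that a recent Git is installed, which otherwise prints a notice and pauses before rewriting.

```shell
#!/bin/sh
# Throwaway demo: a tiny repo with dir1/ and dir2/ on two branches,
# then dir1 is isolated on every branch in one go.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q big-repo
cd big-repo
# name the first branch explicitly instead of relying on the default
git symbolic-ref HEAD refs/heads/branch1

mkdir dir1 dir2
echo one > dir1/a.txt
echo two > dir2/b.txt
git add .
git commit -q -m "add dir1 and dir2"

git checkout -q -b branch2
echo more > dir1/c.txt
git add .
git commit -q -m "extend dir1 on branch2"

# the command from the short answer
git filter-branch --prune-empty --subdirectory-filter dir1 -- --branches

git branch --list                  # branch1 and branch2 both still exist
git ls-tree --name-only branch2    # a.txt and c.txt, now at the root
```

Both branches survive the rewrite, and dir2 no longer appears in either of them.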

Walkthrough

The following is an example to perform this in a safe way. You need to perform this for each subdirectory that will be isolated into its own repo, in this case dir1.

First clone your repository to keep the changes isolated:

    git clone yourRemote dir1Clone
    cd dir1Clone

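To prepare the cloned repository we will recreate all remote branches as local ones. First we delete the existing local branches. We skip the line starting with * since that marks the current position, which in this case is not a branch at all (older Git versions print (no branch), newer ones print (HEAD detached at ...)) because we are in a detached HEAD ("headless") state: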

    # move to a headless state
    # in order to delete all branches without issues
    git checkout --detach

    # delete all branches
    git branch | grep --invert-match "*" | xargs git branch -D

To recreate all remote branches locally we go through the results of git branch --remotes. We skip the ones containing -> since those are not branches:

    # create a local tracking branch for every remote branch
    git branch --remotes --no-color | grep --invert-match -e "->" | while read -r remote; do
        git checkout --track "$remote"
    done

    # remove remote and remote branches
    git remote remove origin

Finally run the filter-branch command. This rewrites every commit that touches the dir1 subdirectory into a new commit that has the contents of dir1 at the root. All branches that touch this subdirectory will be updated. The output lists all the references that were not updated, which is the case for branches that do not touch dir1 at all.

    # Isolate dir1 and recreate branches
    # --prune-empty removes all commits that do not modify dir1
    # -- --all updates all existing references, which is all existing branches
    git filter-branch --prune-empty --subdirectory-filter dir1 -- --all

After this you will have a new set of commits that have dir1 at the root of the repository. Just add your remote to push the new commits, or use these as a new repository altogether.
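Pushing the rewritten branches could look like the sketch below. The helper name publish_branches and the remote location in the usage line are hypothetical; the only Git behaviour relied on is that git push --all pushes every local branch.

```shell
# publish_branches NAME URL
# Adds URL as a remote called NAME and pushes every local branch to it.
publish_branches() {
    git remote add "$1" "$2" &&
    git push "$1" --all
}
```

Run from inside dir1Clone, this would be called as publish_branches origin yourNewRemote, where yourNewRemote stands for the empty repository created for dir1. Tags are not included by --all and would need a separate git push --tags.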

As an additional last step, if you care about the repository size:

Even if all branches were updated, your repository will still contain all the objects of the original repository, though they are only reachable through the reflogs and the backup references that filter-branch stores under refs/original/. If you want to drop these, read how to garbage collect commits
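A common way to do that cleanup is sketched below, under the assumption that nothing else in the clone (tags that were not rewritten, stashes, other remotes) still points at the old history. The function name drop_rewritten_history is made up for this example, and it is destructive, so it belongs in the isolated clone only.

```shell
# drop_rewritten_history
# Discards what still references the pre-rewrite commits, then prunes them.
drop_rewritten_history() {
    # delete the backup refs that filter-branch left under refs/original/
    git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done
    # expire every reflog entry, then remove the now-unreachable objects
    git reflog expire --expire=now --all &&
    git gc --prune=now
}
```

Comparing the output of git count-objects -v before and after the call gives a rough idea of how much space was reclaimed.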

Some additional resources:

  • Github Teaching for filter branch
  • Git Book for rewriting history
Maic López Sáenz answered Sep 23 '22 13:09