Merge the files into the new repository B. Step 2: Go to that directory. Step 3: Create a remote connection to repository A as a branch in repository B. Step 4: Pull files and history from this branch (containing only the directory you want to move) into repository B.
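Steps 2 to 4 above can be sketched as the following commands. This is a minimal illustration run in a throwaway sandbox, not the original author's exact commands: the names repoA, repoB, dir1 and repo-A-branch are placeholders, and the sketch assumes repository A has already been reduced to the directory you want to move.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway sandbox; every path and name below is a placeholder.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for repository A, already reduced to the directory to move.
git init -q repoA
git -C repoA symbolic-ref HEAD refs/heads/master
(cd repoA && mkdir dir1 && echo hello > dir1/file.txt &&
  git add . && git commit -q -m "add dir1")

# Stand-in for the new repository B, with one commit of its own.
git init -q repoB
cd repoB                                  # Step 2: go to that directory
echo "# B" > README.md
git add README.md
git commit -q -m "initial commit in B"

# Step 3: connect repository A as a remote of repository B.
git remote add repo-A-branch ../repoA

# Step 4: pull files and history. Git >= 2.9 needs the extra flag
# because the two histories share no common ancestor.
git pull -q --no-rebase --no-edit --allow-unrelated-histories repo-A-branch master

# Optional clean-up: drop the temporary remote again.
git remote rm repo-A-branch
```

After the pull, git log in repository B should show the commits from both repositories plus one merge commit.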
Instead of dealing with a subshell and extglob (as kynan suggested), try this much simpler approach:
git filter-branch --index-filter 'git rm --cached -qr --ignore-unmatch -- . && git reset -q $GIT_COMMIT -- apps/AAA libs/XXX' --prune-empty -- --all
As void.pointer mentioned in a comment, this removes everything except apps/AAA and libs/XXX from the current repository.
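Since the command rewrites history, it may be worth trying it on a scratch repository first. The sketch below builds a tiny sandbox (the directory layout and file names are made up for illustration), runs the same filter, and lists what survives. FILTER_BRANCH_SQUELCH_WARNING is only set to skip the warning delay that newer Git versions add.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway sandbox with a made-up layout; nothing here touches a real repo.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

git init -q sandbox
cd sandbox
for d in apps/AAA apps/BBB libs/XXX libs/YYY; do
  mkdir -p "$d"
  echo "$d" > "$d/file.txt"
done
git add .
git commit -q -m "initial layout"

# Same filter as above: empty the index, then restore only the wanted paths.
git filter-branch --index-filter \
  'git rm --cached -qr --ignore-unmatch -- . && git reset -q $GIT_COMMIT -- apps/AAA libs/XXX' \
  --prune-empty -- --all

# List the files that are still tracked after the rewrite.
git ls-tree -r --name-only HEAD
```

Only the files under apps/AAA and libs/XXX should remain in the listing.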
This leaves behind lots of empty merges. These can be removed by another pass as described by raphinesse in his answer:
git filter-branch --prune-empty --parent-filter \
'sed "s/-p //g" | xargs -r git show-branch --independent | sed "s/\</-p /g"'
⚠️ Warning: The above requires the GNU versions of sed and xargs; otherwise it removes all commits because xargs fails. With Homebrew, run brew install gnu-sed findutils and then use gsed and gxargs:
git filter-branch --prune-empty --parent-filter \
'gsed "s/-p //g" | gxargs git show-branch --independent | gsed "s/\</-p /g"'
The plan is to split the individual directories into their own repos, then merge them together. The following manual steps use easy-to-understand commands rather than clever scripts, and the same approach extends to merging any number of additional sub-folders into a single repository.
Divide
Let's assume your original repo is: original_repo
1 - Split apps:
git clone original_repo apps-repo
cd apps-repo
git filter-branch --prune-empty --subdirectory-filter apps master
2 - Split libs
git clone original_repo libs-repo
cd libs-repo
git filter-branch --prune-empty --subdirectory-filter libs master
Continue in the same way if you have more than two folders. You should now have two new, temporary git repositories.
Conquer by Merging apps and libs
3 - Prepare the brand new repo:
mkdir my-desired-repo
cd my-desired-repo
git init
You will need to make at least one commit. If you skip the following three lines, the contents of the first repo you merge will appear directly under your new repo's root:
touch a_file_and_make_a_commit # see user's feedback
git add a_file_and_make_a_commit
git commit -am "at least one commit is needed for it to work"
With the temp file committed, the merge command in the later section will stop as expected.
Based on user feedback, instead of adding an arbitrary file like a_file_and_make_a_commit, you can choose to add a .gitignore, a README.md, etc.
4 - Merge apps repo first:
git remote add apps-repo ../apps-repo
git fetch apps-repo
git merge -s ours --no-commit apps-repo/master # see below note.
git read-tree --prefix=apps -u apps-repo/master
git commit -m "import apps"
Now you should see the apps directory inside your new repository. git log should show all relevant historical commit messages.
Note: as Chris noted below in the comments, for newer versions of git (>= 2.9), you need to specify --allow-unrelated-histories with git merge.
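For illustration, here is step 4 with that flag added, run end to end in a throwaway sandbox. The stand-in apps-repo and its main.c file are invented for the sketch; only the last block of commands corresponds to the step above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway sandbox; all names below are placeholders for illustration.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the apps-repo produced by the split step.
git init -q apps-repo
git -C apps-repo symbolic-ref HEAD refs/heads/master
(cd apps-repo && echo "int main(void){return 0;}" > main.c &&
  git add main.c && git commit -q -m "add main.c")

# The new repository with its one required commit.
git init -q my-desired-repo
cd my-desired-repo
touch .gitignore
git add .gitignore
git commit -q -m "at least one commit is needed for it to work"

git remote add apps-repo ../apps-repo
git fetch -q apps-repo
# Git >= 2.9 refuses to merge histories without a common ancestor
# unless --allow-unrelated-histories is given.
git merge -q -s ours --no-commit --allow-unrelated-histories apps-repo/master
git read-tree --prefix=apps/ -u apps-repo/master
git commit -q -m "import apps"
```

The same flag goes on the git merge line in step 5 for libs-repo.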
5 - Merge libs repo next in the same way:
git remote add libs-repo ../libs-repo
git fetch libs-repo
git merge -s ours --no-commit libs-repo/master # see above note.
git read-tree --prefix=libs -u libs-repo/master
git commit -m "import libs"
Continue if you have more than 2 repos to merge.
Reference: Merge a subdirectory of another repository with git
I had a similar issue and, after reviewing the various approaches listed here, I discovered git-filter-repo. It is recommended as an alternative to git-filter-branch in the official git documentation here.
To create a new repository from a subset of directories in an existing repository, you can use the command:
git filter-repo --path <file_to_keep>
Filter multiple files/folders by chaining them:
git filter-repo --path keepthisfile --path keepthisfolder/
So, to answer the original question, with git-filter-repo you would just need the following command:
git filter-repo --path apps/AAA/ --path libs/XXX/
Why would you want to run filter-branch more than once? You can do it all in one sweep, so there is no need to force it (note that you need extglob enabled in your shell for this to work):
git filter-branch --index-filter "git rm -r -f --cached --ignore-unmatch $(ls -xd apps/!(AAA) libs/!(XXX))" --prune-empty -- --all
This should get rid of all the changes in the unwanted subdirectories and keep all your branches and commits (unless they only affect files in the pruned subdirectories, by virtue of --prune-empty), with no issue of duplicate commits etc.
After this operation the unwanted directories will be listed as untracked by git status.
The $(ls ...) is necessary so that the extglob is evaluated by your shell instead of the index filter, which uses the sh builtin eval (where extglob is not available). See How do I enable shell options in git? for further details on that.
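To see what the $(ls ...) substitution expands to before handing it to filter-branch, a small bash sketch like the following may help. It assumes bash (extglob is a bash option enabled with shopt -s extglob), and the BBB, CCC and YYY directories are made up for the demonstration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# extglob must be enabled on an earlier line than the pattern that uses it,
# because bash parses each line before running it.
shopt -s extglob

# Made-up directory layout in a throwaway sandbox.
cd "$(mktemp -d)"
mkdir -p apps/AAA apps/BBB apps/CCC libs/XXX libs/YYY

# !(AAA) matches every entry except AAA, so this lists the directories
# that the index filter would be asked to remove.
to_remove=$(ls -xd apps/!(AAA) libs/!(XXX))
echo "$to_remove"
```

Here the listing should contain apps/BBB, apps/CCC and libs/YYY, but neither apps/AAA nor libs/XXX.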
Answering my own question here... after a lot of trial and error.
I managed to do this using a combination of git subtree and git-stitch-repo. These instructions are based on:
First, I pulled out the directories I wanted to keep into their own separate repository:
cd origRepo
git subtree split -P apps/AAA -b aaa
git subtree split -P libs/XXX -b xxx
cd ..
mkdir aaaRepo
cd aaaRepo
git init
git fetch ../origRepo aaa
git checkout -b master FETCH_HEAD
cd ..
mkdir xxxRepo
cd xxxRepo
git init
git fetch ../origRepo xxx
git checkout -b master FETCH_HEAD
I then created a new empty repository, and imported/stitched the last two into it:
cd ..
mkdir newRepo
cd newRepo
git init
git-stitch-repo ../aaaRepo:apps/AAA ../xxxRepo:libs/XXX | git fast-import
This creates two branches, master-A and master-B, each holding the content of one of the stitched repos. To combine them and clean up:
git checkout master-A
git pull . master-B
git checkout master
git branch -d master-A
git branch -d master-B
Now I'm not quite sure how/when this happens, but after the first checkout and the pull, the code magically merges into the master branch (any insight on what's going on here is appreciated!)
Everything seems to have worked as expected, except that if I look through the newRepo commit history, there are duplicates when the changeset affected both apps/AAA and libs/XXX. If there is a way to remove duplicates, then it would be perfect.
I have written a git filter to solve exactly this problem. It has the fantastic name of git_filter and is located on GitHub here:
https://github.com/slobobaby/git_filter
It is based on the excellent libgit2.
I needed to split a large repository with many commits (~100000) and the solutions based on git filter-branch took several days to run. git_filter takes a minute to do the same thing.
git splits is a bash script that wraps git filter-branch; I created it as a git extension, based on jkeating's solution.
It was made exactly for this situation. For your error, try using the git splits -f option to force removal of the backup. Because git splits operates on a new branch, it won't rewrite your current branch, so the backup is extraneous. See the readme for more detail, and be sure to use it on a copy/clone of your repo (just in case!).
Split the directories into a local branch:
#change into your repo's directory
cd /path/to/repo
#checkout the branch
git checkout XYZ
#split multiple directories into new branch XYZ
git splits -b XYZ apps/AAA libs/ZZZ
Create an empty repo somewhere. We'll assume we've created an empty repo called xyz on GitHub that has the path git@github.com:simpliwp/xyz.git
Push to the new repo.
#add a new remote origin for the empty repo so we can push to the empty repo on GitHub
git remote add origin_xyz git@github.com:simpliwp/xyz.git
#push the branch to the empty repo's master branch
git push origin_xyz XYZ:master
Clone the newly created remote repo into a new local directory
#change current directory out of the old repo
cd /path/to/where/you/want/the/new/local/repo
#clone the remote repo you just pushed to
git clone git@github.com:simpliwp/xyz.git