
Git subtree and multiple directories

I have a rather large git repository that has a directory where I maintain library code. The directory contains a number of subdirectories.

repo
+--- lib
|    +--- A
|    +--- B
...
|    +--- Z

Now let us assume that I want to open source subdirectories A,...,M and keep subdirectories N,...,Z closed source. Let us also assume that I would like to:

  • Keep A,...,M in a single open source repository. The reason for this is that the directories A,...,M have interdependencies and it would be confusing to split them into individual repositories.
  • Keep the structure of my closed source repository intact. For example, I could create subdirectories lib/pub and lib/pvt, but this would have cascading effects requiring changing references elsewhere or would require a lot of symlinks (lib/A -> lib/pub/A).
  • Have a solution akin to git subtree where I can modify code either in my closed source repository or in the open source one and I can easily sync the changes between the two repositories.

I have searched for a solution on both Stack Overflow and Google, but there does not seem to be an obvious one. Conceptually this is something that git subtree should be able to do, but it only works with a single subdirectory.

I have looked into the git-subtree script with the intent of modifying it.

https://github.com/git/git/blob/master/contrib/subtree/git-subtree.sh

It appears to me that if I were to modify subtree_for_commit(), I should be able to convince git subtree split to consider more than a single directory for splitting. But my knowledge of git is not enough to understand what the script is doing and to modify it without breaking things.

If you have a solution for the above-mentioned problem, or any pointers on modifying git-subtree, please let me know.

Bill Zissimopoulos asked Feb 07 '14 06:02


3 Answers

Splitting a subtree mixed with files from the parent project

This seems to be a common request; however, I don't think there's a simple answer when the folders are mixed together like that.

The general method I suggest for splitting out library directories that are mixed in with other folders is this:

  1. Make a branch with the new root for the library directories:

    git subtree split -P lib/ -b temp-br
    git checkout temp-br
    
  2. Then rewrite history to remove the parts that aren't part of the library. I'm no expert on this, but I experimented and found that something like this works:

    git filter-branch --tag-name-filter cat --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch N O P Q R S T U V W X Y Z' HEAD
    

    Note: If you run filter-branch more than once, you might need to delete the backup it made before running it again:

    git update-ref -d refs/original/refs/heads/temp-br
    
  3. Lastly, just create a new repo for the library and pull in everything that's left:

    cd <new-lib-repo>
    git init
    git pull <original-repo> temp-br
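
Before publishing the new repository, it may be worth checking that none of the closed source directories survive anywhere in the rewritten history. The following is a minimal sketch, not part of the original answer: it builds a throwaway repository with only two directories (A and N, standing in for the result of step 1), applies the same kind of index filter as step 2, and then lists every path that appears in any remaining commit. The temporary directory, the demo identity, and the file names are all assumptions made for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the temp-br branch from step 1.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# One public directory (A) and one private directory (N).
mkdir A N
echo public > A/a.txt
echo private > N/n.txt
git add .
git commit -q -m "add A and N"
echo more >> N/n.txt
git commit -q -am "touch only N"

# Same kind of index filter as in step 2, with a shorter directory list.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --prune-empty \
    --index-filter 'git rm -rf --cached --ignore-unmatch N' HEAD >/dev/null 2>&1

# Every path that appears anywhere in the rewritten history.
paths=$(git log --format= --name-only HEAD | sed '/^$/d' | sort -u)
commits=$(git rev-list --count HEAD)
echo "paths: $paths"
echo "commits: $commits"
```

In this sketch the commit that only touched N becomes empty after filtering and is dropped by --prune-empty, so a single commit containing only A/a.txt remains. On a real repository, the same git log pipeline can be run against temp-br and searched for any of the N,...,Z names.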
    
johnb003 answered Oct 19 '22 21:10


Here is a shell script based on git subtree. It is much faster than solutions based on git filter-branch --tree-filter. Its side effect is that several extra git mv and git merge commits are generated and added to the final HEAD. If you are fine with these extra commits, you can try:

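ids=0
lists=(
    "a/b"
    "c/d/e"
)

# split each path into its own branch: split_dir_0, split_dir_1, ...
for dir in "${lists[@]}"
do
    echo git subtree split -P "$dir" -b "split_dir_$ids"
    git subtree split -P "$dir" -b "split_dir_$ids"
    ids=$((ids + 1))
done

# restore folder structure: on each split branch, move the content
# from the root back under its original path and commit the move
for (( idx=0; idx < ${#lists[@]}; idx++ ))
do
    git checkout "split_dir_$idx"
    dir=${lists[$idx]}
    mkdir -p "$dir"
    # top-level component of the path, which must not be moved into itself
    dirPrefix=${dir%%/*}
    find . -mindepth 1 -maxdepth 1 ! -name "$dirPrefix" ! -name '.*' \
        -exec git mv {} "$dir" \;
    git commit -q -m "Move content back under $dir"
done

# merge all split branches into the first one
git checkout split_dir_0
for (( idx=1; idx < ${#lists[@]}; idx++ ))
do
    # the split branches share no history; Git 2.9+ requires this flag
    git merge -q --no-edit --allow-unrelated-histories "split_dir_$idx"
done

# <target-remote> and <target-branch> are placeholders
git push -u <target-remote> split_dir_0:<target-branch>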
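
The restore step in the script above has to skip the top-level component of each path, because mkdir -p creates that directory at the root of the branch and it cannot be moved into itself. A minimal sketch of the parameter expansion involved, shown in isolation; the helper name top_level is hypothetical and not part of the original script:

```shell
#!/bin/sh
# top_level is a hypothetical helper: it prints the first component of a path.
top_level() {
    # %%/* strips the longest suffix that starts with a slash
    printf '%s\n' "${1%%/*}"
}

top_level "a/b"      # prints: a
top_level "c/d/e"    # prints: c
top_level "plain"    # prints: plain (no slash, nothing is stripped)
```

Note that the expansion takes a variable name, not a value: it is written ${dir%%/*}, without a second dollar sign inside the braces.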
dasons answered Oct 19 '22 20:10


Use git subtree add

See "Git subtree split two directories". I think the same technique can be used for more than two directories, and even across multiple repos. For example:

cd /repos/big-repo

# split out A..M branches
for N in {A..M}; do
  git subtree split --prefix=lib/$N --branch=split-$N
done

# create new repo
mkdir /repos/am-repo
cd /repos/am-repo
git init

# commit something or git-subtree add will complain and fail
touch .gitignore; git add .; git commit -m "begin history revision"

# split-in A..M branches
for N in {A..M}; do
  git subtree add --prefix=lib/$N ../big-repo split-$N
done
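
Since the loops above create one branch per directory, it can help to review the exact commands before anything is created. The following is only a sketch under that assumption: plan_splits is a hypothetical helper that prints the split and add commands for the directory names passed to it, without running git at all.

```shell
#!/bin/sh
# plan_splits is a hypothetical helper: it prints, but does not run,
# the split command (to be run in big-repo) and the add command
# (to be run in am-repo) for each library directory passed to it.
plan_splits() {
    for name in "$@"; do
        echo "git subtree split --prefix=lib/$name --branch=split-$name"
        echo "git subtree add --prefix=lib/$name ../big-repo split-$name"
    done
}

plan_splits A B
```

Passing an explicit list of names also covers the case where the open source directories are not a contiguous A..M range.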
laconbass answered Oct 19 '22 21:10