
git: How to split off library from project? filter-branch, subtree?

So, I have a larger (closed-source) project, and in the context of this project I created a library that I think could also be useful elsewhere.

I now want to split off the library into its own project, which could go open source on GitHub or similar. Of course, the library (and its history there) should contain no traces of our project.

git-subtree seems like a solution here, but it does not completely fit.

My directory layout is something like this (since it is a Java project):

  • fencing-game (git workdir)
    • src
      • de
        • fencing_game
          • transport (my library)
            • protocol (part of the library)
            • fencing (part of the main project interfacing with the library)
            • client (part of the main project interfacing with the library)
            • server (part of the main project interfacing with the library)
          • client (part of the main project)
          • server (part of the main project)
          • ... (part of the main project)
    • other files and directories (build system, website and such - part of the main project)

After the split, I want the library's directory layout to look like this (including any files directly in the bold directories):

  • my-library (name to be determined)
    • src
      • de
        • fencing_game
          • transport (my library)
            • protocol (part of the library)

The history should also contain just the part of the main project's history which touches this part of the repository.

A first look showed me git-subtree split --prefix=src/de/fencing_game/transport, but this will

  1. give me a tree rooted in transport (which will not compile) and
  2. include the transport/client, transport/server and transport/fencing directories.

The first point could be mitigated by using git subtree add --prefix=src/de/fencing_game/transport <commit> on the receiving side, but I don't think git-subtree can do much to avoid also exporting these subdirectories. (The idea really is to just be able to share the complete tree here.)

Do I have to use git filter-branch here?

After the split, I want to be able to import the library back into my main project, using either git-subtree or git-submodule, in a separate subdirectory rather than where it is now. I imagine the layout like this:

  • fencing-game (git workdir)
    • src
      • de
        • fencing_game
          • transport (empty)
            • fencing (part of the main project interfacing with the library)
            • client (part of the main project interfacing with the library)
            • server (part of the main project interfacing with the library)
          • client (part of the main project)
          • server (part of the main project)
          • ... (part of the main project)
    • my-library
      • src
        • de
          • fencing_game
            • transport (my library)
              • protocol (part of the library)
    • other files and directories (build system, website and such - part of the main project)
What would be the most pain-free way to do this? Are there other tools than git-subtree and git-filter-branch for this goal?
Paŭlo Ebermann asked Jun 19 '11 16:06



2 Answers

I think you've got some real spelunking to do. If you just want to split off "protocol", you can do that with "git subtree split ..." or "git filter-branch ..."

git filter-branch --subdirectory-filter src/de/fencing_game/transport/protocol -- --all

But if you have files in transport as well as transport/protocol, it starts to get hairy.

I wrote some custom tools to do this for a project I was on. They're not published anywhere, but you can do something similar with reposurgeon.
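
For the hairy case (files directly in transport plus everything under transport/protocol, with the full path kept), one option is an index filter that removes every path not on a keep list. The following is only a minimal sketch under assumptions: the toy files (Transport.java, Message.java, Glue.java, Game.java, build.xml) and the clone name my-library are made up for illustration, and it assumes GNU grep and xargs (for -z, -x, -0 and -r). It rewrites a throwaway clone, not the original repository.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning pause on newer git

# Build a toy stand-in for the main project (all file names are hypothetical).
work=$(mktemp -d)
git init -q "$work/fencing-game"
cd "$work/fencing-game"
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
base=src/de/fencing_game
mkdir -p "$base/transport/protocol" "$base/transport/client" "$base/client"
echo lib   > "$base/transport/Transport.java"
echo proto > "$base/transport/protocol/Message.java"
echo glue  > "$base/transport/client/Glue.java"
echo game  > "$base/client/Game.java"
git add .
git commit -q -m "mixed commit: library and main project"
echo build > build.xml
git add .
git commit -q -m "main project only"

# Rewrite a clone so the original history stays untouched.
git clone -q "$work/fencing-game" "$work/my-library"
cd "$work/my-library"

# Index filter: list every path in the commit, keep files directly in
# transport/ and everything under transport/protocol/, and remove the rest
# from the index. Paths are relative to the repository root.
keep_only_library='
  git ls-files -z |
  grep -zvx -e "src/de/fencing_game/transport/[^/]*" -e "src/de/fencing_game/transport/protocol/.*" |
  xargs -0 -r git rm -q --cached --ignore-unmatch --
'
git filter-branch --prune-empty --index-filter "$keep_only_library" HEAD

# Should list only the two library files, still under their full path.
git ls-tree -r --name-only HEAD
```

Because of --prune-empty, commits that only touched main-project files drop out of the rewritten history. Only HEAD is rewritten here; tags and other branches would need the same treatment (or should not be pushed) before publishing.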

Phil Hord answered Sep 29 '22 12:09


Splitting a subtree mixed with files from the parent project

This seems to be a common request, but I don't think there's a simple answer when the folders are mixed together like that.

The general method I suggest to split out the library mixed in with other folders is this:

  1. Make a branch with the new root for the library:

    git subtree split -P src/de/fencing_game -b temp-br
    git checkout temp-br

    # -or-, if you really want to keep the full path:
    git checkout -b temp-br
    cd src/de/fencing_game
  2. Then use something to rewrite history to remove the parts that aren't part of the library. I'm no expert on this, but I was able to experiment and found something like this to work:

    # filter-branch must be run from the top level of the working tree,
    # and index-filter paths are relative to the repository root
    git filter-branch --tag-name-filter cat --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch client server otherstuff' HEAD

    # also clear out stuff from the transport sub dir
    git filter-branch --tag-name-filter cat --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch transport/fencing transport/client transport/server' HEAD

    Note: You might need to delete the back-up made by filter-branch between successive commands.

    git update-ref -d refs/original/refs/heads/temp-br 
  3. Lastly, just create a new repo for the library and pull in everything that's left:

    cd <new-lib-repo>
    git init
    git pull <original-repo> temp-br
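
Since the library's history should contain no traces of the main project, it may be worth auditing the result before publishing. The helpers below are a rough sketch, not part of the method above: paths_in_history and check_no_leaks are hypothetical names, and the check only looks at file paths, not at commit messages or file contents.

```shell
# List every path touched by any commit reachable from any ref.
# -m also shows the files changed by merge commits.
paths_in_history() {
    git -C "$1" log --all -m --name-only --format= | sed '/^$/d' | sort -u
}

# Fail (and print the offenders) if any path in the history lies outside
# the allowed prefix. $1 = repository, $2 = prefix such as "transport/".
check_no_leaks() {
    leaks=$(paths_in_history "$1" | awk -v p="$2" 'index($0, p) != 1')
    if [ -n "$leaks" ]; then
        printf 'paths outside %s:\n%s\n' "$2" "$leaks" >&2
        return 1
    fi
}
```

For example, check_no_leaks <new-lib-repo> transport/ returns non-zero if a file from client or server is still present in some old commit, even when the latest tree looks clean. Paths containing unusual characters are printed quoted by git, so treat the output as a hint rather than proof.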

I recommend that your final library path be more like /transport/protocol instead of the full parent project path since that seems kind of tied to the project.
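
For importing the library back under a separate my-library directory, as the question describes, git submodule is one of the two options mentioned there. The following is a minimal sketch on toy repositories, not a tested migration: the file names are made up, the library is referenced by a local path where a real setup would use its published URL, and the protocol.file.allow override is only there because newer git versions refuse local-path submodule clones by default.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

work=$(mktemp -d)

# Stand-in for the already split-off library repository.
git init -q "$work/my-library"
mkdir -p "$work/my-library/src/de/fencing_game/transport/protocol"
echo proto > "$work/my-library/src/de/fencing_game/transport/protocol/Message.java"
git -C "$work/my-library" add .
git -C "$work/my-library" commit -q -m "library"

# Stand-in for the main project.
git init -q "$work/fencing-game"
echo build > "$work/fencing-game/build.xml"
git -C "$work/fencing-game" add .
git -C "$work/fencing-game" commit -q -m "main project"

# Nest the library at my-library/ inside the main project.
git -C "$work/fencing-game" -c protocol.file.allow=always \
    submodule add "$work/my-library" my-library
git -C "$work/fencing-game" commit -q -m "Add library as submodule"

# Shows the pinned library commit and its path.
git -C "$work/fencing-game" submodule status
```

The main project then records only a pointer to one library commit (plus .gitmodules), so the build system has to be told about the extra source root under my-library/src.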

johnb003 answered Sep 29 '22 11:09