
Tracking 3rd party code with Git

I can't seem to grok the different solutions I've found and studied for tracking external code. Let alone understand how to apply them to my use case...

Would you guys be so kind as to shed some light on this and help me with my specific use case? What would be the best solution for the following concrete problem? (I'm not going to attempt to generalize my problem, since I might make wrong assumptions, especially since I'm so new to all this...)

I'm building a website in Django (a web framework in Python). There are a lot of 3rd party plugins available for use with Django (Django calls them 'apps') that you can drop into your project. Some of these apps might require a bit of modification to get them working the way I want. But if you start making modifications to 3rd party code, you introduce the problem of updating that code when newer versions appear AND at the same time keeping your local modifications.

So, the way I would do that in Subversion is by using vendor branches. My repository layout would look like this:

    /trunk
      ...
      /apps
        /blog-app
        ...
    /tags
      ...
    /branches
      ...
    /vendor
      /django-apps
        /blog-app
          /1.2
          /1.3
          /current
        /other-app
          /3.2
          /current

In this case /trunk/apps/blog-app would have been svn copy'd from one of the tags in /vendor/django-apps/blog-app, say v1.2. Suppose I now want to upgrade the version in trunk to v1.3. As you can see, I have already updated /vendor/django-apps/blog-app/current (using svn_load_dirs) and 'tagged' it (svn copy) as /vendor/django-apps/blog-app/1.3. Now I can update /trunk/apps/blog-app by svn merge'ing the changes between /vendor/django-apps/blog-app/1.2 and /vendor/django-apps/blog-app/1.3 into /trunk/apps/blog-app. This will keep my local changes. (For people unfamiliar with this process, it is described in the Subversion handbook: http://svnbook.red-bean.com/en/1.5/svn.advanced.vendorbr.html)

Now I want to do this whole process in Git. How can I do this?

Let me re-iterate the requirements:

  • I must be able to place the external code in an arbitrary position in the tree
  • I must be able to modify the external code and keep (commit) these modifications in my Git repos
  • I must be able to easily update the external code, should a new version be released, whilst keeping my changes

Extra (for bonus points ;-) ):

  • Preferably I want to do this without something like svn_load_dirs. I think it should be possible to track the apps and their updates straight from their repository (most 3rd party Django apps are kept in Subversion). That would give me the added benefit of being able to view individual commit messages between releases, and of fixing merge conflicts more easily, since I can deal with a lot of small commits instead of the one artificial commit created by svn_load_dirs. I think one would do this with svn:externals in Subversion, but I have never worked with that before...

A solution where a combination of both methods could be used would be even better, since there might be app developers who don't use source control or don't make their repos publicly available. (Meaning both svn_load_dirs-like behavior and tracking straight from a Subversion repository (or another Git repository).)

I think I would have to use subtrees, submodules, rebase, branches, ... or a combination of those, but smack me down if I know which one(s) or how to do it :S

I'm eagerly awaiting your responses! Please be as verbose as possible when replying, since I already had a hard time understanding other examples found online.

Thanks in advance

hopla asked Mar 24 '09 13:03



1 Answer

There are two separate problems here:

  1. How do you maintain local forks of remote projects, and
  2. How do you keep a copy of remote projects in your own tree?

Problem 1 is pretty easy by itself. Just do something like:

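    git clone git://example.com/foo.git
    cd foo
    # keep the original project reachable under the name "upstream"
    git remote add upstream git://example.com/foo.git
    # point "origin" at your own fork instead
    git remote rm origin
    git remote add origin ssh://.../my-forked-foo.git
    # name the branch explicitly; recent Git versions refuse a bare "git push origin" when no upstream branch is configured
    git push -u origin master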

You can then work on your forked repository normally. When you want to merge in upstream changes, run:

    git pull upstream master
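To make the fork workflow above concrete, here is a minimal sketch that runs entirely on local throwaway repositories. The directory names (upstream, fork.git, foo), the file names and the demo identity are all made up for illustration; they stand in for the real URLs. I pass --no-rebase and --no-edit to git pull because recent Git versions ask how to reconcile divergent branches otherwise; that is my addition, not part of the commands above.

```shell
#!/bin/sh
# Sketch only: local paths stand in for git://example.com/foo.git and the fork URL.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical upstream project with a single commit on master.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
echo "v1" > app.txt
git add app.txt
git commit -q -m "upstream: v1"
cd ..

# Bare repository playing the role of my-forked-foo.git.
git init -q --bare fork.git

# Clone upstream, then re-point origin at the fork.
git clone -q "$work/upstream" foo
cd foo
git remote add upstream "$work/upstream"
git remote rm origin
git remote add origin "$work/fork.git"
git push -q -u origin master

# A local modification, committed in the fork.
echo "local tweak" > local.txt
git add local.txt
git commit -q -m "fork: local tweak"

# Upstream releases a new version...
cd ../upstream
echo "v2" > app.txt
git commit -q -am "upstream: v2"

# ...and the fork merges it while keeping the local change.
cd ../foo
git pull -q --no-rebase --no-edit upstream master
cat app.txt local.txt
```

After the last step, app.txt carries the upstream change and local.txt is still there, joined by a merge commit.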

As for problem 2, one option is to use submodules. For this, cd into your main project, and run:

    git submodule add ssh://.../my-forked-foo.git local/path/for/foo

If I use git submodules, what do I need to know?

You may find git submodules to be a little bit tricky at times. Here are some things to keep in mind:

  1. Always commit the submodule before committing the parent.
  2. Always push the submodule before pushing the parent.
  3. Make sure that the submodule's HEAD points to a branch before committing to it. (If you're a bash user, I recommend using git-completion to put the current branch name in your prompt.)
  4. Always run 'git submodule update' after switching branches or pulling changes.

You can work around (4) to a certain extent by using an alias created by one of my coworkers:

    git config --global alias.pull-recursive '!git pull && git submodule update --init'

...and then running:

    git pull-recursive
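Rules 1 to 3 from the list above can be sketched roughly like this, again on local throwaway repositories. The names lib-src, lib.git, parent and vendor/lib are hypothetical. The -c protocol.file.allow=always setting is only there because this demo adds a submodule from a local path, which newer Git versions block by default; with a real ssh:// URL you would not need it.

```shell
#!/bin/sh
# Sketch only: lib.git stands in for my-forked-foo.git, parent for the main project.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical forked library, published as a bare repository.
git init -q lib-src
cd lib-src
git symbolic-ref HEAD refs/heads/master
echo "lib v1" > lib.txt
git add lib.txt
git commit -q -m "lib: v1"
cd ..
git clone -q --bare lib-src lib.git

# Parent project that embeds the library as a submodule.
git init -q parent
cd parent
git symbolic-ref HEAD refs/heads/master
git -c protocol.file.allow=always submodule --quiet add "$work/lib.git" vendor/lib
git commit -q -m "parent: add lib as a submodule"

# Rule 3: make sure the submodule is on a branch before committing to it.
cd vendor/lib
git checkout -q master
echo "local change" >> lib.txt

# Rule 1: commit in the submodule first...
git commit -q -am "lib: local change"
# Rule 2: ...and push the submodule before the parent.
git push -q origin master

# Only then record the new submodule commit in the parent.
cd ../..
git add vendor/lib
git commit -q -m "parent: point vendor/lib at the local change"
git submodule status
```

The parent only stores a pointer to a submodule commit, which is why that commit has to exist (and be pushed) before the parent refers to it.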

If git submodules are so tricky, what are the advantages?

  1. You can check out the main project without checking out the submodules. This is useful when the submodules are huge, and you don't need them on certain platforms.
  2. If you have experienced git users, it's possible to have multiple forks of your submodule, and link them with different forks of your main project.
  3. Someday, somebody might actually fix git submodules to work more gracefully. The deepest parts of the submodule implementation are actually quite good; it's just the upper-level tools that are broken.

git submodules aren't for me. What next?

If you don't want to use git submodules, you might want to look into git merge's subtree strategy. This keeps everything in one repository.
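Here is a rough sketch of what the subtree approach can look like, following the pattern from Git's own subtree-merge howto. Everything named here (blog-app, site, apps/blog-app, VERSION, local_settings.py) is invented for the demo, and I'm assuming a reasonably recent Git, since --allow-unrelated-histories does not exist in old versions.

```shell
#!/bin/sh
# Sketch only: blog-app is a hypothetical upstream, site is your own project.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical upstream app at version 1.2.
git init -q blog-app
cd blog-app
git symbolic-ref HEAD refs/heads/master
echo "1.2" > VERSION
git add VERSION
git commit -q -m "blog-app 1.2"
cd ..

# Your own project.
git init -q site
cd site
git symbolic-ref HEAD refs/heads/master
echo "my site" > README
git add README
git commit -q -m "site: initial commit"

# Import upstream under apps/blog-app, recording its history as a merge parent.
git remote add blog-app "$work/blog-app"
git fetch -q blog-app
git merge -q -s ours --no-commit --allow-unrelated-histories blog-app/master
git read-tree --prefix=apps/blog-app/ -u blog-app/master
git commit -q -m "Import blog-app 1.2 under apps/blog-app"

# A local modification inside the imported tree.
echo "patched" > apps/blog-app/local_settings.py
git add apps/blog-app
git commit -q -m "apps/blog-app: local modification"

# Upstream releases 1.3...
cd ../blog-app
echo "1.3" > VERSION
git commit -q -am "blog-app 1.3"

# ...and the project merges it into the subdirectory, keeping the local change.
cd ../site
git fetch -q blog-app
git merge -q --no-edit -X subtree=apps/blog-app blog-app/master
cat apps/blog-app/VERSION
ls apps/blog-app
```

The external code lives at an arbitrary path, local changes are ordinary commits in the same repository, and upstream commits stay visible in the history.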

What if the upstream repository uses Subversion?

This is pretty easy if you know how to use git svn:

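    # -s assumes the standard trunk/branches/tags layout
    git svn clone -s https://example.com/foo
    cd foo
    git remote add origin ssh://.../my-forked-foo.git
    # name the branch explicitly; recent Git versions refuse a bare "git push origin" when no upstream branch is configured
    git push origin master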

Then set up a local tracking branch in git.

    git push origin master:local-fork
    git checkout -b local-fork origin/local-fork

Then, to merge from upstream, run:

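    git svn fetch
    # with Git 2.0 or newer, git svn puts its refs under origin/ by default, so this may need to be origin/trunk
    git merge trunk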

(I haven't tested this code, but it's more-or-less how we maintain one submodule with an upstream SVN repository.)

Don't use git svn rebase, because it will make it very difficult to use git submodule in the parent project without losing data. Just treat the Subversion branches as read-only mirrors of upstream, and merge from them explicitly.

If you need to access the upstream Subversion repository on another machine, try:

    git svn init -s https://example.com/foo
    git svn fetch

You should then be able to merge changes from upstream as before.

emk answered Sep 17 '22 22:09
