
Best practices for git repositories on open source projects

I'm contributing to a fairly small open source project hosted on GitHub. So that other people can take advantage of my work, I've created my own fork on GitHub. Despite GitHub's choice of terminology, I don't wish to diverge completely from the main project. However, I neither expect nor want all of my work to be accepted into the main repository. Some of it, however, has already been merged into the main repository, and I expect this to continue. The problem I am running into is how best to keep our two trees in a state where code can be shared between them easily.

Some situations I have encountered or expect to encounter include:

  • I commit code that is later accepted into the main repository. When I pull from the main repository in the future, my commit is duplicated in my repository.
  • I commit code that is never accepted into the main repository. When I pull from the main repository in the future, the two trees have diverged and fixing it is hard.
  • Another person comes along and bases their work on my repository. Thus, I should, if at all possible, avoid changing commits that I have already pushed, for example by using git rebase.
  • I wish to submit code to the main repository. Ideally, my changes should be easy to turn into patches (preferably using git format-patch) that apply directly and cleanly to the main repository.

As far as I can tell there are two, or possibly three, ways to handle this, none of which works particularly well:

  • Frequently run git rebase to keep my changes based off the head of the upstream repository. In this way I can eliminate duplicated commits but often have to rewrite history, causing problems for people wanting to derive their work from mine.
  • Frequently merge the upstream repository changes into mine. This works ok on my end but does not seem to make it easy to submit my code to the upstream repository.
  • Use some combination of these and possibly git cherry-pick to keep things in order.

What have other people done in this situation? I know my situation is analogous to the relationship between various kernel contributors and Linus's main repository, so hopefully there are good ways to handle this. I'm fairly new to git, though, so I haven't mastered all its nuances. Finally, especially because of GitHub, my terminology may not be entirely consistent or correct. Feel free to correct me.

asked Sep 05 '09 03:09 by orangejulius


1 Answer

Some tips I've learned from a similar situation:

  • Have a remote tracking branch for the upstream author's work.
  • Pull changes from this tracking branch into your master branch every so often.
  • Create a new branch for each of the topics you're working on. These branches should generally be local only. When you get changes from upstream into master, rebase your topic branches to reflect these changes.
  • When you're done with some topic work, merge it into master. This way, people who are deriving work from yours will not see much rewritten history, since the rebasing occurred only in your local topic branches.
  • Submitting changes: Your master branch will basically be a series of commits, some of which are the same as upstream and the rest of which are yours. The latter can be sent as patches if you want to.
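
As a rough sketch of the tips above (not the only way to do it), the script below plays the whole cycle out in a throwaway sandbox. The repository names upstream and fork, the branch name topic/feature, and the file names are all made up for illustration; in a real setup the upstream remote would point at the main project's URL instead of a local directory.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; nothing here touches a real repository.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the main project (hypothetical content).
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo base > upstream/README
git -C upstream add README
git -C upstream commit -q -m "upstream: initial commit"

# Your clone. Renaming the remote gives a tracking branch called
# upstream/master for the upstream author's work.
git clone -q upstream fork
cd fork
git remote rename origin upstream

# One local-only branch per topic.
git checkout -q -b topic/feature master
echo feature > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile, the main project moves on.
echo more >> ../upstream/README
git -C ../upstream commit -q -am "upstream: update README"

# Every so often: bring upstream's work into your master.
git fetch -q upstream
git checkout -q master
git merge -q --ff-only upstream/master

# Rebase the still-private topic branch onto the updated master.
git checkout -q topic/feature
git rebase -q master

# Topic finished: merge it into master. Nothing that was already
# published on master gets rewritten by this step.
git checkout -q master
git merge -q topic/feature

git --no-pager log --oneline --graph master
```

The rebase only ever touches topic/feature, which nobody else has seen, so anyone following your master sees new commits being added and never sees old ones change.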

Of course, the choice of branch and remote names is your own. I'm not sure these tips cover every part of the scenario, but they have covered most of my hurdles.
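
For the tip on submitting changes, one possible way to tell which of your commits upstream already has, and to produce patches for only the rest, is git cherry together with git format-patch --ignore-if-in-upstream. This is a sandbox sketch under the same assumptions as before: the repository names, file names, and output directories are hypothetical, and upstream accepting a patch is simulated here with git am.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the main project (hypothetical content).
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo base > upstream/README
git -C upstream add README
git -C upstream commit -q -m "upstream: initial commit"

# Your clone, with two commits of your own on master.
git clone -q upstream fork
cd fork
git remote rename origin upstream
echo one > one.txt
git add one.txt
git commit -q -m "Add one"
echo two > two.txt
git add two.txt
git commit -q -m "Add two"

# Simulate upstream doing its own work and then accepting only
# "Add one" as a mailed patch. The applied commit gets a new hash.
git format-patch -q -o "$sandbox/accepted" -1 HEAD~1
echo more >> ../upstream/README
git -C ../upstream commit -q -am "upstream: update README"
git -C ../upstream am -q "$sandbox"/accepted/*.patch

git fetch -q upstream

# '-' marks commits whose change already exists upstream (matched by
# patch content, not by hash); '+' marks commits that are still yours.
git cherry -v upstream/master master

# Write patches only for the commits upstream does not have yet.
git format-patch -q --ignore-if-in-upstream \
    -o "$sandbox/to-send" upstream/master..master
ls "$sandbox/to-send"
```

Because the comparison is by patch content, a commit that upstream applied under a different hash is still recognised as accepted, as long as upstream applied it without modifying the change.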

answered Sep 20 '22 15:09 by sykora