
Git good practices to keep a forked project up-to-date with its source origin when both are evolving

Context

I'm the main author of Next Right Now (NRN), an open source "boilerplate" containing several "presets" for building a web app with the Next.js framework. Each preset comes with built-in features and is meant to be forked so others can build their app on top of it. Each preset lives in its own git branch, such as:

  • https://github.com/UnlyEd/next-right-now/tree/v2-mst-aptd-at-lcz-sty
  • https://github.com/UnlyEd/next-right-now/tree/v2-mst-aptd-gcms-lcz-sty

I work on NRN and make it evolve regularly. But I have also forked one of the available NRN presets and built my own proprietary app from it.

Definitions

Here are some definitions in order to avoid terminology misunderstandings.

  • Fork: An NRN preset forked into another project, whether open-source or proprietary.
  • Source: The NRN preset that was used to generate the Fork. (as an example, let's say the Source git branch is https://github.com/UnlyEd/next-right-now/tree/v2-mst-aptd-at-lcz-sty, which is the NRN preset I've used to create my Fork)

Problem

The problem with this way of doing things is that I'm not sure how to keep the "Fork" in sync with the NRN boilerplate preset. Both evolve in their own way. Also, NRN is not a framework but a boilerplate, which is meant to be overridden to customize the base code, and this eventually leads to lots of conflicts between a Fork and the Source.

What I've been doing so far

To keep my Fork synced with the latest changes on the Source, I rebase my own work on top of the Source git history (e.g., git rebase NRN-v2-mst-aptd-at-lcz-sty).

This has the following advantages (pros):

  • It keeps the history clean and simple to understand/compare. I can easily tell which commit I last synced from the Source by comparing both histories. All the work done in the Fork sits on top of what's been done in the Source.
  • The git tree is separated in two distinct parts, the Source commits tree and the Fork commit tree.
  • I can sync the new changes done from the Source into the Fork by using git rebase to get my Fork up-to-date and then push --force to override the remote.
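
For illustration, the workflow described above could look like the following sketch. It runs entirely on throwaway local repositories: `source` stands in for the NRN preset branch, `origin.git` for the Fork's remote, and the remote name `nrn` and branch name `main` are assumptions made for this example only. It also uses `--force-with-lease` rather than a plain `--force`, which is a slightly safer variant and not what I originally typed.

```shell
#!/bin/sh
# Sketch of the rebase-based sync, on throwaway local repositories.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# "source" stands in for the NRN preset branch (hypothetical layout).
git init -q "$work/source"
cd "$work/source"
echo base > preset.txt
git add preset.txt
git commit -q -m "Source: initial preset"
git branch -M main

# "origin.git" stands in for the Fork's remote; "fork" is its working clone.
git clone -q --bare "$work/source" "$work/origin.git"
git clone -q "$work/origin.git" "$work/fork"
cd "$work/fork"
echo app > app.txt
git add app.txt
git commit -q -m "Fork: proprietary work"
git push -q origin main

# The Source evolves independently.
cd "$work/source"
echo update >> preset.txt
git commit -q -am "Source: new feature"

# Sync: replay the Fork's commits on top of the latest Source history.
cd "$work/fork"
git remote add nrn "$work/source"
git fetch -q nrn
git rebase -q nrn/main

# The Fork's history was rewritten, so a plain push would be rejected.
# --force-with-lease refuses to overwrite remote commits not fetched yet.
git push -q --force-with-lease origin main
git log --oneline
```

After the rebase, the Source commits come first in the log and the Fork commit sits on top, but the Fork commit has a new id, which is what forces the overwrite of the remote.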

But also a few disadvantages (cons):

  • Syncing is manageable when the Fork has a single branch, but it gets very messy as soon as there are several. Because rebasing rewrites the git history, ongoing work on other "feature" branches in the Fork becomes hard to handle: first I need to rebase Fork:master, and then rebase every other branch onto Fork:master. If I do it the wrong way around, it messes up the whole tree (I made that mistake once, and it cost me 2 painful hours of rebasing with --force everywhere).
  • It uses --force on the Fork:master branch, which is not so great IMHO and can lead to quite a bit of trouble if not handled correctly. I'm reasonably familiar with what I'm doing, but this wouldn't be viable if there were more people on the team.
  • Overall, I'm not confident in my own ability not to mess it up someday.
  • It doesn't feel suited to a team; IMHO it only works because I'm working solo on this.
  • When there are conflicts, they can be painful to resolve, and I have made mistakes on several occasions.
  • The git history is untrustworthy: my Fork working branches get their commit history rewritten on every sync with the Source, and all GitHub comments lose their usefulness because they no longer match any commit.
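
To make the ordering issue concrete, here is a minimal sketch of the "master first, then every feature branch" sequence, again on throwaway local repositories. The branch names `main` and `feature`, and using `origin` as the Source remote, are assumptions for this example. It uses `git rebase --onto` with the pre-rebase position of `main`, which is one possible way to move only the feature's own commits.

```shell
#!/bin/sh
# Sketch: rebase the Fork's main branch first, then a feature branch.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Source repository (stand-in for the NRN preset).
git init -q "$work/source"
cd "$work/source"
echo base > preset.txt
git add preset.txt
git commit -q -m "Source: initial preset"
git branch -M main

# Fork with one commit on main and one feature branch on top of it.
git clone -q "$work/source" "$work/fork"
cd "$work/fork"
echo app > app.txt
git add app.txt
git commit -q -m "Fork: app work"
git checkout -q -b feature
echo wip > feature.txt
git add feature.txt
git commit -q -m "Fork: feature work"

# The Source evolves.
cd "$work/source"
echo update >> preset.txt
git commit -q -am "Source: new feature"

cd "$work/fork"
git fetch -q origin
# Remember where main pointed before its history is rewritten.
old_main=$(git rev-parse main)

# Step 1: rebase the Fork's main branch onto the Source.
git checkout -q main
git rebase -q origin/main

# Step 2: replay only the commits in old_main..feature onto the new main.
git rebase -q --onto main "$old_main" feature
git log --oneline --graph --all
```

Every additional feature branch needs its own step 2, which is where the bookkeeping becomes error-prone.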

Using rebase, I eventually had to wipe my whole working branch and recreate it from the Source by cherry-picking all commits I had done in the Fork, because the history didn't match anymore and I needed a clean start. This happened after I made a few mistakes by rebasing the wrong way around.

What I'm looking for

My current way works fine, as long as I'm solo, as long as I know my git branches well, as long as I don't mess it up by rebasing and pushing with --force the wrong way around. It doesn't satisfy me, though.

I'm looking for a better way, which would be usable for a team, and which I could use as the "officially recommended" way to keep a Fork in sync with its Source for NRN.

Alternatives

I've thought about cherry-picking commits from the Source into my Fork, but I'm not sure it's a better alternative, because it would mix Source and Fork commits together (no more separation between the two). This would eventually make it hard to compare both trees and figure out which commits have been cherry-picked and which haven't. Also, it doesn't protect me from forgetting to cherry-pick one commit and running into trouble weeks later, which might lead to using --force to rewrite the history and include the missing commits at the right place.
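
For reference, a minimal sketch of that cherry-pick approach on throwaway local repositories is shown below. The branch name `main` and the use of `origin` as the Source remote are assumptions for this example. `git cherry-pick -x` records the original commit id in the new commit message, and `git cherry` compares patches to list which Source commits already have an equivalent in the Fork, which may help with the "which ones did I pick" problem but does not restore the separation between the two histories.

```shell
#!/bin/sh
# Sketch: cherry-pick one Source commit and list what is still missing.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Source repository (stand-in for the NRN preset).
git init -q "$work/source"
cd "$work/source"
echo base > preset.txt
git add preset.txt
git commit -q -m "Source: initial preset"
git branch -M main

# Fork with its own commit.
git clone -q "$work/source" "$work/fork"
cd "$work/fork"
echo app > app.txt
git add app.txt
git commit -q -m "Fork: app work"

# The Source gains two commits.
cd "$work/source"
echo fix > fix.txt
git add fix.txt
git commit -q -m "Source: bug fix"
echo feat > feat.txt
git add feat.txt
git commit -q -m "Source: new feature"

cd "$work/fork"
git fetch -q origin
# Pick only the bug fix; -x appends "(cherry picked from commit ...)".
git cherry-pick -x origin/main~1 >/dev/null

# "-" marks Source commits with an equivalent patch in the Fork,
# "+" marks Source commits that are still missing.
git cherry -v main origin/main
```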

I haven't considered any other alternative, as I don't know any.


So, I'm looking for "best practices" for my particular use-case. I'm pretty sure Git has some awesome ways of dealing with that, which are unknown to me.

Asked Oct 16 '22 00:10 by Vadorequest


1 Answer

I see several options:

Merge

As some have suggested, the simplest option to "update" the fork with new commits from the original (root) repository is to merge. This ensures that:

  • you easily get the latest fixes from the root repository (library/framework)
  • your commits in the fork and the ones in the root are cleanly separated

I would discourage rebase for this particular problem. As you mention, it effectively rewrites the history of your forked repository, and that can affect other developers working there, feature branches (even on a single-developer repo), and so on.

If you have to bring patches in the opposite direction, fork -> root, you would then use git cherry-pick.
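
A minimal sketch of the merge approach, on throwaway local repositories, could look like this. The remote name `upstream` and the branch name `main` are assumptions for this example; in a real fork, `upstream` would point at the root repository.

```shell
#!/bin/sh
# Sketch: pull root-repository changes into the fork with a merge.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Root repository (stand-in for the boilerplate).
git init -q "$work/source"
cd "$work/source"
echo base > preset.txt
git add preset.txt
git commit -q -m "Root: initial preset"
git branch -M main

# Fork with its own commit; the root is tracked as "upstream".
git clone -q "$work/source" "$work/fork"
cd "$work/fork"
git remote rename origin upstream
echo app > app.txt
git add app.txt
git commit -q -m "Fork: app work"
before=$(git rev-parse HEAD)

# The root evolves.
cd "$work/source"
echo update >> preset.txt
git commit -q -am "Root: new feature"

# Update the fork: fetch and merge, no history rewrite, no --force.
cd "$work/fork"
git fetch -q upstream
git merge -q --no-edit upstream/main
git log --oneline --graph
```

The existing fork commit keeps its id (it becomes the first parent of the merge commit), so other branches and review comments that reference it stay valid.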

git submodule

Another option is to have the base library/framework as a git submodule in the fork. In its standard form, a git submodule is just a pointer to a specific commit in another repository. Histories are separated, as they are indeed two different repositories.

To integrate new changes from the base, you just need to repoint the git submodule to the new commit.

One important note: this would only work well if your forked repository doesn't touch the files of the root repository.
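
As a rough sketch of the submodule approach, on throwaway local repositories: the path `vendor/nrn` and the branch name `main` are assumptions for this example, and the `protocol.file.allow=always` setting is only there because the demo uses a local path as the submodule URL (recent git versions restrict that by default).

```shell
#!/bin/sh
# Sketch: track the root repository as a submodule and repoint it.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Root repository (stand-in for the base library/framework).
git init -q "$work/source"
cd "$work/source"
echo base > preset.txt
git add preset.txt
git commit -q -m "Root: initial preset"
git branch -M main

# The fork is its own repository with its own files.
git init -q "$work/fork"
cd "$work/fork"
echo app > app.txt
git add app.txt
git commit -q -m "Fork: app work"
git branch -M main

# Add the root as a submodule (vendor/nrn is a hypothetical path).
git -c protocol.file.allow=always submodule --quiet add "$work/source" vendor/nrn
git commit -q -m "Fork: add the root as a submodule"

# The root evolves.
cd "$work/source"
echo update >> preset.txt
git commit -q -am "Root: new feature"

# Repoint the submodule to the new commit and record that in the fork.
cd "$work/fork"
git -C vendor/nrn fetch -q origin
git -C vendor/nrn checkout -q origin/main
git add vendor/nrn
git commit -q -m "Fork: bump the root submodule"
git submodule status
```

The fork's own history only ever records which root commit it points at, so there is nothing to rebase or merge on the fork side.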

git subtree

I am not familiar enough with git subtree to judge it, but you should probably have a look at it too, as it sounds like another viable solution.

Answered Oct 17 '22 13:10 by msune