
Git's performance with big commits vs. tiny commits

All the coding standards and good practices talk aside, how does Git itself technically deal with huge commits vs. small commits? For example, is Git smarter with branch merges (e.g. fewer conflicts) in either case, does garbage collection become more efficient, or something similar? Or is there no difference at all?

To be clear, I mean the scenario where code is being modified from A to B. The "huge commit" changes the code straight from A to B, while the "small commit" approach has a lot of intermediate commits (say, one for each small feature change) but eventually ends up at the exact same B.

Henrik Paul asked May 04 '12 06:05



1 Answer

The "many small commits" will use slightly more space, but the difference is likely not worth paying the price of lost history.

The effect on the packfile itself would depend on how many changes were later overwritten or undone. For instance, if a later commit touches lines that an earlier commit also touched, the intermediate state of those lines would not appear in history with one big commit, and thus there would be no object in the packfile to represent it.
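To make the storage point concrete, here is a minimal Python sketch, not Git's actual implementation. It computes blob IDs the way Git does for the default SHA-1 object format (a `blob <size>\0` header followed by the content) and counts the distinct blobs each approach would create for a single file. The names `blob_id` and `stored_blobs` and the sample contents are hypothetical, and the sketch ignores tree objects, commit objects, and the delta compression that packfiles apply.

```python
import hashlib


def blob_id(content: bytes) -> str:
    # Git's blob ID (SHA-1 object format): hash of "blob <size>\0" + content.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def stored_blobs(snapshots):
    # Hypothetical helper: distinct blob IDs created by committing each
    # snapshot of one file in turn. Identical content is stored only once.
    return {blob_id(s) for s in snapshots}


A = b"line one\nline two\n"
MID = b"line one (draft)\nline two\n"  # intermediate state, overwritten later
B = b"line one (final)\nline two\n"

big = stored_blobs([A, B])         # one big commit: A -> B
small = stored_blobs([A, MID, B])  # small commits: A -> MID -> B

print(len(big), len(small))  # 2 3
```

The extra blob in the second case is the intermediate state. In a real repository it would typically be stored as a small delta against a neighbouring version, which is why the overhead tends to stay modest.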

I can't imagine that the number of commits on a branch would reduce or increase the complexity of the merge, but I cannot speak with authority on that, since I haven't examined exactly how the typical recursive merge strategy works. My understanding is that it is equivalent to a diff3 against the merge base, that is, the last point at which the two branches were merged (or split). In that case, one large commit would be no more or less efficient than many small commits.
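As a rough illustration of that understanding, the sketch below does a line-wise three-way merge in Python. It is a deliberate simplification rather than Git's merge code: it assumes all three versions have the same number of lines, and `three_way_merge` and the sample data are hypothetical. The point is only that the inputs are the base and the two branch tips, so the intermediate commits never enter the calculation.

```python
def three_way_merge(base, ours, theirs):
    # Simplified line-wise three-way merge. Assumes equal line counts.
    # Returns (merged_lines, conflict_indexes); a conflicting line is None.
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only their side changed the line
            merged.append(t)
        elif t == b:      # only our side changed the line
            merged.append(o)
        else:             # both sides changed it differently
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts


base = ["a", "b", "c"]
theirs = ["a", "b", "C2"]

# The same final state reached through three small commits or one big one.
small_commits = [["a1", "b", "c"], ["a2", "b", "c"], ["A", "b", "c"]]
one_big_commit = [["A", "b", "c"]]

result_small = three_way_merge(base, small_commits[-1], theirs)
result_big = three_way_merge(base, one_big_commit[-1], theirs)

print(result_small == result_big)  # True
print(result_small)                # (['A', 'b', 'C2'], [])
```

Only the last entry of each history is passed to the merge, so both histories produce the same result and the same conflicts.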

There is another aspect to consider, depending on the timing. If you are working on a branch, and regularly merging the upstream branch between your own commits, then many small commits will produce fewer conflicts because you will maintain better parity with the upstream branch, and thus deviate from it less. This will certainly cause fewer problems when it is time to merge back.

Garbage collection would be largely unaffected, since neither approach inherently produces dangling commits or other unreachable objects.

All in all, though, the packfiles are usually efficient enough when dealing with text that the benefit of being able to view a complete and unadulterated history often outweighs the cost of the extra space it takes.

Greyson answered Sep 28 '22 07:09
