<ul> <li>Some operations in Git (e.g. <code>checkout</code>) appear to assume that a commit is a snapshot or state of the working tree.</li> <li>Other operations in Git (e.g. <code>rebase</code>) appear to assume that a commit is a change: a kind of operator that can be applied to working trees.</li> </ul> So what is a Git commit, really?


A commit in Git: Is it a snapshot/state/image or is it a change/diff/patch/delta?


Understand the Git particle/wave duality

Short answer: both.

Medium answer: It depends.

Long answer: Git is a bit like quantum phenomena: Neither of the two views alone can explain all observations. Read on.

Internally, Git will use both representations, depending (conceptually) on which one it deems more efficient in terms of storage space and execution time for a given commit. The snapshot representation is the primary one.

From the user's point of view, however, it depends on what you do:

Duality 1: Commit as a snapshot vs. commit as a change

Some commands only make sense when you think of commits as snapshots of the working tree. This is most pronounced for checkout, but is also true for stash and at least halfway for fetch and reset.

For other commands, madness is the likely result when you try to think of commits in this manner. For those other commands, commits are clearly treated as changes,

either in the form of patches you can look at (e.g. show, diff)
or in the form of operators you can apply to modify your working tree (e.g. apply, cherry-pick, pull)
or in the form of operators you can apply to modify other commits (e.g. rebase)
or in the form of operators you can apply to create new commits (e.g. merge, cherry-pick)
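
To make the relation between the two views concrete, here is a minimal toy sketch in Python. It is not how Git is implemented: a snapshot is modelled as a plain dict from path to file content, and the names `diff` and `apply_change` are hypothetical helpers invented for this illustration. The point is only that a "change" can be derived from two snapshots, and that applying it to the older snapshot yields the newer one.

```python
# Toy model (an assumption for illustration): a snapshot maps paths to contents.

def diff(old, new):
    """Derive a change from two snapshots: path -> new content, None if deleted."""
    change = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            change[path] = new.get(path)
    return change

def apply_change(base, change):
    """Treat the change as an operator and apply it to a snapshot."""
    result = dict(base)
    for path, content in change.items():
        if content is None:
            result.pop(path, None)   # the file was deleted
        else:
            result[path] = content   # the file was added or modified
    return result

parent = {"README": "hello\n", "main.py": "print(1)\n"}
commit = {"README": "hello\n", "main.py": "print(2)\n", "LICENSE": "MIT\n"}

change = diff(parent, commit)               # the commit seen as a change
rebuilt = apply_change(parent, change)      # the commit seen as a snapshot again
```

In this model the snapshot is the stored thing and the change is always computed relative to a parent, which roughly matches the "snapshot is primary" remark above.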

Duality 2: Commit as a fixed thing vs. commit as something fluid

There is a side effect of duality 1 that can shock Git newbies accustomed to other version control systems. It is the fact that Git appears to not even commit itself to its commits.

Huh?

Assume you have created a branch X containing what you like to think of as your commits A and B. But master has progressed a little, so you rebase X onto master.

When you think of A and B as changes, but of master as a snapshot (hey, particles and waves in a single experiment!), this is not a problem: Just apply the changes A and B to the snapshot master.

This thinking is so natural that you will barely notice that Git has now rewritten your commits A and B: They now have different parents and different snapshot content, and hence different SHA-1 IDs. In Git, the conceptual commit that you think of as a developer is not a fixed-for-all-times kind of thing, but rather some fluid object that changes as a result of working with your repository.

In contrast, if you think of all three (A, B, and master) as snapshots or of all three as changes, your brain will hurt and you will get nowhere.
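
The rebase scenario can be sketched with the same kind of toy model. Everything here is an assumption made for illustration: commits are dicts, the ID is a SHA-1 over a JSON dump of parent ID and snapshot (not Git's real object format), `replay` is a hypothetical name, and conflicts are ignored entirely. The sketch treats A and B as changes and master as a snapshot, and shows that the replayed commits carry the same changes but get new IDs.

```python
import hashlib
import json

def make_commit(parent, snapshot):
    """Toy commit: its ID depends on the parent's ID and on the snapshot."""
    parent_id = parent["id"] if parent else None
    payload = json.dumps([parent_id, snapshot], sort_keys=True).encode()
    return {"id": hashlib.sha1(payload).hexdigest(),
            "parent": parent, "snapshot": snapshot}

def diff(old, new):
    """Change between two snapshots: path -> new content, None if deleted."""
    return {p: new.get(p) for p in old.keys() | new.keys()
            if old.get(p) != new.get(p)}

def apply_change(base, change):
    """Apply a change to a snapshot (no conflict handling in this sketch)."""
    result = {p: c for p, c in base.items() if p not in change}
    result.update({p: c for p, c in change.items() if c is not None})
    return result

def replay(commit, new_parent):
    """Re-create a commit's change on top of a different parent."""
    change = diff(commit["parent"]["snapshot"], commit["snapshot"])
    return make_commit(new_parent, apply_change(new_parent["snapshot"], change))

root = make_commit(None, {"app.py": "v1\n"})
a = make_commit(root, {"app.py": "v1\n", "a.txt": "A\n"})
b = make_commit(a, {"app.py": "v1\n", "a.txt": "A\n", "b.txt": "B\n"})
master = make_commit(root, {"app.py": "v1\n", "docs.md": "new\n"})

a2 = replay(a, master)   # "A" as a change, applied to the snapshot master
b2 = replay(b, a2)       # "B" as a change, applied on top of the new A
```

Here `a2` and `b2` contain the same changes as `a` and `b`, yet their snapshots, parents, and IDs all differ from the originals, which is the "fluid commit" effect described above.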

Disclaimer

The above is a much-simplified description. In Git reality,

a commit is not a snapshot at all: it is a piece of metadata (the who/when/why of a snapshot) plus pointers to a snapshot and to its parent commit(s);
the snapshot is called a tree in Git lingo;
the commits-as-changes internal representation uses packfiles;
some of the above-mentioned commands have further roles that do not fit the same characterization;
and even for the given roles it is to some degree a matter of taste into which category (or categories) certain commands belong.
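
The first two points can be made tangible with a small Python sketch of how a commit ID is derived in a SHA-1 repository: the ID is the SHA-1 of a short header plus the commit's text, and that text names a tree and the parent commit(s). The helper names `git_object_id` and `commit_body` and the author line are hypothetical choices for this illustration. The sketch also assumes the default SHA-1 object format and leaves out optional headers such as signatures.

```python
import hashlib

def git_object_id(kind, body):
    """SHA-1 over '<kind> <size>' + NUL + body, as used for Git object IDs."""
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree_id, parent_ids, who, message):
    """Text of a commit object: pointers and metadata, then the message."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode()

# Hypothetical identity and timestamp, used for every commit below.
who = "A U Thor <author@example.com> 1700000000 +0000"

empty_tree = git_object_id("tree", b"")   # ID of a tree with no entries

root = git_object_id("commit", commit_body(empty_tree, [], who, "root"))
other = git_object_id("commit", commit_body(empty_tree, [], who, "other root"))

# Same tree, same author, same message; only the parent pointer differs.
on_root = git_object_id("commit", commit_body(empty_tree, [root], who, "add feature"))
on_other = git_object_id("commit", commit_body(empty_tree, [other], who, "add feature"))
```

Because the parent ID is part of the hashed text, `on_root` and `on_other` differ even though they point to the same tree. This is why a rebased commit always gets a new ID.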

And don't get confused by the fact that the Pro Git book's very first characterization of Git (in section "Git Basics") is "Snapshots, Not Differences".

Git is complicated after all.


answered Oct 21 '22 21:10

Lutz Prechelt
