Some commands (e.g. `checkout`) appear to assume that a commit is a snapshot or state of the working tree. Others (e.g. `rebase`) appear to assume that a commit is a change: a kind of operator that can be applied to working trees.
So what is a Git commit, really?
Commits are snapshots, not diffs.
The term snapshot is used on the Git reference site as well. It replaces the term "revision". In other version control systems, changes to individual files are tracked and referred to as revisions, but Git tracks the entire workspace, so it uses the term snapshot to mark the difference.
When you commit, Git stores a snapshot of each file in full; it does not store diffs from the previous commit. As a repository grows, the object count grows with it, and storing every object as a loose file becomes inefficient.
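To make the "snapshot of the entire file" point concrete, here is a minimal Python sketch of how a blob ID is derived in Git's classic SHA-1 object format: the hash covers a short header plus the complete file content, not a diff. The helper name `git_blob_id` and the sample contents are made up for illustration, and repositories created with the SHA-256 object format would use a different hash.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content.

    The hashed bytes are "blob <size>\\0" followed by the full content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two versions of a file that differ by a single appended line.
v1 = b"line 1\n" * 1000
v2 = v1 + b"one more line\n"

# Each version gets its own ID computed from its entire content;
# nothing in the ID refers to the previous version or to a diff.
print(git_blob_id(v1))
print(git_blob_id(v2))

# The empty blob is a handy sanity check against `git hash-object`.
print(git_blob_id(b""))
```

If Git is installed, the output for any file should match what `git hash-object <file>` reports for the same bytes.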
The git commit command captures a snapshot of the project's currently staged changes. Committed snapshots can be thought of as “safe” versions of a project—Git will never change them unless you explicitly ask it to.
Short answer: both.
Medium answer: It depends.
Long answer: Git is a bit like quantum phenomena: Neither of the two views alone can explain all observations. Read on.
Internally, Git will use both representations, depending (conceptually) on which one it deems more efficient in terms of storage space and execution time for a given commit. The snapshot representation is the primary one.
From the user's point of view, however, it depends on what you do:
Indeed, some commands only make sense when you think about commits as snapshots of the working tree. This is most pronounced for `checkout`, but is also true for `stash` and at least halfway for `fetch` and `reset`.
For other commands, madness is the likely result when you try to think of commits in this manner. For those other commands, commits are clearly treated as changes: (`show`, `diff`), (`apply`, `cherry-pick`, `pull`), (`rebase`), and (`merge`, `cherry-pick`).
There is a side-effect of this duality that can shock Git newbies accustomed to other versioning systems: Git appears to not even commit itself to its commits.
Huh?
Assume you have created a branch X containing what you like to think of as your commits A and B. But `master` has progressed a little, so you `rebase` X onto `master`.
When you think of A and B as changes, but of `master` as a snapshot (hey, particles and waves in a single experiment!), this is not a problem: just apply the changes A and B to the snapshot `master`.
This thinking is so natural that you will barely notice that Git has now rewritten your commits A and B: they now have different snapshot content and hence a different SHA-1 ID.
In Git, the conceptual commit that you think of as a developer
is not a fixed-for-all-times kind of thing, but rather
some fluid object that changes as a result of working with your
repository.
In contrast, if you think of all three (A, B, and `master`) as snapshots, or of all three as changes, your brain will hurt and you will get nowhere.
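The rebase scenario above can be sketched with a toy model in Python. This is not how Git stores or replays commits: the dictionaries, the JSON-based ID, and the helper names (`make_commit`, `as_change`, `apply_change`, `rebase`) are all assumptions made for illustration, and conflicts are ignored entirely. The sketch only shows that a commit stored as a snapshot can be read as a change relative to its parent, and that replaying those changes on a new base yields new snapshots with new IDs.

```python
import hashlib
import json


def commit_id(snapshot, parent_id):
    """Toy ID: a hash over the full snapshot and the parent ID."""
    payload = json.dumps({"snapshot": snapshot, "parent": parent_id}, sort_keys=True)
    return hashlib.sha1(payload.encode()).hexdigest()


def make_commit(snapshot, parent=None):
    """Store a commit as a complete snapshot (path -> content) plus a parent."""
    parent_id = parent["id"] if parent else None
    return {"snapshot": dict(snapshot), "parent": parent,
            "id": commit_id(snapshot, parent_id)}


def as_change(commit):
    """View a commit as a change: what differs from its parent's snapshot."""
    base = commit["parent"]["snapshot"] if commit["parent"] else {}
    new = commit["snapshot"]
    changed = {path: text for path, text in new.items() if base.get(path) != text}
    deleted = [path for path in base if path not in new]
    return changed, deleted


def apply_change(change, onto):
    """Apply a change to the snapshot of `onto`, producing a new commit."""
    changed, deleted = change
    snapshot = dict(onto["snapshot"])
    snapshot.update(changed)
    for path in deleted:
        snapshot.pop(path, None)
    return make_commit(snapshot, parent=onto)


def rebase(commits, new_base):
    """Replay commits (oldest first) as changes on top of new_base."""
    tip, rewritten = new_base, []
    for commit in commits:
        tip = apply_change(as_change(commit), tip)
        rewritten.append(tip)
    return rewritten


root = make_commit({"README": "v1"})
a = make_commit({"README": "v1", "a.txt": "A"}, parent=root)
b = make_commit({"README": "v1", "a.txt": "A", "b.txt": "B"}, parent=a)
master = make_commit({"README": "v2"}, parent=root)  # master has moved on

a2, b2 = rebase([a, b], master)
print(as_change(a) == as_change(a2))  # same change ...
print(a["id"] == a2["id"])            # ... but a different commit
print(b2["snapshot"])
```

In this model the rewritten commits carry the same changes as A and B, yet their snapshots include the newer `README` from `master`, so their IDs differ from the originals.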
The above is a much-simplified description. In Git reality,
And don't get confused by the fact that the Pro Git book's very first characterization of Git (in section "Git Basics") is "Snapshots, Not Differences".
Git is complicated after all.