Is git push with several commits an atomic operation?
Specifically, is it atomic with respect to:
1. other git push operations to the same branch
2. git pull operations from the same branch
For case 1. it has to be. Otherwise my commits would interfere with someone else's commits, possibly creating an inconsistent or invalid state. Git prevents that by either forcing me to integrate someone else's changes first (if I lose the race) or forcing someone else to integrate my changes (if I win the race).
But what about 2.? If my repository looks like this:
      C---D---E master
     /
A---B origin/master
Is anyone doing a git pull while I am doing git push going to see either A---B or A---B---C---D---E, or can they also get anything in between, e.g. A---B---C---D?
The --atomic option guarantees that either all three references will be updated on the remote, or none of them will. Please note that git push --atomic is still somewhat experimental, and it is possible to experience partial updates if you try to push something unusual.
The answer is a very loud no: it's not atomic in any way. Individual files written in the work-tree are written one at a time, using OS-level write calls, which are not atomic.
Git is known to have atomic operations i.e. an environment issue will not cause git to commit half of the files into the repository and leave the rest.
Effectively, yes.
Note that you have zero control over what anyone else does with their repository. But while you're doing a git push to some other repository (such as one over on GitHub), what really happens is:
Your Git sends over any commits and/or other objects their Git requires, in order for your Git to make its create-or-update-or-delete request(s). A name can only name some actual object stored in a repository, so for you to ask them to set their master branch to commit a123456..., your Git must first ensure that they have commit a123456... .
Then, for each name you'd like them to update (or create or delete), your Git asks (regular push) or commands (git push --force and other operations that set the force flag) them to make the update. You send them names N and hashes new-hash, as a list of update (or create or delete) requests. Each request has one, or sometimes two as below, hashes given. (An all-zero hash means "delete".)
Your Git can send them a polite request, which their Git will obey if it's a new branch or tag, or if it's a delete request, or if it's a branch name update and the update is a fast-forward. (Besides these constraints, whoever controls their Git may set whatever additional constraints they like, but these are the defaults.)
Your Git can send a command with no conditions. By default, their Git will obey (but as before whoever controls their Git can set additional constraints).
Or, your Git can send a command, but with your own condition, of the form: I believe your name N represents hash ID old-H (for some name and hash, with old-H being all-zeros if you expect them to not have the name yet). Their Git will obey the command if their name N has hash old-H (and as before whoever controls their Git can set additional constraints).
This updating procedure occurs under a lock that their Git sets in their repository. This lock makes the update all-or-nothing, as far as your Git is concerned. For each name you send, the update either happens—is accepted, and now their name N represents the new-hash your Git asked / commanded—or doesn't and is rejected and the name has not changed.
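The request handling described above can be sketched as a toy model. This is not Git's implementation: ToyRemote, receive_objects, update, and the single-letter "hashes" are hypothetical names invented for illustration, and the ancestry walk assumes every commit has at most one parent (real history has merges).

```python
import threading

ZERO = "0" * 40  # all-zero hash: "delete" as a new value, "name absent" as an old value


class ToyRemote:
    """Toy stand-in for the receiving repository (an assumption, not Git's code)."""

    def __init__(self):
        self.parents = {}  # commit hash -> parent hash (None for a root commit)
        self.refs = {}     # name -> commit hash
        self._lock = threading.Lock()

    def receive_objects(self, commits):
        # Objects arrive first; no name changes yet.
        self.parents.update(commits)

    def _is_ancestor(self, old, new):
        # Walk parent links from `new`; True if we reach `old`.
        while new is not None:
            if new == old:
                return True
            new = self.parents.get(new)
        return False

    def update(self, name, new, force=False, expect=None):
        """Apply one request under the lock; True if accepted, False if rejected."""
        with self._lock:
            current = self.refs.get(name, ZERO)
            if expect is not None:
                # Command with the sender's own condition on the old hash.
                ok = current == expect
            elif force:
                # Unconditional command.
                ok = True
            else:
                # Polite request: new name, delete, or fast-forward.
                ok = (current == ZERO or new == ZERO
                      or self._is_ancestor(current, new))
            # A name can only point at an object the repository has.
            if not ok or (new != ZERO and new not in self.parents):
                return False
            if new == ZERO:
                self.refs.pop(name, None)
            else:
                self.refs[name] = new
            return True


remote = ToyRemote()
remote.receive_objects({"A": None, "B": "A"})
created = remote.update("master", "B")            # new name
remote.receive_objects({"C": "B", "D": "C", "E": "D"})
fast_forward = remote.update("master", "E")       # B is an ancestor of E
remote.receive_objects({"X": "B"})
polite = remote.update("master", "X")             # not a fast-forward
stale_lease = remote.update("master", "X", expect="B")  # name is at E, not B
good_lease = remote.update("master", "X", expect="E")
forced = remote.update("master", "E", force=True)
```

In this model each request is decided and applied while the lock is held, so any other caller sees the name either at its old hash or at its new one.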
When you (or anyone) run git pull you're really running git fetch followed by a second, purely local, Git command. The git fetch is similar to git push in that your Git calls up some other Git, but this time the data transfer goes the other way:
Your Git gets a listing, from their Git, of all their names and hash IDs. If there's an ongoing push, each pair—name and hash ID—is either from before a requested or commanded update, or from after: there is no in-between visible because their Git respects their own locks.
Then, using the names and hash IDs found in this step, your Git brings over new objects you want and don't have based on this listing.
At the end of this process, your Git doesn't touch any of your branch names, at least not by default (you can override this with refspec arguments). Instead, your Git updates your remote-tracking names, such as origin/master, to match their names. (Depending on how you run git fetch, you can constrain your Git to update only one or a few of your names, rather than all of them; if you're only going to update your origin/master, your Git can skip downloading new objects that are only reachable from their feature-X that would become your origin/feature-X.)
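The fetch steps above can be sketched the same way, again as a toy model rather than Git's actual code. ToyRemote, list_refs, push, and fetch are hypothetical names; the lock stands in for the ref lock on the remote side, and each commit is assumed to have at most one parent. A reader thread polls the listing while a push of C, D, and E is in flight, mirroring the A---B versus A---B---C---D---E question.

```python
import threading


class ToyRemote:
    """Toy stand-in for the repository being fetched from (an assumption)."""

    def __init__(self, objects, refs):
        self.objects = dict(objects)  # commit hash -> parent hash
        self.refs = dict(refs)        # name -> commit hash
        self._lock = threading.Lock()

    def push(self, objects, name, new):
        # Objects land first; the name moves once, under the lock.
        self.objects.update(objects)
        with self._lock:
            self.refs[name] = new

    def list_refs(self):
        # The listing is taken under the same lock as updates.
        with self._lock:
            return dict(self.refs)


def fetch(remote, local_objects, remote_tracking, remote_name="origin"):
    listing = remote.list_refs()              # names and hashes
    for name, tip in listing.items():
        commit = tip
        # Copy commits we lack, walking back until we hit one we have.
        while commit is not None and commit not in local_objects:
            local_objects[commit] = remote.objects[commit]
            commit = local_objects[commit]
        # Only remote-tracking names move; branch names are untouched.
        remote_tracking[f"{remote_name}/{name}"] = tip
    return listing


remote = ToyRemote({"A": None, "B": "A"}, {"master": "B"})
local_objects = {"A": None, "B": "A"}
branches = {"master": "B"}
tracking = {"origin/master": "B"}

seen = set()


def poll():
    # Repeatedly ask the remote where master is while the push runs.
    for _ in range(1000):
        seen.add(remote.list_refs()["master"])


reader = threading.Thread(target=poll)
reader.start()
remote.push({"C": "B", "D": "C", "E": "D"}, "master", "E")
reader.join()

fetch(remote, local_objects, tracking)
```

In this model the reader can observe master at B or at E, never at C or D, because the name is moved in a single step after all the objects are in place.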
The second, purely-local command can do whatever that second command (usually merge unless you select rebase) can do. This part is often not atomic: e.g., during a rebase, your rebase may stop in the middle with only some commits copied, forcing you to fix the conflict and run git rebase --continue. But this is all in your repository, which no one else shares. (Your Git also does its own lock/unlock operations across your own branch-name and other-name updates, in case you're running another Git command in the background, or via a cron job, or whatever.)
Your CI system will, in general, have its own Git repository that it updates by copying from whichever repository you designated as its upstream (e.g., a GitHub one). Your CI system will run git fetch to get its origin/master updated. How your CI system goes about checking out and building that origin/master commit is up to it.