So I've been reading lately about how to set up a git server, and upon finding that no specific daemon is needed at all (just an SSH server with a filesystem behind it), I started to look more into how git manages files under the hood.
The strategy of how each commit is represented inside the .git/objects folder and how everything fits together is quite clever, but it doesn't seem to be mentioned explicitly that this approach actually lets git handle concurrent access in a very simple way, without the need for a signaling server.
Nonetheless, there are situations in which safe concurrency cannot be guaranteed, which is basically when history is rewritten (forced pushes). In this case, is there any locking strategy used in the tree to avoid concurrency issues? Is there any more documentation on this topic out there?
(Something is said about this topic in this SO answer, but just very briefly.)
Git's data structures are immutable, except for refs (i.e. branches, tags, etc.), and "rewriting history" is not quite the right term; "creating an alternative history" is more accurate. The repo will have all the objects, new and old. Moreover, all the changes are created in a local repository; during a push the objects are just transferred. When you push, git sends all the objects first (and because objects are identified by their content, they are unique, so there is no concurrency problem). After all objects are sent, a reference is changed. It is just a tiny single file (refs/heads/<branchName>) to overwrite with a 40-character SHA-1 hash. As far as I know, git does an atomic compare-and-set change of that file: it reads the old ref value, creates a lock file, checks that the old value is unchanged, replaces it with the new SHA-1 and removes the lock. If that fails, the push fails and you need to retry (i.e. optimistic locking). You can find more details in the source code, in the update_ref function.
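To make the compare-and-set idea concrete, here is a minimal Python sketch of the steps described above. It is not git's actual code: the function name compare_and_set_ref and its parameters are made up for illustration, and it assumes the new value is written into the lock file, which is then renamed over the ref.

```python
import os


def compare_and_set_ref(ref_path, expected_old, new_value):
    """Hypothetical helper: update ref_path only if it still holds expected_old.

    Returns True on success, False if the lock is held or the ref has moved.
    """
    lock_path = ref_path + ".lock"
    try:
        # O_CREAT | O_EXCL makes creation fail if the lock file already exists,
        # so only one writer at a time can get past this point.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False  # another writer holds the lock

    committed = False
    try:
        with os.fdopen(fd, "w") as lock_file:
            # Read the current value while holding the lock.
            try:
                with open(ref_path) as ref_file:
                    current = ref_file.read().strip()
            except FileNotFoundError:
                current = None  # the ref does not exist yet

            if current != expected_old:
                return False  # somebody else moved the ref: caller must retry

            lock_file.write(new_value + "\n")

        # Renaming the lock file over the ref publishes the new value in one step.
        os.replace(lock_path, ref_path)
        committed = True
        return True
    finally:
        # On any failure path, drop the lock so later writers are not blocked.
        if not committed and os.path.exists(lock_path):
            os.unlink(lock_path)
```

A caller that gets False back would re-read the ref, redo its work on top of the new value, and try again, which is the optimistic part of the scheme.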
After a force-push some unreachable objects may be left behind (i.e. objects that are not referenced from any existing ref); these are garbage collected later.
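As a side note on why transferring objects is race-free: an object's name is derived from its content, so two writers storing the same content end up with the same file. A small Python sketch for blobs, assuming the default SHA-1 object format (the function name git_blob_id is just for illustration):

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID git assigns to a blob with the given content (SHA-1 repos)."""
    # Git hashes a short header ("blob <size>" plus a NUL byte) followed by the content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()
```

Identical content always yields the same ID, so it does not matter which of two concurrent pushes writes a given object first.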
Very clever and neat.
Various files are created where necessary to act as locks. Git creates a file called .git/index.lock to lock the index. git index-pack can create a .keep file to prevent a race condition. There may be more examples.
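For illustration, this is roughly how a file can act as a lock: creating it exclusively either succeeds or fails, with nothing in between. The sketch below is in Python and is only an approximation of the idea; the helper name lock_file is made up and this is not how git itself is written.

```python
import os
from contextlib import contextmanager


@contextmanager
def lock_file(path):
    """Hypothetical helper: hold an exclusive lock file for the duration of a block."""
    # Raises FileExistsError if another process already created the lock file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        yield fd
    finally:
        # Release the lock by closing and removing the file.
        os.close(fd)
        os.unlink(path)
```

A second process entering lock_file on the same path while the first is still inside the block gets FileExistsError, which is broadly the situation behind git's message that index.lock already exists.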