
Locking strategy of git to achieve concurrency?

So I've been reading lately about how to set up a git server, and upon finding that no specific daemon is needed at all (just an SSH server with a filesystem behind it), I started to look more into how git manages files under the hood.

The way each commit is represented inside the .git/objects folder and how everything fits together is quite clever, but it doesn't seem to be mentioned explicitly that this approach lets git achieve concurrency in a very simple way, without the need for a signaling server.

Nonetheless, there are situations in which concurrency cannot be guaranteed, basically whenever history is rewritten (forced pushes). In this case, is there any locking strategy used in the tree to avoid concurrency issues? Is there any more documentation on this topic out there?

(Something is said about this topic in this SO answer, but just very briefly.)

knocte asked Nov 13 '13 18:11


2 Answers

Git's data structures are immutable, except for refs (branches, tags, etc.), and "rewriting history" is not quite the right term; "creating an alternative history" is more accurate. The repository will contain all objects, new and old. Moreover, all changes are created in a local repository; during a push the objects are simply transferred. When you push, git sends all objects first (and because each object is identified by its content, objects are unique and there is no concurrency problem). After all objects are sent, the reference is updated. It is just a single tiny file (refs/heads/<branchName>) to overwrite with a 40-character SHA-1 hash. As far as I know, git does an atomic compare-and-set update of that file: it reads the old ref value, creates a lock file, checks that the old value is unchanged, replaces it with the new SHA-1, and removes the lock. If that fails, the push fails and you need to retry (i.e. optimistic locking). You can find more details in the source code; see the update_ref function.
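
The compare-and-set step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not git's actual code: the Python `update_ref` and `read_ref` helpers are hypothetical names, the lock is assumed to be a `<ref>.lock` file created exclusively, and the sketch publishes the new value by renaming the lock file over the ref rather than editing the ref in place.

```python
import os


def read_ref(ref_path):
    """Return the hash stored in a ref file, or None if the ref is missing."""
    try:
        with open(ref_path) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def update_ref(ref_path, expected_old, new_value):
    """Hypothetical compare-and-set on a ref file (a sketch, not git's code).

    Returns True if the ref was moved from expected_old to new_value,
    False if another writer holds the lock or the ref has changed.
    """
    lock_path = ref_path + ".lock"
    try:
        # O_CREAT | O_EXCL fails if the lock file already exists, so at
        # most one writer gets past this point.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False  # lock is held by someone else; the caller may retry

    committed = False
    try:
        with os.fdopen(fd, "w") as lock_file:
            # Compare: the ref must still hold the value the caller saw.
            if read_ref(ref_path) != expected_old:
                return False
            # Set: write the new value into the lock file.
            lock_file.write(new_value + "\n")
        # Publish by renaming the lock file over the ref.
        os.replace(lock_path, ref_path)
        committed = True
        return True
    finally:
        # On any failure path, remove our own lock file.
        if not committed and os.path.exists(lock_path):
            os.unlink(lock_path)
```

A caller whose `update_ref` returns False would re-read the ref, decide whether its change still applies, and try again, which is the optimistic-locking retry mentioned above.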

After a force-push, some unreachable objects can appear (i.e. objects that are not referenced from any existing ref); these objects are garbage collected later.

Very clever and neat.

kan answered Nov 15 '22 20:11


Various files are created where necessary to act as locks. Git creates a file called .git/index.lock to lock the index. git index-pack can create a .keep file to prevent a race condition. There may be more examples.
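
To illustrate the general idea of a file acting as a lock, here is a minimal sketch. It assumes the lock is taken by exclusively creating `<path>.lock` and released by deleting it; `hold_lock` and `LockHeld` are hypothetical names, and the empty marker file is a simplification rather than a description of what git stores in .git/index.lock.

```python
import os
from contextlib import contextmanager


class LockHeld(Exception):
    """Raised when the lock file already exists (hypothetical helper type)."""


@contextmanager
def hold_lock(path):
    """Hold '<path>.lock' for the duration of a with-block (sketch only)."""
    lock_path = path + ".lock"
    try:
        # Exclusive creation: fails if any other process created the file first.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        raise LockHeld(lock_path) from None
    os.close(fd)
    try:
        yield lock_path
    finally:
        # Release the lock by removing the file.
        os.unlink(lock_path)
```

With this sketch, a second `with hold_lock(".git/index")` attempted while the first is still active raises `LockHeld` instead of blocking. One consequence of the pattern is that a stale lock file left behind by a crashed process has to be removed by hand.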

Robin Green answered Nov 15 '22 20:11