 

Git (LFS): what is locking support? And should I enable it?

Tags:

git

git-lfs

"New" Git Comment:

Just today I ran across the following comment from Git for the first time (or at least the first time I noticed it):

Mikes-Mac$ git push
Locking support detected on remote "origin". Consider enabling it with:
  $ git config 'lfs.https://github.com/<my_repo>.git/info/lfs.locksverify' true
Everything up-to-date
Mikes-Mac$ 

What is this locking support? Is this some sort of mutex locking for LFS (Large File Storage)? If so, isn't it absolutely essential for anything in Git to work? (Minimally, how else could the "ordering" of the log history be established? Worst case, couldn't I end up with a binary file corrupted by simultaneous writes?)

My Actions

I didn't do anything differently to this repository recently, nor have I done anything differently with this repository compared to any others that I've established with LFS.

I'm therefore assuming this is a new comment being shown to "the world" to let us know about a new feature.

No Obvious Documentation

However, neither a Google search nor a quick look through the Git LFS documentation turned up anything explaining this. So, I'm left wondering:

  • What is this locking?
    • Is it mutex? If so, how could my repo even function without it?
    • Is this limited to just LFS? How is it different from normal git file locking?
  • What are the pros and cons of adding locking support for LFS?
Asked by Mike Williamson, Mar 04 '17



2 Answers

Locking support of Git LFS is documented here https://github.com/git-lfs/git-lfs/wiki/File-Locking.

Git LFS v2.0.0 includes an early release of File Locking. File Locking lets developers lock files they are updating to prevent other users from updating them at the same time. Concurrent edits in Git repositories will lead to merge conflicts, which are very difficult to resolve in large binary files.

Once file patterns in .gitattributes are marked as lockable, Git LFS will automatically make matching files read-only on the local file system. This prevents users from accidentally editing a file without locking it first.
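
For example, a pattern can be marked lockable when you track it (the *.psd pattern below is just an illustration):

$ git lfs track "*.psd" --lockable

which writes a line like this to .gitattributes:

*.psd filter=lfs diff=lfs merge=lfs -text lockable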

Git LFS will verify that you're not modifying a file locked by another user when pushing. Since File Locking is an early release, and few LFS servers implement the API, Git LFS won't halt your push if it cannot verify locked files. You'll see a message like this:

$ git lfs push origin master --all
Remote "origin" does not support the LFS locking API. Consider disabling it with:
  $ git config 'lfs.http://git-server.com/user/test.locksverify' false
Git LFS: (0 of 0 files, 7 skipped) 0 B / 0 B, 879.11 KB skipped
 
$ git lfs push origin master --all
Locking support detected on remote "origin". Consider enabling it with:
  $ git config 'lfs.http://git-server.com/user/repo.locksverify' true
Git LFS: (0 of 0 files, 7 skipped) 0 B / 0 B, 879.11 KB skipped

So in some sense you may consider it an advisory mutex, because:

  • If you don't "lock" the file, you can't edit it
  • Once you "lock" the file with git lfs lock, you can edit it, and the repository server will recognize that you are editing it
  • The server will not accept commits from other people that change the files you have locked.

It is mainly intended to help teams managing large files avoid merge conflicts.
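
A minimal locking workflow looks roughly like this, assuming the remote LFS server supports the locking API and images/foo.psd is a lockable path in your repository:

$ git lfs lock images/foo.psd       # claim the lock; the file becomes writable locally
$ git lfs locks                     # list the locks currently held on the server
$ git lfs unlock images/foo.psd     # release the lock once your change is pushed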

Answered by kennytm, Oct 06 '22


The accepted answer fails to answer a secondary aspect of the question, specifically:

If so, isn't it absolutely essential for anything in Git to work? (Minimally, how else could the "ordering" of the log history be established? Worst case, couldn't I end up with a binary file corrupted by simultaneous writes?)

So I'll address just that portion.

There are no simultaneous writes in git, and the ordering of history is part of the answer why!

When you push a commit to a remote Git repository, the remote verifies that the commit you are pushing is a descendant of the commit it already has as the head of that branch.

If not, the commit is rejected.

If it is, then git will send over a set of blobs (file data), tree objects, and commits.

It then updates the branch head to point to your new commit.

If the head has changed in the meantime, it will once again reject your push.

If rejected, you have to pull in the newer changes from the remote repository, and either merge them with your changes, or rebase your changes on top of the new ones (e.g. git pull -r).

Either way, you create a new local commit that is a descendant of what the repository has.

You can then push this new commit to the repository. It is possible for this new commit then to be rejected, forcing you to repeat the process.
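
Sketched as commands (the remote and branch names are only examples, and the rejection message is abbreviated):

$ git push origin master
 ! [rejected]        master -> master (non-fast-forward)
$ git pull --rebase origin master    # replay your local commits on top of the remote's new head
$ git push origin master             # succeeds, unless the head moved again in the meantime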

Files are never overwritten. To Git, the file "mybigfile.mpg" is just a blob with a unique ID based on a SHA-1 hash of the file contents. If you change the file, that's a new blob with a new ID.

The filename is just an entry in a tree object. Tree objects also have IDs based on a hash of their contents.

If you rename a file (or add one, remove one, etc.), that's a new tree object, with its own ID.

Because these are part of the history (a commit includes the id of the top tree being committed, as well as IDs of its parents), these objects (blobs, trees, commits, signed tags) are only ever added to the repository, and never modified.

Git history is always ordered, because it is a linked list, with links pointing to the parents: zero parents for the initial commit, two or more for a merge commit, and one otherwise.
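
You can see both properties with plumbing commands (the object IDs below are placeholders):

$ git hash-object mybigfile.mpg      # a blob's ID is computed from its contents
3b18e5...
$ git cat-file -p HEAD               # a commit records its tree and its parent commit(s)
tree 9c1b6e...
parent 4a7d8f...
...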

And they don't require any explicit locking, because git checks for conflicts before making a change.

Only the file that contains the ID of the head commit on a branch needs to be locked, and then only for a brief moment between checking for changes and updating it.
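
On the server that is just a tiny ref file, for example (assuming the ref is loose rather than stored in packed-refs; the hash is a placeholder):

$ cat .git/refs/heads/master         # in a bare repository this is simply refs/heads/master
4a7d8f...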

Locks in git-lfs address a very different problem. Binary assets usually cannot be merged, and changing them often involves a large amount of work.

This is especially true of large assets.

If two developers start making changes to the same file, one set of changes will have to be discarded, and then recreated starting with the other change as a base.

git-lfs locking prevents this from happening by accident. If you encounter a lock, you either wait until later to make your changes, or you go talk to the person who has that lock.

Either they can make the requested change, or they can release the lock and allow you to make your change on top of their changes so far. Then when you're done, you can push your change and release the lock, allowing them to continue with your change.

Thus, it allows changes to specific files (the entire development process, not just the file write) to be serialized, rather than following the parallel-then-merge paradigm used for textual source files.

Answered by Bob Kerns, Oct 06 '22