
Why does 'git status' ignore the .gitattributes clean filter?

I have a .gitattributes clean filter to remove all comments from a file before committing.

$ cat .git/config
[filter "cleancomments"]
    clean = "grep -v '^#'"

$ cat .gitattributes
*   filter=cleancomments

And I have a file 'test' with the following content (committed in the repository):

This is a file with random content

Now I make a modification to 'test' and add comments:

This is a file with random content
# and some comments
# like this

git status now tells me:

modified:   test

but git diff is empty (as it should be).

It is not completely clear to me why git status does not use the filter to decide if a file has been modified or not, but I assume this is how it is implemented.

What is really mysterious to me is the following:

If I do this:

git add test

Then suddenly the file 'test' is no longer marked as modified and it does not appear in the git index. Why is this?

Omar Kohl asked Nov 06 '13 09:11


1 Answer

git add adds the file to the index,[1] but runs it through any required filters first.

The index contains the file's on-disk name and "true name" (its git hash as a "blob", computed from the filtered content) along with cached stat values for the work-tree file, plus some other bits and bobs as needed. Once add-ed, git status can tell from the index data that the file is now "up to date" in the index, and the index itself is up to date in the repository because the blob's hash matches the hash recorded for that file in the HEAD commit.
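The relationship between these hashes can be checked directly. The following is a minimal sketch, assuming a throwaway repository created with mktemp and a placeholder identity (neither is from the question); the filter and file contents are the ones from the question. It relies on git hash-object applying the attribute-selected clean filter to a path unless --no-filters is given.

```shell
#!/bin/sh
# Sketch: compare the blob hash stored in HEAD and in the index with the hash
# of the work-tree file, computed with and without the clean filter.
set -eu

# Throwaway repository; the location and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Filter and attributes as in the question.
git config filter.cleancomments.clean "grep -v '^#'"
echo '*   filter=cleancomments' > .gitattributes
echo 'This is a file with random content' > test
git add .gitattributes test
git commit -q -m 'Add test'

# Append comment lines, which the clean filter strips again.
printf '# and some comments\n# like this\n' >> test

head_blob=$(git rev-parse HEAD:test)              # blob recorded in the commit
index_blob=$(git rev-parse :test)                 # blob recorded in the index
filtered=$(git hash-object test)                  # hash after the clean filter
unfiltered=$(git hash-object --no-filters test)   # hash of the raw file bytes

echo "HEAD:        $head_blob"
echo "index:       $index_blob"
echo "filtered:    $filtered"
echo "unfiltered:  $unfiltered"
```

If the filter behaves as described, the first three hashes agree and only the unfiltered one differs, which is why the cleaned file counts as unchanged once its content is actually hashed.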

If you modify the file some more, though, some key stat data changes, making git think the index is out of date, and git status will once again think it needs to be git add-ed.[2]
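The whole sequence from the question can be replayed in a scratch repository. This is a sketch under the same assumptions as the question's setup; the temporary directory, the placeholder identity, and the variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: replay the question's steps and capture what git reports at each one.
set -eu

# Throwaway repository; the location and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Filter and attributes as in the question.
git config filter.cleancomments.clean "grep -v '^#'"
echo '*   filter=cleancomments' > .gitattributes
echo 'This is a file with random content' > test
git add .gitattributes test
git commit -q -m 'Add test'

# Change only lines that the clean filter removes; the file size changes.
printf '# and some comments\n# like this\n' >> test

status_before=$(git status --porcelain)    # stat data no longer matches the index
diff_out=$(git diff)                       # compares filtered content

# Re-adding runs the filter, hashes the result, and refreshes the stat data.
git add test
status_after=$(git status --porcelain)
staged=$(git diff --cached --name-only)    # index compared with HEAD

printf 'status before add: [%s]\n' "$status_before"
printf 'diff before add:   [%s]\n' "$diff_out"
printf 'status after add:  [%s]\n' "$status_after"
printf 'staged after add:  [%s]\n' "$staged"
```

The last two values are the interesting ones: git add stores the same blob that HEAD already has, so nothing ends up staged, while the refreshed stat data lets git status treat the file as clean.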

The general idea here seems to be that git status does not write anything (even the index). It might be nice if git update-index --refresh would update the work-dir/cleaned-entry pairing, but it does not seem to.
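The remark about git update-index --refresh can be tried out in the same kind of scratch repository. The sketch below makes the same assumptions as before (temporary directory, placeholder identity, illustrative variable names) and only records what git status reports after the refresh.

```shell
#!/bin/sh
# Sketch: check whether "git update-index --refresh" clears the modified state
# of a file whose only changes are removed by the clean filter.
set -eu

# Throwaway repository; the location and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

# Filter and attributes as in the question.
git config filter.cleancomments.clean "grep -v '^#'"
echo '*   filter=cleancomments' > .gitattributes
echo 'This is a file with random content' > test
git add .gitattributes test
git commit -q -m 'Add test'

printf '# and some comments\n# like this\n' >> test

# --refresh exits non-zero when it finds entries it considers out of date,
# so its exit status is ignored here.
git update-index --refresh >/dev/null 2>&1 || true

status_after_refresh=$(git status --porcelain)
printf 'status after refresh: [%s]\n' "$status_after_refresh"
```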


[1] More precisely, git add computes the hash (and hence the "true name" in the repo) and then adds the object to the repo if and only if it's not already present. The hash value is now known and can be stored in the index as needed. The hash value is unknown until after doing the filtering and hashing, i.e., git status doesn't know it.

[2] There are more subtleties here if you use things like --assume-unchanged and/or core.ignorestat.

torek answered Sep 20 '22 12:09