Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does Git treat this text file as a binary file?

Tags:

git

binary

It simply means that when git inspects the actual content of the file (it doesn't know that any given extension is not a binary file - you can use the attributes file if you want to tell it explicitly - see the man pages).

Having inspected the file's contents it has seen stuff that isn't in basic ascii characters. Being UTF16 I expect that it will have 'funny' characters so it thinks it's binary.

There are ways of telling git if you have internationalisation (i18n) or extended character formats for the file. I'm not sufficiently up on the exact method for setting that - you may need to RT[Full]M ;-)

Edit: a quick search of SO found can-i-make-git-recognize-a-utf-16-file-as-text which should give you a few clues.


If you have not set the type of a file, Git tries to determine it automatically and a file with really long lines and maybe some wide characters (e.g. Unicode) is treated as binary. With the .gitattributes file you can define how Git interpretes the file. Setting the diff attribute manually lets Git interprete the file content as text and will do an usual diff.

Just add a .gitattributes to your repository root folder and set the diff attribute to the paths or files. Here's an example:

src/Acme/DemoBundle/Resources/public/js/i18n/* diff
doc/Help/NothingToSay.yml                      diff
*.css                                          diff

If you want to check if there are attributes set on a file, you can do that with the help of git check-attr

git check-attr --all -- src/my_file.txt

Another nice reference about Git attributes could be found here.


I was having this issue where Git GUI and SourceTree was treating Java/JS files as binary and thus wouldn’t show a diff.

Creating a file named attributes in .git/info with following content solved the problem:

*.java diff
*.js diff
*.pl diff
*.txt diff
*.ts diff
*.html diff
*.sh diff
*.xml diff

If you would like this to apply to all repositories, then you can add the file attributes in $HOME/.config/git/attributes.


Git will even determine that it is binary if you have one super-long line in your text file. I broke up a long String, turning it into several source code lines, and suddenly the file went from being 'binary' to a text file that I could see (in SmartGit).

So don't keep typing too far to the right without hitting 'Enter' in your editor - otherwise later on Git will think you have created a binary file.


I had this same problem after editing one of my files in a new editor. Turns out the new editor used a different encoding (Unicode) than my old editor (UTF-8). So I simply told my new editor to save my files with UTF-8 and then git showed my changes properly again and didn't see it as a binary file.

I think the problem was simply that git doesn't know how to compare files of different encoding types. So the encoding type that you use really doesn't matter, as long as it remains consistent.

I didn't test it, but I'm sure if I would have just committed my file with the new Unicode encoding, the next time I made changes to that file it would have shown the changes properly and not detected it as binary, since then it would have been comparing two Unicode encoded files, and not a UTF-8 file to a Unicode file.

You can use an app like Notepad++ to easily see and change the encoding type of a text file; Open the file in Notepad++ and use the Encoding menu in the toolbar.