
Is git good with binary files? [closed]

Tags:

git



Out of the box, git can easily add binary files to its index and store them efficiently, unless you frequently update large, incompressible files. Every new version of such a file is stored as another full-size object, so the repository grows quickly.
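
To illustrate the storage point, here is a minimal sketch of why incompressible files are the costly case. Git compresses object contents with zlib, so the example simply compares how well zlib does on repetitive data versus random bytes. The sample data and the helper name compressed_ratio are made up for illustration; this is not how git is implemented internally.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Highly repetitive content, standing in for text-like or uncompressed data.
repetitive = b"0123456789abcdef" * 4096  # 64 KiB

# Random bytes, standing in for already-compressed formats (JPEG, ZIP, video).
random_like = os.urandom(64 * 1024)  # 64 KiB

print(f"repetitive:  {compressed_ratio(repetitive):.3f}")
print(f"random-like: {compressed_ratio(random_like):.3f}")
```

The repetitive input shrinks to a small fraction of its size, while the random input does not shrink at all. A large file of the second kind costs roughly its full size for every committed version.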

The problems begin when git needs to generate diffs and merges: git cannot generate meaningful diffs for binary files, or merge them in any way that makes sense. So every merge, rebase or cherry-pick in which both sides changed a binary file will require you to resolve the conflict on that file manually, by picking one version or the other.

You need to decide whether changes to binary files are rare enough that you can live with the extra manual work they cause in the normal git workflow of merges, rebases and cherry-picks.
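
For context on the diff side: when no attribute says otherwise, git guesses whether a file is binary by looking for a NUL byte near the start of its content. The sketch below imitates that heuristic. The function name looks_binary is made up, and the 8000-byte window is an assumption that mirrors the constant in git's source.

```python
# Assumption: mirrors the size of the window git's heuristic inspects.
FIRST_FEW_BYTES = 8000


def looks_binary(data: bytes) -> bool:
    """Return True if a NUL byte occurs in the first FIRST_FEW_BYTES bytes."""
    return b"\0" in data[:FIRST_FEW_BYTES]


print(looks_binary(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))  # PNG header bytes
print(looks_binary(b"plain text\n"))
```

Files that the guess gets wrong can be marked explicitly in .gitattributes, for example with the built-in binary attribute.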


In addition to the other answers:

  • You can send a diff of a binary file using the so-called binary diff format (produced by git diff --binary). It is not human-readable, and it can only be applied if you have the exact preimage in your repository, i.e. without any fuzz.
    An example:

    diff --git a/gitweb/git-favicon.png b/gitweb/git-favicon.png
    index de637c0608090162a6ce6b51d5f9bfe512cf8bcf..aae35a70e70351fe6dcb3e905e2e388cf0cb0ac3 100
    GIT binary patch
    delta 85
    zcmZ3&SUf?+pEJNG#Pt9J149GD|NsBH{?u>)*{Yr{jv*Y^lOtGJcy4sCvGS>LGzvuT
    nGSco!%*slUXkjQ0+{(x>@rZKt$^5c~Kn)C@u6{1-oD!M<s|Fj6
    
    delta 135
    zcmXS3!Z<;to+rR3#Pt9J149GDe=s<ftM(tr<t*@sEM{Qf76xHPhFNnYfP!|OE{-7;
    zjI0MY3OYE5upapO?DR{I1pyyR7cx(jY7y^{FfMCvb5IaiQM`NJfeQjFwttKJyJNq@
    hveI=@x=fAo=hV3$-MIWu9%vGSr>mdKI;RB2CICA_GnfDX
    
  • You can use the textconv setting of a diff driver, assigned to files through gitattributes, to have git diff show a human-readable diff for binary files, or for parts of binary files. For example, for *.jpg files it can be the difference in EXIF information; for PDF files it can be the difference between their text representations (pdftotext or something like that).
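
    The textconv idea from the last point can be sketched with a small converter script. Git calls the configured program with one argument, the name of the file to convert, and diffs whatever the program writes to stdout. The script below prints a hex dump. The file name hexdump_textconv.py and the driver name hexdump are made up for this example.

```python
import sys


def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as lines of: offset, hex column, printable-ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(width * 3 - 1)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part}  {text_part}")
    return "\n".join(lines)


# Only act when invoked with exactly one file argument, as git does for textconv.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as fh:
        print(hexdump(fh.read()))
```

    To try it, add the line "*.bin diff=hexdump" to .gitattributes and run: git config diff.hexdump.textconv "python3 /path/to/hexdump_textconv.py". This only changes what git diff displays. Merging those files still needs manual resolution.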

HTH.


If you've got really large binary files, you can use git-annex to store the data outside of the repository. Check out: http://git-annex.branchable.com/


Well, git is good with binaries, but it won't handle them like text files. Think about what merging binary files would even mean: a diff on a JPEG will never tell you anything useful. Git works very well with text files and is probably about as bad as every other solution with binary files!


If you want a solution for versioning binary files, you might want to consider git-lfs, which keeps a lightweight pointer to your file in the repository instead of the file itself.

This means that when you clone your repo, it doesn't download all the versions of the file, only the one that is checked out.

Here's a nice tutorial of how to use it
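
To show what "lightweight pointer" means, here is a rough sketch that builds the text of a git-lfs pointer file by hand. The three fields (version, oid, size) follow the git-lfs pointer format. The function name lfs_pointer and the 10 MiB payload are made up for illustration. In real use, git-lfs writes these files itself once a pattern is tracked with git lfs track.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build the text of a git-lfs pointer file for the given content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


payload = bytes(10 * 1024 * 1024)  # hypothetical 10 MiB binary asset
pointer = lfs_pointer(payload)

print(pointer)
print(f"pointer: {len(pointer)} bytes, payload: {len(payload)} bytes")
```

Only the small pointer text is committed to the repository history. The payload lives in the LFS store and is fetched only for the versions you actually check out.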