
Display text diffs for binary files on GitHub

I'm trying to use Git and GitHub to sync a number of app configuration files. These are XML or plist files stored in a binary format. For example, a Keyboard Maestro .kmsync file.

I can open these files via a text editor to see an XML format.

But when I view these file diffs in a GitHub Pull Request, commit view, etc. I see a useless binary diff with no visible changes:

Showing with 0 additions and 0 deletions.
BIN +17 Bytes (100%)
Binary file not shown.

I can get a text-based diff to display locally in Git via a .gitattributes file. However, it appears that GitHub doesn't respect these modifications:

GitHub doesn't use .gitattributes files for choosing which files to show in a diff, so it's not possible to get around this that way. [source]

I want to see the text-based changes and line diffs when I view these files on GitHub in my commits and Pull Requests.

For example, the GitHub PR here. Feel free to fork and experiment:
https://github.com/pkamb/so/pull/1

How can I convince the web view of a GitHub repo to use text-based diffing for certain "binary" files?


I cannot find an existing question for my specific ask (displaying a non-binary diff on GitHub).

The following questions ask about this same behavior, but for local Git (not GitHub).

  • Override git's choice of binary file to text
  • How would you put an AppleScript script under version control?

My question is the opposite of this question, which seeks to display text files as binary files on GitHub:

  • Make github use .gitattributes "binary" attribute
pkamb asked Oct 10 '20 19:10




1 Answer

There isn't a way to force GitHub to display these files as text, because they are not text. When GitHub renders files as part of an HTML page, they must be in some encoding, and the only reasonable choice of encoding these days is UTF-8. These files cannot be displayed as-is as UTF-8 because they contain byte sequences that are not valid in UTF-8, in addition to control characters, which generally cannot be rendered well in a web page.

It is possible to convert these files to text for diffing: assign them a diff driver with the diff attribute in a .gitattributes file, then set the diff.*.textconv option for that driver in your Git config. This works great on your machine, but it won't work on GitHub. First of all, GitHub doesn't have your tool for rendering files, and secondly, GitHub doesn't support external programs for rendering files in general, mostly for security reasons. Some common formats are supported, but this is not one of them.

Also note that the program to be used is stored in the Git configuration and not in the .gitattributes file; this is intentional, since shipping a list of programs to execute in the repository is a security problem. Therefore, GitHub can't possibly even know the program you'd be using here.
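
For the local setup described above, here is a minimal sketch of a textconv helper. It assumes the files are ordinary binary plists containing only standard plist types, which the question suggests but which I haven't verified for .kmsync. The script name plist2xml.py, the driver name "plist", and the path in the config line are all hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical textconv helper: print a plist file as XML text.

Assumed local setup (driver name and script path are illustrative):
  .gitattributes:  *.kmsync diff=plist
  git config diff.plist.textconv "python3 /path/to/plist2xml.py"
"""
import plistlib
import sys


def plist_to_xml(data: bytes) -> str:
    """Parse plist bytes (binary or XML) and return them as XML text."""
    obj = plistlib.loads(data)  # the plist format is auto-detected
    # sort_keys keeps the output stable so diffs only show real changes
    xml = plistlib.dumps(obj, fmt=plistlib.FMT_XML, sort_keys=True)
    return xml.decode("utf-8")


if __name__ == "__main__" and len(sys.argv) == 2:
    # Git runs the textconv command with one argument: the path of a
    # temporary file holding the blob's contents. The converted text
    # goes to standard output.
    with open(sys.argv[1], "rb") as fh:
        sys.stdout.write(plist_to_xml(fh.read()))
```

Only the *.kmsync diff=plist line travels with the repository. The textconv command lives in each clone's own Git config, so every collaborator has to set it up themselves, and GitHub never sees it.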

If your kmsync files have a plain text equivalent that you can compile into the binary format, then you can store that format in the repository and build it as part of a build step. That will be diffable and will still provide the binary formats that you can use for your project. This is no different than compiling code into binaries or plain text into PDFs.
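
A build step like that could look roughly like the sketch below. It assumes the text sources are XML plists and that the application accepts a binary plist produced by Python's plistlib. Neither is confirmed for Keyboard Maestro, and the directory layout and function names are made up for illustration.

```python
import plistlib
from pathlib import Path


def build_binary_plist(src: Path, dest: Path) -> None:
    """Read a text (XML) plist and write the same data as a binary plist."""
    with src.open("rb") as fh:
        obj = plistlib.load(fh)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with dest.open("wb") as fh:
        plistlib.dump(obj, fh, fmt=plistlib.FMT_BINARY)


def build_all(src_dir: Path, out_dir: Path, out_suffix: str = ".kmsync") -> list:
    """Convert every *.xml file in src_dir; return the paths written."""
    built = []
    for src in sorted(src_dir.glob("*.xml")):
        dest = out_dir / (src.stem + out_suffix)
        build_binary_plist(src, dest)
        built.append(dest)
    return built
```

With this layout you would commit only the XML sources (for example, a hypothetical config-src/ directory), which GitHub can diff line by line, and keep the generated binary files out of the repository.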

bk2204 answered Sep 21 '22 05:09