Why does Mercurial think my SQL files are binary?

I just scripted out my SQL Server stored procs, table definitions, etc. using SQL Server Management Studio, and tried to add them to my Mercurial source control repository. They got added just fine, but now when I change and diff them, Mercurial calls them "binary files" and doesn't give me a proper unified diff.

I thought the encoding might be a problem, so I tried regenerating the scripts and specifying ANSI for the text file output, but I get the same behavior. I can view them just fine in Notepad without any odd-looking characters showing up. Why does Mercurial think these files are binary?

Otherwise, if someone can recommend a good tool for scripting out a SQL Server database that might not cause this issue, that might work, too.

Brian Sullivan asked Mar 02 '10 20:03



4 Answers

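I've run into this problem because SQL Server Management Studio saves the files as Unicode, which here means UTF-16. The first two bytes (most of the time) of a Unicode text file are a byte order mark (BOM) that identifies the encoding. Most newer text editors (e.g. Notepad) handle this transparently, so the file looks like ordinary text.

Those first two bytes are the telltale sign of your problem. They are FF FE in hex, and may show up as ÿþ in a tool that doesn't understand the encoding. In UTF-16, every plain ASCII character is also stored with an extra zero byte, and those NUL bytes are what make the file look binary.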
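To check whether a script really starts with a BOM, you can look at its first few bytes. The following is a minimal sketch, not anything Mercurial itself runs; the helper name `detect_bom` and the file name in the usage comment are made up for illustration.

```python
import codecs

# Longer BOMs are listed first: the UTF-32 LE BOM starts with the same
# FF FE bytes as the UTF-16 LE BOM, so it has to be tested before it.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


# Hypothetical usage on a scripted-out file:
#   with open("procs.sql", "rb") as f:
#       print(detect_bom(f.read(4)))
```

A result of `utf-16-le` would match the FF FE bytes described above.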

The "Save" button in the Save dialog has a drop-down pick list. Choose "Save with Encoding..." and select "US-ASCII - Codepage 20127". I believe this setting is sticky and will remain for future saves.

Darryl Peterson answered Oct 20 '22 21:10


According to the docs, a file is considered binary iff it contains NUL bytes. SQL files shouldn't have any, so I would check that first (try looking in a hex editor). I assume you know you can force diff to treat the file as text (hg diff -a, or --text).
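If a hex editor isn't handy, the NUL-byte rule is easy to approximate in a few lines. This is only a sketch of the rule as described above, not Mercurial's actual code, and `looks_binary` is a name invented here.

```python
def looks_binary(data: bytes) -> bool:
    """Approximate the rule above: any NUL byte means 'binary'."""
    return b"\0" in data


# The same SQL text in two encodings.
ascii_sql = "SELECT 1;\n".encode("ascii")
utf16_sql = "SELECT 1;\n".encode("utf-16")  # BOM plus two bytes per character

print(looks_binary(ascii_sql))
print(looks_binary(utf16_sql))
```

The UTF-16 version trips the check because each ASCII character is stored as the character byte plus a zero byte.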

Matthew Flaschen answered Oct 20 '22 21:10


Andrew is right; it's a NUL byte somewhere (my guess would be a Byte Order Mark at the start, inserted by a rude editor tool). Don't worry about it, though: unlike SVN or CVS, Mercurial doesn't handle binary vs. text differently at all. It displays them differently when you do 'hg log', but they're not handled at all differently.

Upcoming Mercurial releases special-case BOMs so that they don't trigger the "user probably doesn't want to see a diff of this on console" behavior.

Ry4an Brase answered Oct 20 '22 21:10


I ran into this when editing a file of stored procedures from SQL Server on Linux and using git. Git thought it was a binary file because the file from SQL Server was UTF-16, and therefore contained NULs. My fix for this was Emacs, which lets you change the encoding to UTF-8.
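If you have many files, or would rather not open each one in an editor, the same re-encoding can be scripted. This is a rough sketch that assumes the input really is UTF-16; the function name `utf16_to_utf8` and the paths in the usage comment are placeholders.

```python
from pathlib import Path


def utf16_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a UTF-16 text file as UTF-8 (without a BOM)."""
    raw = src.read_bytes()
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    text = raw.decode("utf-16")
    # Working in bytes keeps the original line endings untouched.
    dst.write_bytes(text.encode("utf-8"))


# Hypothetical usage:
#   utf16_to_utf8(Path("procs.sql"), Path("procs.utf8.sql"))
```

Characters outside ASCII survive the conversion, which is the advantage over saving as US-ASCII.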

themis answered Oct 20 '22 21:10