Possible Duplicate:
Why does Mercurial think my SQL files are binary?
I generated a complete set of scripts for the stored procedures in a database. When I created a Mercurial repository and added these files, they were all added as binary. I still get the benefits of versioning, but I lose much of the efficiency, diffing, and so on that come with text files. I have verified that these files really are just text.
Why is it doing this?
What can I do to avoid it?
Is there a way to get Hg to change its mind about these files?
Here is a snippet of the changeset log:
496.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFindCustomerByMatchCode.StoredProcedure.sql has changed
497.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFindUnreconcilableChecks.StoredProcedure.sql has changed
498.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFixBadLabelSelected.StoredProcedure.sql has changed
499.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFixCCOPL.StoredProcedure.sql has changed
500.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFixCCOrderMoneyError.StoredProcedure.sql has changed
Thanks in advance for your help Jim
In keeping with Mercurial's views on binary files, it does not actually track file types, which means that there is no way for a user to mark a file as binary or not binary.
As tonfa and Rudi mentioned, Mercurial decides whether a file is binary by checking for a NUL byte anywhere in the file. In UTF-16 and UTF-32 files a NUL byte is pretty much guaranteed, because every ASCII-range character is stored with at least one zero byte.
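To see why, here is a minimal Python sketch of that check. The function name looks_binary_to_hg is made up for illustration and only mimics the NUL-byte heuristic described above; it is not Mercurial's actual code.

```python
def looks_binary_to_hg(data: bytes) -> bool:
    """Mimic the heuristic described above: any NUL byte means 'binary'.

    Hypothetical helper for illustration, not Mercurial's own function.
    """
    return b"\0" in data


sql = "SELECT 1;"
utf8_bytes = sql.encode("utf-8")    # one byte per ASCII character, no NULs
utf16_bytes = sql.encode("utf-16")  # BOM plus two bytes per ASCII character

print(looks_binary_to_hg(utf8_bytes))   # False
print(looks_binary_to_hg(utf16_bytes))  # True
```

The same text is "text" or "binary" depending only on how it was encoded on disk.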
To "fix" this, you would have to ensure that the files are encoded with UTF-8 instead of UTF-16. Ideally, your database would have a setting for Unicode encoding when doing the export. If that's not the case, another option would be to write a precommit hook to do it (see How to convert a file to UTF-8 in Python for a start), but you would have to be very careful about which files you were converting.
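As a rough sketch of the conversion step such a hook would need, something like the following could work. The function name convert_utf16_to_utf8 is hypothetical, and the sketch assumes the files carry a UTF-16 byte order mark (BOM); files without one are deliberately left alone, which is one way of being careful about what gets converted. Wiring it into an actual Mercurial hook is not shown.

```python
import codecs
from pathlib import Path

UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def convert_utf16_to_utf8(path: Path) -> bool:
    """Rewrite `path` as UTF-8 if it starts with a UTF-16 BOM.

    Returns True if the file was converted, False if it was left alone.
    """
    raw = path.read_bytes()
    # The UTF-32 little-endian BOM (FF FE 00 00) begins with the UTF-16
    # little-endian BOM (FF FE), so rule it out first.
    if raw.startswith(codecs.BOM_UTF32_LE):
        return False
    if not raw.startswith(UTF16_BOMS):
        return False  # not UTF-16, or no BOM: do not touch it
    text = raw.decode("utf-16")  # the BOM selects the byte order and is dropped
    # Working on bytes keeps the original line endings (e.g. CRLF) intact.
    path.write_bytes(text.encode("utf-8"))
    return True
```

Because it overwrites the file in place, try it on a copy of the repository first.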
I know it's a bit late, but I was evaluating Kiln and came across this problem. After discussion with the guys at Fogbugz who couldn't give me an answer other than "File/Save As" from SSMS for every *.sql file (very tedious), I decided to have a look at writing a quick script to convert the *.sql files.
Fortunately, you can use one Microsoft technology (PowerShell) to (sort of) overcome an issue with another Microsoft technology (SSMS). In PowerShell, change to the directory that contains your *.sql files, then copy and paste the following into the PowerShell prompt (or save it as a .ps1 script and run it from PowerShell; make sure to run the command "Set-ExecutionPolicy RemoteSigned" before trying to run a .ps1 script):
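function Get-FileEncoding
{
    [CmdletBinding()] Param (
        [Parameter(Mandatory = $True, ValueFromPipelineByPropertyName = $True)] [string]$Path
    )
    # Read the first four bytes and look for a byte order mark (BOM).
    # -Encoding byte is Windows PowerShell syntax; PowerShell 6+ uses -AsByteStream instead.
    [byte[]]$byte = Get-Content -Encoding byte -ReadCount 4 -TotalCount 4 -Path $Path
    if ( $byte[0] -eq 0xef -and $byte[1] -eq 0xbb -and $byte[2] -eq 0xbf )
    { Write-Output 'UTF8' }
    elseif ($byte[0] -eq 0xff -and $byte[1] -eq 0xfe -and $byte[2] -eq 0 -and $byte[3] -eq 0)
    { Write-Output 'UTF32' }    # UTF-32 little endian; checked before UTF-16 LE, which shares FF FE
    elseif ($byte[0] -eq 0xfe -and $byte[1] -eq 0xff)
    { Write-Output 'Unicode' }  # UTF-16 big endian
    elseif ($byte[0] -eq 0xff -and $byte[1] -eq 0xfe)
    { Write-Output 'Unicode' }  # UTF-16 little endian (what SSMS wrote here)
    elseif ($byte[0] -eq 0 -and $byte[1] -eq 0 -and $byte[2] -eq 0xfe -and $byte[3] -eq 0xff)
    { Write-Output 'UTF32' }    # UTF-32 big endian
    elseif ($byte[0] -eq 0x2b -and $byte[1] -eq 0x2f -and $byte[2] -eq 0x76)
    { Write-Output 'UTF7' }
    else
    { Write-Output 'ASCII' }    # no BOM found
}

# Convert every UTF-16 *.sql file in the current directory to UTF-8 (overwrites the original)
$files = Get-ChildItem "*.sql"
foreach ( $file in $files )
{
    $encoding = Get-FileEncoding $file
    if ($encoding -eq 'Unicode')
    {
        # The parentheses make sure the file is read completely before it is rewritten
        (Get-Content "$file" -Encoding Unicode) | Set-Content -Encoding UTF8 "$file"
    }
}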
The function Get-FileEncoding is courtesy of http://poshcode.org/3227, although I had to modify it slightly to cater for UCS-2 little endian files, which is what SSMS seems to have saved these as. I would recommend backing up your files first, as the script overwrites the originals. You could, of course, modify the script so that it saves a UTF-8 copy of each file instead, e.g. change the conversion line inside the loop to say:
(Get-Content "$file" -Encoding Unicode) | Set-Content -Encoding UTF8 "$file.new"
The script should be easy to modify to traverse subdirectories as well.
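If you would rather check a whole tree before converting anything, a small dry run is one option. The sketch below is in Python rather than PowerShell and is only an illustration: the function name find_utf16_files and the default "*.sql" pattern are my own choices, and it assumes the files start with a UTF-16 BOM. It lists the matching files and changes nothing.

```python
import codecs
from pathlib import Path


def find_utf16_files(root: Path, pattern: str = "*.sql") -> list[Path]:
    """Return files under `root` (searched recursively) that start with a UTF-16 BOM."""
    boms = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
    hits = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            head = fh.read(2)  # a UTF-16 BOM is exactly two bytes
        if head in boms:
            hits.append(path)
    return hits


# Example: list candidates under the current directory without modifying them.
# for path in find_utf16_files(Path(".")):
#     print(path)
```

Anything it reports is a file Mercurial would currently treat as binary.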
Now you just need to remember to run this if there are any new *.sql files, before you commit and push your changes. Any files already converted and subsequently opened in SSMS will stay as UTF-8 when saved.