I have a bunch of UTF-8 encoded flat files that need to be imported into a SQL Server 2008 R2 database. Bulk inserts are not able to identify the diameters nor seems to accept UTF-8.
I understand that there is a number of articles on how SQL Server 2008 deals with UTF-8 encoding, but I'm sort of looking for any updated answers as most of those articles are old.
Is there anything I can to do in order to get these flat files into the database either by converting them before an insert or a process to run during the insert?
I want to stay away from manually converting each one. Furthermore, SSIS packages that I've attempted to create can read and separate the data. It just can't move the data it seems. :(
The flat files are generated by Java. Converting the java environment from UTF-8 to any other encoding has been unsuccessful.
NOTE
I have no intention of storing UTF-8 data. My delimiter is coming in funky because it's UTF-8. SQL Server cannot read the characters when separating the columns and rows. That's it.
To access the Import Flat File Wizard, follow these steps: Open SQL Server Management Studio. Connect to an instance of the SQL Server Database Engine or localhost. Expand Databases, right-click a database (test in the example below), point to Tasks, and click Import Flat File above Import Data.
SQL Server 2019 (15. x) introduces an additional option for UTF-8 encoding.
Not true, you simply need to choose code page 65001
Just for reference, if someone google it, and falls here like me.
I've tried the accepted answer a dozen times, with no success. In my case, my data file was a .csv
flat file, which had a lot of accents characters/letters, like ç é ã á
.
I also noted that no matter what encoding I choose, the import was made using the 1251 (ANSI - Latin 1) encoding.
So, the solution was convert before import, my .csv
file from UTF-8 to the very same 1251 (ANSI - Latin 1) encoding. I did the conversion using Notepad++.
After converting it, did the regular import (through SSMS Tasks -> "Import Data" wizard), selecting the 1251 (ANSI - Latin 1) encoding, and everything was imported correctly.
Environment:
SQL Server Web 2016
SQL Server Management Studio v17.9.1
Notepad++ v7.7.1
Also, this answers too the original OP's question:
Is there anything I can to do in order to get these flat files into the database either by converting them before an insert or a process to run during the insert?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With