characters appearing incorrectly even with Unicode source and destination (SSIS)

Question

I am having a codepage unicode/non unicode problem and need expertise to understand it.

In SSIS I am reading data in from a UTF8 encoded text file. The datatypes are all DT_WSTR (unicode string). The destination is NVARCHAR which is also unicode.

Non standard characters such as Ú are not being encoded correctly )appearing as a black box question mark).

If the character appears correctly in the input file, the source is set to DT_WSTR & the destination is nvarchar, why is the character not rendering correctly?

I have tried setting the codepage of the source column to 65001, but in SSIS its only possible to change the codepage on a STR (non unicode) type.

Id appreciate any help in understanding why all unicode fields still cant store a unicode value correctly.

Update from the OP comments

It seems my output is ok if i use Unicode types end to end (input is DT_WSTR, destination column is nvarchar & when extracting again to text, output column is DW_WSTR. The only issue is sql server management studio, which does not seem to be able to render unicode characters correctly in the results of a query, when setting output to grid or text. this is a red herring and the process overall works without issue if this is ignored

Hadi · Accepted Answer

Trying to figure out the issue

There is not problem importing unicode characters from flat files to SQL Server destination, the only thing you have to do is the set the flat file encoding as unicode, and the result columns must be NVARCHAR. Based on your question, it looks like you have met the requirements so i can say that:

Unicode Character are imported successfully to SQL Server, but for some reasons SQL Server Management Studio cannot show unicode characters in a grid Results, to check that data is imported correctly, change change the result view to Result To Text.

GoTo Tools >> Options >> Query Results >> Results To Text

In the second reference link i provided they mentioned that:

If you use SSMS for your queries, change to output type from "Grid" to "Text", because depending on the font the grid can't show unicode.

Or you can try to change the Grid Results font, (on my machine, i use Tahoma font and it shows unicode characters normally)

enter image description here

Experiments

You can perform the following test (taken from the links below)

SET NOCOUNT ON;



CREATE TABLE #test

  (  id int IDENTITY(1, 2) NOT NULL Primary KEY

    ,Uni nvarchar(20) NULL);

INSERT INTO #test (Uni) VALUES (N'DE: äöüßÖÜÄ');

INSERT INTO #test (Uni) VALUES (N'PL: śćźłę');

INSERT INTO #test (Uni) VALUES (N'JAP: 言も言わずに');

INSERT INTO #test (Uni) VALUES (N'CHN: 玉王瓜瓦甘生用田由疋');




SELECT * FROM #test;


GO

DROP TABLE #test;

Try the following query using Result as Grid and Result as Text options.

References

SQL Server 2012 not showing unicode character in results
sql server 2008 not showing and inserting unicode characters!
Import UTF-8 Unicode Special Characters with SQL Server Integration Services
Microsoft SQL Server Management Studio - query result as text

characters appearing incorrectly even with Unicode source and destination (SSIS)

Tags:

sql-server

character-encoding

unicode

etl

ssis

Update from the OP comments

wilson_smyth

1 Answers

Trying to figure out the issue

Experiments

References

Hadi

Recent Activity

Donate For Us

characters appearing incorrectly even with Unicode source and destination (SSIS)

Tags:

sql-server

character-encoding

unicode

etl

ssis

Update from the OP comments

wilson_smyth

1 Answers

Trying to figure out the issue

Experiments

References

Hadi

Related questions

Recent Activity

Donate For Us