
How to convert characterset of column in Oracle

Tags: java, oracle, jdbc

I have a table into which our service provider inserts Unicode data, but my Oracle database character set is WE8ISO8859P1.

To read that data I used the following Oracle function, but it displays ???????

select CONVERT(message,'AL32UTF8','WE8ISO8859P1') from client_campaigns

One more thing: the message column is of CLOB type.

I can't change the character set of my database, first because of possible data loss, and second because it is in production and a character set change may lead to errors.

Please guide me on how I can get this data as Unicode.

Regards, imran

ImranRazaKhan asked Mar 14 '11 08:03



1 Answer

Strings inserted into a character column (VARCHAR2, CHAR or CLOB) are always converted to the database character set. In your case the inserted data is converted to WE8ISO8859P1. Since WE8ISO8859P1 covers only a small part of Unicode, you lose information: characters that are unavailable in your character set are replaced with ? on insert. This is also why CONVERT cannot help here: the ? characters are what is actually stored, so there is nothing left to convert back.
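To see the effect outside the database, here is a minimal Java sketch of the same kind of loss. It is only an analogy: it uses Java's ISO-8859-1 charset rather than Oracle's own conversion routines, and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class LossyCharsetDemo {
    // Encode the text as ISO-8859-1 bytes and decode it again, mimicking
    // a round trip through a single-byte character set.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which is '?' for ISO-8859-1.
    static String roundTripLatin1(String text) {
        byte[] stored = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(stored, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "café": every character exists in ISO-8859-1, so it survives.
        System.out.println(roundTripLatin1("caf\u00e9"));
        // Five Arabic letters: none exist in ISO-8859-1, so each becomes '?'.
        System.out.println(roundTripLatin1("\u0645\u0631\u062d\u0628\u0627"));
    }
}
```

Once the bytes are stored as '?', no later conversion can tell which original character each one stood for.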

What should you do? There are a couple of options for new data:

  1. Change the datatype of the column to NVARCHAR2 instead of VARCHAR2 (or NCLOB instead of CLOB). NVARCHAR2 is specifically designed so you can store Unicode characters without modifying your main database character set (see the related SO question on the differences between VARCHAR2 and NVARCHAR2). Be aware that some applications may not work correctly with NVARCHAR2.
  2. You could change the column to RAW or BLOB and write your string directly as a binary stream. When you read it back it will still be Unicode data. However, it will be difficult for the database to do anything with this column: sorting will be binary and searching will be problematic, since you won't be able to use the LIKE operator properly.
  3. If you have lots of Unicode input, you could consider changing your database character set. This is the most costly option (you will probably need to export/reinstall/import), but afterwards all your character columns will be able to hold Unicode data.
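From the Java/JDBC side, options (1) and (2) could look roughly like the sketch below. Treat it as an illustration under assumptions, not a tested Oracle recipe: the table names client_campaigns_n and client_campaigns_raw and the id column are hypothetical, and whether setNString and setBytes behave as expected for NCLOB and BLOB columns depends on your driver version.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class UnicodeColumnSketch {
    // Option 1: bind the value as a national-character string so the driver
    // sends it in the national character set (NVARCHAR2 / NCLOB column).
    // Table and column names are hypothetical.
    static void insertAsNString(Connection conn, long id, String message) throws SQLException {
        String sql = "INSERT INTO client_campaigns_n (id, message) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id);
            ps.setNString(2, message);
            ps.executeUpdate();
        }
    }

    // Option 2 helpers: the application, not the database, owns the encoding.
    static byte[] toUtf8(String message) {
        return message.getBytes(StandardCharsets.UTF_8);
    }

    static String fromUtf8(byte[] stored) {
        return new String(stored, StandardCharsets.UTF_8);
    }

    // Option 2: store the UTF-8 bytes in a RAW / BLOB column (hypothetical table).
    static void insertAsBytes(Connection conn, long id, String message) throws SQLException {
        String sql = "INSERT INTO client_campaigns_raw (id, message) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id);
            ps.setBytes(2, toUtf8(message));
            ps.executeUpdate();
        }
    }

    // Option 2: read the bytes back and decode them in the application.
    // Returns null when no row matches the id.
    static String readFromBytes(Connection conn, long id) throws SQLException {
        String sql = "SELECT message FROM client_campaigns_raw WHERE id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? fromUtf8(rs.getBytes(1)) : null;
            }
        }
    }
}
```

With option (2) the database only sees opaque bytes, which is why sorting and LIKE searches stop being useful; with option (1) the column remains a character column that SQL can work with.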

I would go with option (1) or (3) if given the choice. Working with RAW disables a lot of features and adds complexity.

Obviously, the existing data cannot be restored from what is stored in the database alone: you will have to reimport the old data from its original source into the new structure.

Vincent Malgrat answered Sep 22 '22 22:09
