Displaying the hex value of a string from a oracle varchar2?

Tags:

We are having problems with text that is encoded in some different ways but kept in a single column in a table. Long story. On MySQL, I can do "select hex(str) from table where" and I see the bytes of the string exactly as I set them.

On Oracle, I have a string which starts with the Turkish character İ, which is the Unicode character 0x0130 "LATIN CAPITAL LETTER WITH DOT ABOVE". This is in my printed copy of the Unicode Version 2.0 book. In UTF-8, this character is 0xc4b0.

We have very old client apps we need to support. They would send us this text in "windows-1254". We used to just close our eyes, store it, and hand it back later. Now we need the Unicode, or are being given the Unicode.

So I have:

SQL> select id, name from table where that thing;

ID     NAME
------ ------------------------
746    Ý

This makes sense because the "İ" is 0xdd in windows-1254 and 0xdd in wondows-1252 is "Ý". My terminal is presumably set to the usual windows-1252.

But:

SQL> select id, rawtohex(name) from table where that thing;

ID     RAWTOHEX(NAME)
------ ------------------------
746    C39D

There seems to be no equivalent to the hex(name) function in MySQL. But I must be missing something. What am I missing here?

My java code has to take the utf8 that I am supplied and save a utf8 copy and a windows-1252 copy. The java code gives me:

bytes (utf8):  c4 b0
bytes (1254):  dd

Yet, when I save it, the client does not get the correct character. And when I try to see what Oracle has actually stored, i get the garbage seen above. I have no idea where the C39D is coming from. Any suggestions?

We have ojdbc14.jar built into all of our applications and we are connecting to a database that says it is "Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production".

379

asked Sep 09 '13 15:09

Ray Kiddy

2 Answers

I have bytes in UTF-8 to begin.

String strFromUTF8 = new String(bytes, "UTF8");
byte[] strInOldStyle = strFromUTF8.getBytes("Cp1254");

With MySQL, I am done. I takes these bytes, turn them into a hex string and do an update with unhex(hexStr). This allows me to put the legacy bytes into a varchar column.

With Oracle, I must do:

String again = new String(strInOldStyle, "Cp1254");
byte[] nextOldBytes = again.getBytes("UTF8");

Now, I can do an update and get the bytes into a varchar2 column with:

update table set colName = UTL_RAW.CAST_TO_VARCHAR2(HEXTORAW('hexStr')) where ...

Strange, no? I am sure I have made this more complex than it needed to be.

What we see is this, though,

"İ" in UTF-8 == 0xc4d0
"İ" in Cp1254 == 0xdd == "Ý" in Cp1252
"Ý" in UTF-8 == 0xc3d9

So, if I get the string "İ" and do:

update table set name = UTL_RAW.CAST_TO_VARCHAR2(HEXTORAW('C3D9')) where ...

Then our legacy client gives us a "İ". Yep. It works.

answered Oct 22 '22 13:10

Ray Kiddy

Use the dump function to see how Oracle stores data internally.

You seem to have a misunderstanding on how Oracle treats VARCHAR2 characters set conversions: you can't influence how Oracle stores its data physically. (Also if you haven't already, it's helpful to read: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets).

Your client speaks to Oracle only in binary. In fact all systems exchange information in binary only. To understand each others, it is necessary that both systems know what language (character set) is being used.

In your case we can reconstruct what happens:

Your client sends the byte dd to Oracle and says it is windows-1252 (instead of 1254).
Oracle looks up its character set table and sees that this data is translated to the symbol Ý in this character set.
Oracle logically stores this information in its table.
Since Oracle is setup in UTF-8, it converts this data to the UTF-8 binary reprensentation of Ý:
```
SQL> SELECT rawtohex('Ý') FROM dual;

RAWTOHEX('Ý')
--------------
C39D
```
Oracle stores C39D internally.

As you can see, the problem comes from the first step: there is a problem of setup. As long as you don't fix this, the systems won't be able to successfully dialogue.

The conversion is automatic when you use VARCHAR2 because this datatype is a logical text symbol interface (you have next to no control over forcing the actual binary data being stored).

184

answered Oct 22 '22 14:10

Vincent Malgrat

Related questions
                            
                                How do I generate random sample data in my Oracle database?
                            
                                java.sql.Date to joda time conversion
                            
                                Create a trigger that updates a column on one table when a column in another table is updated
                            
                                Oracle.ManagedDataAccess OracleInternal.NotificationServices.ONSException
                            
                                What is the equivalent of SQL Server APPLY in Oracle?
                            
                                How to convert CLOB to VARCHAR2 inside oracle pl/sql
                            
                                Maximum Connection Pool Size
                            
                                How can I pass a parameter to a t-sql script?
                            
                                Why doesn't Oracle raise "ORA-00918: column ambiguously defined" for this query?
                            
                                How to insert/update larger size of data in the Oracle tables?
                            
                                Connecting to Oracle Database using Sql Server Integration Services
                            
                                One SQL query to access multiple data sources in Java (from oracle, excel, sql server)
                            
                                Why is truncate a DDL statement?
                            
                                increment function in plsql
                            
                                Column to be modified is not an identity column
                            
                                Hibernate could not fetch the SequenceInformation from the database
                            
                                what does the "2"mean in SQL*Plus?
                            
                                Is there a way to get the row number in Mysql like the rownum in oracle [duplicate]
                            
                                How do I know the id before saving an object in jpa
                            
                                Get column value length, not column max length of value

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Displaying the hex value of a string from a oracle varchar2?

Tags:

unicode

utf-8

oracle

ojdbc

turkish

Ray Kiddy

People also ask

2 Answers

Ray Kiddy

Vincent Malgrat

Recent Activity

Donate For Us