Wrong String encoding using JDBC Oracle Thin driver

I am using an Oracle database with ISO-8859-1 data. When I read a String from this database through a ResultSet and print it to the console, the output has the wrong encoding.

Locale.getDefault(); // -> fr_FR
Charset.defaultCharset(); // -> UTF-8

I tried printing this data from my ResultSet in two ways:

rs.getString("MY_COL"); // direct from ResultSet
new String(rs.getString("MY_COL").getBytes(Charset.forName("ISO-8859-15")), Charset.forName("UTF-8")); // convert ISO bytes to UTF-8 bytes

This outputs:

générale
générale

So why does the Oracle JDBC driver create Strings with ISO-8859-1 byte encoding? How can I get Strings with UTF-8 byte encoding without altering the database (or converting the String)? Can I change this through the driver configuration or JVM args?

Aure77 asked Apr 25 '26 04:04



1 Answer

My guess is that your database character set is not actually ISO 8859-1 (NLS_CHARACTERSET = WE8ISO8859P1).

To check, run the following on the database:

create table foo (col1 varchar2(40));
insert into foo values('é');
insert into foo values(chr(233));
select dump(col1) from foo;

On an ISO 8859-1 database this should return:

Typ=1 Len=1: 233 
Typ=1 Len=1: 233 

If you instead get, for example:

Typ=1 Len=2: 195,169
Typ=1 Len=1: 233

then your database is set up for UTF8 (NLS_CHARACTERSET = AL32UTF8).
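
To see where the numbers in the DUMP output come from, here is a small Java sketch that needs no database. It prints the byte values of 'é' in ISO-8859-1 and in UTF-8, and shows what happens when UTF-8 bytes are decoded as ISO-8859-1. The class name EncodingDemo and the helpers dumpBytes and reinterpret are made up for this illustration; they are not part of the Oracle driver, and the "Len=..." format only imitates the tail of Oracle's DUMP output.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {

    // Hypothetical helper: lists the unsigned byte values of s in the given
    // charset, formatted like the tail of Oracle's DUMP output.
    static String dumpBytes(String s, Charset cs) {
        byte[] bytes = s.getBytes(cs);
        StringBuilder sb = new StringBuilder("Len=" + bytes.length + ": ");
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(bytes[i] & 0xFF); // mask to get 0..255 instead of a signed byte
        }
        return sb.toString();
    }

    // Hypothetical helper: encodes s with one charset and decodes the bytes
    // with another, which is how mis-decoded text (mojibake) is produced.
    static String reinterpret(String s, Charset encodeAs, Charset decodeAs) {
        return new String(s.getBytes(encodeAs), decodeAs);
    }

    public static void main(String[] args) {
        String e = "\u00e9"; // 'é', written as an escape so the source file encoding does not matter

        System.out.println(dumpBytes(e, StandardCharsets.ISO_8859_1)); // Len=1: 233
        System.out.println(dumpBytes(e, StandardCharsets.UTF_8));      // Len=2: 195,169

        // UTF-8 bytes read as ISO-8859-1 give two characters, U+00C3 and U+00A9
        String garbled = reinterpret(e, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // 2
    }
}
```

A Java String has no byte encoding of its own; bytes and a charset only come into play when the String is encoded or decoded, so a result of 195,169 in the database points to UTF-8 storage rather than ISO-8859-1.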

SubOptimal answered Apr 27 '26 18:04
