Loading Unicode Characters with Oracle SQL Loader (sqlldr) results in question marks

I'm trying to load localized strings from a Unicode (UTF-8-encoded) CSV into an Oracle database using SQL*Loader. I've tried all sorts of combinations, but nothing gives me the result I'm looking for, which is for special Greek characters like Δ to survive the load instead of being mangled into mojibake or inverted question marks (¿).

My table definition looks like this:

CREATE TABLE "GLOBALIZATIONRESOURCE"
(
    "RESOURCETYPE" VARCHAR2(255 CHAR) NOT NULL ENABLE,
    "CULTURE"      VARCHAR2(20 CHAR) NOT NULL ENABLE,
    "KEY"          VARCHAR2(128 CHAR) NOT NULL ENABLE,
    "VALUE"        VARCHAR2(2048 CHAR),
    "DESCRIPTION"  VARCHAR2(512 CHAR),
    CONSTRAINT "PK_GLOBALIZATIONRESOURCE" PRIMARY KEY ("RESOURCETYPE","CULTURE","KEY") USING INDEX TABLESPACE REPSPACE_IX ENABLE
)
TABLESPACE REPSPACE; 

I have tried the following configurations in my control file (and, in fact, every permutation I could think of):

load data
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(   
    "RESOURCETYPE" CHAR(255), 
    "CULTURE" CHAR(20), 
    "KEY" CHAR(128), 
    "VALUE" CHAR(2048), 
    "DESCRIPTION" CHAR(512)
)

load data
CHARACTERSET UTF8
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(   
    "RESOURCETYPE" CHAR(255), 
    "CULTURE" CHAR(20), 
    "KEY" CHAR(128), 
    "VALUE" CHAR(2048), 
    "DESCRIPTION" CHAR(512)
)

load data
CHARACTERSET UTF16
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY X'002c' OPTIONALLY ENCLOSED BY X'0022'
TRAILING NULLCOLS
(   
    "RESOURCETYPE" CHAR(255), 
    "CULTURE" CHAR(20), 
    "KEY" CHAR(128), 
    "VALUE" CHAR(2048), 
    "DESCRIPTION" CHAR(512)
)

With the first two options, the Unicode characters don't survive the load and just show up as inverted question marks.
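Inverted question marks are usually Oracle's replacement character: when the target character set cannot represent an incoming code point, the conversion layer substitutes ¿. So before tuning the control file any further, it's worth confirming that the database character set can store Greek at all. A quick sanity check against the standard dictionary view (assuming you have privileges to query it):

-- The database character set must be a Unicode set (AL32UTF8 or UTF8)
-- to store Greek in VARCHAR2 columns; a single-byte set such as
-- WE8ISO8859P1 will replace Δ with ¿ on insert.
SELECT parameter, value
FROM   nls_database_parameters
WHERE  parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');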

If I choose the last option, UTF16, then I get the following error, even though all the data in my fields is much shorter than the specified lengths:

Field in data file exceeds maximum length
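One plausible cause (an assumption about the failure, not something the log confirms): SQL*Loader treats the CHAR(n) lengths in the control file as byte lengths, and UTF-16 encodes every character in at least two bytes, so a field holding up to 2048 characters can exceed a 2048-byte limit. A sketch of the workaround, with the byte budgets doubled:

load data
CHARACTERSET UTF16
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY X'002c' OPTIONALLY ENCLOSED BY X'0022'
TRAILING NULLCOLS
(
    -- control-file field lengths are bytes: allow 2 bytes per character
    "RESOURCETYPE" CHAR(510),
    "CULTURE" CHAR(40),
    "KEY" CHAR(256),
    "VALUE" CHAR(4096),
    "DESCRIPTION" CHAR(1024)
)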

It seems as though no combination of CTL file settings (even setting the byte order to little or big endian) works correctly. Can someone please give an example of a configuration (table structure and CTL file) that correctly loads Unicode data from a CSV? Any help would be greatly appreciated.
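For completeness: when the control file has no CHARACTERSET clause (as in the first attempt above), SQL*Loader assumes the data file is in the client character set taken from NLS_LANG, which is one common way Greek text gets mangled on the way in. Setting it explicitly before invoking sqlldr is an alternative to the CHARACTERSET clause; a minimal sketch with placeholder credentials and file names:

# tell the Oracle client that the data is UTF-8
export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
sqlldr userid=scott/tiger control=globalization.ctl data=resources.csv log=load.log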

Note: I've already been through http://docs.oracle.com/cd/B19306_01/server.102/b14215/ldr_concepts.htm and http://docs.oracle.com/cd/B10501_01/server.920/a96652/ch10.htm.

asked Dec 06 '11 by philrabin



1 Answer

I had the same issue and resolved it with the steps below (a worked sketch follows the list):

  1. Open the data file in Notepad++, go to the "Encoding" menu, select UTF-8, and save the file.
  2. Use CHARACTERSET UTF8 in the CTL file and then load the data.
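Putting the two steps together, a minimal sketch of the control file and invocation (column layout taken from the question; credentials and file names are placeholders):

load data
CHARACTERSET UTF8
TRUNCATE
INTO TABLE "GLOBALIZATIONRESOURCE"
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
    "RESOURCETYPE" CHAR(255),
    "CULTURE" CHAR(20),
    "KEY" CHAR(128),
    "VALUE" CHAR(2048),
    "DESCRIPTION" CHAR(512)
)

sqlldr userid=scott/tiger control=globalization.ctl data=resources.csv log=load.log

Note that this control file matches the questioner's second attempt, which suggests the real fix was step 1: the CSV was not actually UTF-8 on disk until it was re-saved. Also, Notepad++ distinguishes "UTF-8" from "UTF-8-BOM"; if the first field of the first record picks up stray bytes after loading, re-save the file without the BOM.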
answered Sep 21 '22 by Vikas Katiyar