
Handling UTF-8 characters in Oracle external tables

I have an external table that reads from a fixed-length file. The file is expected to contain special characters; in my case the word containing one is "Göteborg". Because "ö" is a special character, Oracle appears to treat it as 2 bytes, and that causes the trouble: the subsequent fields in the file get shifted by 1 byte, which messes up the data. Has anyone faced this issue before? So far we have tried the following:

  • Changed the value of NLS_LANG to AMERICAN_AMERICA.WE8ISO8859P1
  • Tried setting the database character set to UTF-8
  • Tried changing NLS_LENGTH_SEMANTICS to CHAR instead of BYTE using ALTER SYSTEM
  • Tried changing the external table character set to AL32UTF8
  • Tried changing the external table character set to UTF-8

Nothing works. Other details include:

  • File is UTF-8 encoded
  • Operating System : RHEL
  • Database: Oracle 11g
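
For reference, the one-byte shift can be reproduced outside the database. The short Python sketch below is an illustration only, not anything Oracle runs. It counts the sample word from above in characters and in bytes under two encodings:

```python
# Illustration only: plain Python, not Oracle.
# The sample word is the one from the question; \u00f6 is "ö".
word = "G\u00f6teborg"

chars = len(word)                           # number of characters
utf8_bytes = len(word.encode("utf-8"))      # bytes when the file is UTF-8
latin1_bytes = len(word.encode("latin-1"))  # bytes in a single-byte charset (ISO 8859-1)

print(chars, utf8_bytes, latin1_bytes)      # prints: 8 9 8
```

The word is 8 characters but 9 bytes in UTF-8, so anything that counts fixed field widths in bytes comes up one byte short for each such character.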

Anything else that I might be missing? Any help will be appreciated. Thanks!

SJoe asked Feb 09 '11 12:02



1 Answer

The nls_length_semantics parameter only pertains to tables created after it is set; changing it does not alter existing columns.

Below is what I did to fix this very problem. The relevant access parameters are:

  records delimited by newline
  CHARACTERSET AL32UTF8
  STRING SIZES ARE IN CHARACTERS

i.e. the full definition:

ALTER SESSION SET nls_length_semantics = CHAR
/
CREATE TABLE TDW_OWNER.SDP_TST_EXT
(
    COST_CENTER_CODE VARCHAR2(10)     NULL,
    COST_CENTER_DESC VARCHAR2(40)     NULL,
    SOURCE_CLIENT    VARCHAR2(3)      NULL,
    NAME1            VARCHAR2(35)     NULL
)
ORGANIZATION EXTERNAL
 ( TYPE ORACLE_LOADER
   DEFAULT DIRECTORY DBA_DATA_DIR
   ACCESS PARAMETERS
    ( records delimited by newline
      CHARACTERSET AL32UTF8
      STRING SIZES ARE IN CHARACTERS
      logfile DBA_DATA_DIR:'sdp_tst_ext_%p.log'
      badfile DBA_DATA_DIR:'sdp_tst_ext_%p.bad'
      discardfile DBA_DATA_DIR:'sdp_tst_ext_%p.dsc'
      fields
      notrim
      (
          COST_CENTER_CODE CHAR(10)
         ,COST_CENTER_DESC CHAR(40)
         ,SOURCE_CLIENT    CHAR(3)
         ,NAME1            CHAR(35)
      )
    )
   LOCATION (DBA_DATA_DIR:'sdp_tst.dat')
 )
REJECT LIMIT UNLIMITED
NOPARALLEL
NOROWDEPENDENCIES
/
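
To see why the sizes need to be counted in characters for this layout, here is a rough Python sketch. It is not how the ORACLE_LOADER access driver is implemented; it just cuts a record into the same widths as the CHAR(n) fields above, once by characters and once by bytes. The helper `split_fixed` and the sample values are made up for illustration.

```python
# Illustration only: plain Python, not Oracle's access driver.
# WIDTHS mirrors the CHAR(10), CHAR(40), CHAR(3), CHAR(35) field sizes in the DDL.
WIDTHS = [10, 40, 3, 35]


def split_fixed(record, widths):
    """Cut a str or bytes record into consecutive fields of the given widths."""
    fields, pos = [], 0
    for width in widths:
        fields.append(record[pos:pos + width])
        pos += width
    return fields


# Made-up record, padded so that each field has the right number of CHARACTERS.
values = ["CC001", "G\u00f6teborg", "100", "Scott"]
line = "".join(v.ljust(w) for v, w in zip(values, WIDTHS))

# Widths counted in characters: every field lines up.
by_chars = [f.strip() for f in split_fixed(line, WIDTHS)]

# Widths counted in bytes on the UTF-8 encoded record: fields after "ö" shift by one.
raw = line.encode("utf-8")
by_bytes = [f.decode("utf-8", errors="replace").strip()
            for f in split_fixed(raw, WIDTHS)]

print(by_chars)  # ['CC001', 'Göteborg', '100', 'Scott']
print(by_bytes)  # ['CC001', 'Göteborg', '10', '0Scott']
```

Counting in bytes, the two-byte "ö" uses up one extra byte of the 40-byte field, so the last byte of its padding spills into the next field and everything after it is off by one. That is the shift described in the question.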
Scott answered Nov 20 '22 17:11
