Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle Unicode problem when using NLS_CHARACTERSET is WE8ISO8859P1 and NLS_NCHAR_CHARACTERSET is AL16UTF16, and ColdFusion as programming language

I have 2 Oracle 10g database, XE and Enterprise

XE

enter image description here

enter image description here

Enterprise

enter image description here

enter image description here

and this are the data type I've use in the test table

enter image description here

enter image description here

and then I tried to test to insert some Unicode char from http://www.sustainablegis.com/unicode/

and the results are

XE

enter image description here

Enterprise

enter image description here

for this test, I use ColdFusion 9 developer edition

<cfprocessingDirective pageencoding="utf-8"> 
<cfset setEncoding("form","utf-8")>

<form action="" method="post">
Unicode : <br>
<textarea name="txaUnicode" id="txaUnicode" cols="50" rows="10"></textarea>
<br><br>
Language : <br>
<input type="Text" name="txtLanguage" id="txtLanguage">
<br><br>
<input type="Submit">
</form>

<cfset dsn = "theDSN">

<cfif StructKeyExists(FORM, "FIELDNAMES")>
    <cfquery name="qryInsert" datasource="#dsn#">
        INSERT INTO UNICODE
        (
            C_VARCHAR2,
            C_CHAR,
            C_CLOB,
            C_NVARCHAR2,
            LANGUAGE
        )
        VALUES
        (
            <cfqueryparam cfsqltype="CF_SQL_VARCHAR" value="#FORM.TXAUNICODE#">,
            <cfqueryparam cfsqltype="CF_SQL_CHAR" value="#FORM.TXAUNICODE#">,
            <cfqueryparam cfsqltype="CF_SQL_LONGVARCHAR" value="#FORM.TXAUNICODE#">,
            <cfqueryparam cfsqltype="CF_SQL_VARCHAR" value="#FORM.TXAUNICODE#">,
            <cfqueryparam cfsqltype="CF_SQL_VARCHAR" value="#FORM.TXTLANGUAGE#">
        )
    </cfquery>
</cfif>

<cfquery name="qryUnicode" datasource="#dsn#">
    SELECT  *
    FROM    UNICODE
    ORDER BY    LANGUAGE
</cfquery>

<table border="1">
    <thead>
        <tr>
            <th>LANGUAGE</th>
            <th>C_VARCHAR2</th>
            <th>C_CHAR</th>
            <th>C_CLOB</th>
            <th>C_NVARCHAR2</th>
        </tr>
     </thead>
     <tbody>
        <cfoutput query="qryUnicode">
            <tr>
                <td>#qryUnicode.LANGUAGE#</td>
                <td>#qryUnicode.C_VARCHAR2#</td>
                <td>#qryUnicode.C_CHAR#</td>
                <td>#qryUnicode.C_CLOB#</td>
                <td>#qryUnicode.C_NVARCHAR2#</td>
            </tr>
        </cfoutput>
    </tbody>
</table>

from this guide http://www.stanford.edu/dept/itss/docs/oracle/10g/server.101/b10749/ch6unicode.htm#i1007297 I think for my Enterprise database it should produce same thing as XE (at least for NVARCHAR2 column) since the typical solution from that guide said:

  • Use NCHAR and NVARCHAR2 datatypes to store Unicode characters
  • Keep WE8ISO8859P1 as the database character set
  • Use AL16UTF16 as the national character set

So, how to make it works too in my Enterprise database?

Thank you :)

like image 860
tsurahman Avatar asked Nov 06 '22 02:11

tsurahman


2 Answers

First database stores values in utf-8 encoding, second in iso-8859-1(besides N-datatypes), however you are writing out both values in utf-8 so first one is alright, but second one is wrong decoded.

like image 158
el vis Avatar answered Nov 09 '22 14:11

el vis


WE8ISO88591 encoding has a limited character set and can not store all unicode characters. Please refer to http://en.wikipedia.org/wiki/ISO/IEC_8859-1 for list of supported characters.

like image 44
Salman Hameed Avatar answered Nov 09 '22 14:11

Salman Hameed