We are migrating some data from SQL Server to Oracle. For columns defined as NVARCHAR in SQL Server, we started creating NVARCHAR columns in Oracle, thinking they were similar, but it looks like they are not.
I have read a couple of posts on Stack Overflow and want to confirm my findings.
Oracle VARCHAR2 already supports Unicode if the database character set is, say, AL32UTF8 (which is true in our case).
SQL Server VARCHAR does not support Unicode. SQL Server explicitly requires columns to be of type NCHAR/NVARCHAR to store Unicode data (specifically in the 2-byte UCS-2 format).
Hence, would it be correct to say that SQL Server NVARCHAR columns can/should be migrated as Oracle VARCHAR2 columns?
In our Oracle database server, the NVARCHAR2 data type uses the AL16UTF16 character set, which encodes Unicode data as UTF-16. AL16UTF16 uses 2 bytes per character for characters in the Basic Multilingual Plane and 4 bytes for supplementary characters. NVARCHAR2 stores variable-length character data.
Varchar stores non-Unicode data, while Nvarchar stores Unicode data. The maximum length also differs: varchar(n) is limited to 8000 bytes, while nvarchar(n) is limited to 4000 characters (also 8000 bytes).
The NVARCHAR data type stores character data in a variable-length field. Data can be a string of single-byte or multibyte letters, digits, and other characters that are supported by the code set of your database locale.
The key difference between varchar and nvarchar is how they are stored: varchar is stored as regular 8-bit data (1 byte per character), while nvarchar stores data at 2 bytes per character. For this reason, nvarchar can hold up to 4000 characters and takes twice the space of varchar.
An nvarchar column can store any Unicode data. A varchar column is restricted to an 8-bit codepage. Some people think that varchar should be used because it takes up less space. I believe this is not the correct answer.
About the NVARCHAR data type: DATALENGTH returns 10 for a 5-character value, because each character occupies 2 bytes, so the value takes 10 bytes of space in the database.
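The 5-characters-to-10-bytes arithmetic can be checked outside the database. The following is a minimal sketch, assuming NVARCHAR data is stored as UTF-16 little-endian with no byte-order mark; Python's `utf-16-le` codec stands in for that storage format, `latin-1` stands in for a single-byte VARCHAR code page, and the function name `utf16_byte_length` is made up for this example.

```python
def utf16_byte_length(text: str) -> int:
    """Bytes needed to hold `text` as UTF-16 (little-endian, no BOM)."""
    return len(text.encode("utf-16-le"))


word = "Hello"

# Five characters at 2 bytes each, comparable to what DATALENGTH reports
# for an NVARCHAR value.
print(len(word), utf16_byte_length(word))

# The same word in a single-byte code page takes 1 byte per character.
print(len(word.encode("latin-1")))

# A supplementary character needs a surrogate pair, so it takes 4 bytes.
print(utf16_byte_length("\U0001F600"))
```

So "2 bytes per character" holds for the Basic Multilingual Plane but not for every character: the last line shows a single character taking 4 bytes.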
An Oracle table can define multiple VARCHAR2 columns. This size of TEXT data is suitable for most applications. User-defined data types, a T-SQL-specific enhancement in Microsoft SQL Server, allow users to define and name their own data types to supplement the system data types. A user-defined data type can be used as the data type for any column in the database.
Yes, if your Oracle database is created using a Unicode character set, an NVARCHAR in SQL Server should be migrated to a VARCHAR2 in Oracle. In Oracle, the NVARCHAR2 data type exists to allow applications to store data using a Unicode character set when the database character set does not support Unicode.
One thing to be aware of in migrating, however, is character length semantics. In SQL Server, an NVARCHAR(20) allocates space for 20 characters, which requires up to 40 bytes in UCS-2. In Oracle, by default, a VARCHAR2(20) allocates 20 bytes of storage. In the AL32UTF8 character set, that is potentially only enough space for 5 characters, though most likely it will handle many more (a single character in AL32UTF8 requires between 1 and 4 bytes). You probably want to declare your Oracle types as VARCHAR2(20 CHAR), which indicates that you want to allocate space for 20 characters regardless of how many bytes that requires. That tends to be much easier to communicate than trying to explain why some 20-character strings are allowed while other 10-character strings are rejected.
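To make the byte-versus-character distinction concrete, here is a small sketch that counts bytes for a few sample strings. It assumes Python's `utf-8` codec is a fair stand-in for AL32UTF8 and does not touch a database; the helper names `utf8_length` and `fits_byte_limit` are made up for this example.

```python
# Python's "utf-8" codec stands in for Oracle's AL32UTF8 here (an assumption
# made for illustration; no database is involved).

def utf8_length(text: str) -> int:
    """Number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_byte_limit(text: str, limit_bytes: int) -> bool:
    """True if `text` would fit a column declared with byte semantics."""
    return utf8_length(text) <= limit_bytes


samples = {
    "20 ASCII letters": "a" * 20,           # 1 byte each
    "10 accented letters": "\u00e9" * 10,   # 2 bytes each
    "10 CJK characters": "\u65e5" * 10,     # 3 bytes each
    "5 emoji": "\U0001F600" * 5,            # 4 bytes each
}

for label, text in samples.items():
    print(f"{label}: {len(text)} chars, {utf8_length(text)} bytes, "
          f"fits in 20 bytes: {fits_byte_limit(text, 20)}")
```

Under a 20-byte limit, the 20 ASCII letters fit while the 10 CJK characters (30 bytes) do not, which is exactly the kind of surprise that declaring the column as VARCHAR2(20 CHAR) avoids: all four samples are at most 20 characters long.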
You can change the default length semantics at the session level so that any tables you create without specifying length semantics will use character rather than byte semantics:
ALTER SESSION SET nls_length_semantics=CHAR;
That lets you avoid typing CHAR every time you define a new column. It is also possible to set this at the system level, but doing so is discouraged by the NLS team; apparently, not all the scripts Oracle provides have been thoroughly tested against databases where NLS_LENGTH_SEMANTICS has been changed, and probably very few third-party scripts have been.
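If you are generating the Oracle DDL with a script, the type mapping described above could be sketched roughly as follows. This is a hypothetical helper, not part of any migration tool: the function name `to_oracle_type` is invented, it only handles the plain `NVARCHAR(n)` form, and it deliberately rejects anything else (such as `NVARCHAR(MAX)`) rather than guessing.

```python
import re

# Matches declarations such as "NVARCHAR(20)" or " nvarchar( 50 ) ".
_NVARCHAR = re.compile(r"^\s*NVARCHAR\s*\(\s*(\d+)\s*\)\s*$", re.IGNORECASE)


def to_oracle_type(sqlserver_type: str) -> str:
    """Map a SQL Server NVARCHAR(n) declaration to VARCHAR2(n CHAR).

    Hypothetical helper for illustration; assumes the Oracle database
    character set is Unicode (for example AL32UTF8).
    """
    match = _NVARCHAR.match(sqlserver_type)
    if match is None:
        raise ValueError(f"unsupported type declaration: {sqlserver_type!r}")
    length = int(match.group(1))
    # Spell out CHAR so the column does not depend on the session's
    # NLS_LENGTH_SEMANTICS setting.
    return f"VARCHAR2({length} CHAR)"


print(to_oracle_type("NVARCHAR(20)"))
```

Writing CHAR explicitly in the generated DDL keeps the result the same whether or not the session-level setting has been changed.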