
Clob size in bytes

Tags:

oracle

I have a database with the below NLS settings

NLS_NCHAR_CHARACTERSET  - AL16UTF16

NLS_CHARACTERSET - AL32UTF8

There's a table with a CLOB column storing base64-encoded data. Since the characters are mostly English letters and digits, I would assume each character takes up only 1 byte, as a CLOB uses the NLS_CHARACTERSET for encoding.

With inline storage enabled, a CLOB is stored inline unless it grows beyond 4096 bytes. However, when I stored a value of 2048 characters, I found that it was not stored inline (by checking DBA_TABLES). So does this mean each character is not using only 1 byte? Can anyone elaborate on this?
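A minimal sketch of such a test (table and column names here are hypothetical, not from the original post; `ENABLE STORAGE IN ROW` is the default for CLOBs but is spelled out for clarity):

```sql
-- Hypothetical test table with an inline-enabled CLOB column.
CREATE TABLE clob_test (
  id   NUMBER PRIMARY KEY,
  data CLOB
) LOB (data) STORE AS (ENABLE STORAGE IN ROW);

-- Insert 2048 ASCII characters, as in the question.
INSERT INTO clob_test VALUES (1, RPAD('A', 2048, 'A'));
COMMIT;

-- If the value were stored inline, the LOB segment would hold no row data;
-- growth of the LOB segment indicates out-of-line storage.
SELECT segment_name, bytes
FROM   user_segments
WHERE  segment_name IN (SELECT segment_name
                        FROM   user_lobs
                        WHERE  table_name = 'CLOB_TEST');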

Another test added: create a table with a CLOB column using an 8 KB chunk size, so that the initial segment size is 65,536 bytes. After inserting a row with 32,768 characters in the CLOB column, the creation of a second extent can be seen by querying DBA_SEGMENTS.
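The second test could be sketched like this (again with hypothetical names; the point is that 32,768 characters already fill the 64 KB initial segment, which only makes sense at 2 bytes per character):

```sql
-- Hypothetical table with an 8 KB LOB chunk size.
CREATE TABLE clob_chunk_test (
  data CLOB
) LOB (data) STORE AS (CHUNK 8192);

-- 32,768 ASCII characters, as in the question.
INSERT INTO clob_chunk_test VALUES (RPAD('A', 32768, 'A'));
COMMIT;

-- 32,768 chars * 2 bytes/char = 65,536 bytes exhausts the initial
-- segment, so a second extent should appear here.
SELECT segment_name, extents, bytes
FROM   user_segments
WHERE  segment_name IN (SELECT segment_name
                        FROM   user_lobs
                        WHERE  table_name = 'CLOB_CHUNK_TEST');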

Quincy asked Sep 25 '14


1 Answer

http://docs.oracle.com/cd/E11882_01/server.112/e10729/ch6unicode.htm#r2c1-t12

It says:

Data in CLOB columns is stored in a format that is compatible with UCS-2 when the database character set is multibyte, such as UTF8 or AL32UTF8. This means that the storage space required for an English document doubles when the data is converted.

So it looks like a CLOB internally stores everything in a UCS-2-compatible format, i.e. a fixed 2 bytes per character, regardless of the fact that base64 characters would need only 1 byte each in AL32UTF8. Consequently, at most 4096 / 2 = 2048 characters can fit inline.
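The arithmetic can be illustrated with a small PL/SQL sketch. Note that `DBMS_LOB.GETLENGTH` reports the length of a CLOB in characters, not bytes, so the byte estimate below is just the 2-bytes-per-character rule applied by hand (the exact inline limit is slightly below 4096 bytes because of locator overhead):

```sql
SET SERVEROUTPUT ON

DECLARE
  v CLOB := RPAD('A', 2048, 'A');  -- 2048 ASCII characters
BEGIN
  -- Character length as Oracle reports it:
  DBMS_OUTPUT.PUT_LINE('chars: ' || DBMS_LOB.GETLENGTH(v));
  -- Approximate internal storage at 2 bytes per character:
  DBMS_OUTPUT.PUT_LINE('approx bytes: ' || DBMS_LOB.GETLENGTH(v) * 2);
END;
/
```

At 2048 characters the value already needs about 4096 bytes of storage, which is why it no longer fits inline even though the raw base64 text is only 2048 bytes in AL32UTF8.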

Rusty answered Oct 17 '22