Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle CLOB vs BLOB

Tags:

oracle

blob

I want to know what does Oracle's CLOB has to offer over BLOB data type. Both have data storage limits of (4 GB - 1) * DB_BLOCK_SIZE.

A text string which is longer than 4000 bytes can not fit in VARCHAR2 column. Now, I can use CLOB and BLOB as well to store this string.

Everyone says, CLOB is good and meant for character data and BLOB is for binary data such as images, unstructured documents.

But I see I can store character data inside a BLOB as well.

What I want to know:

So, question is on the basics, why CLOB and why not BLOB always? Is there anything to do with encoding?

May be the question title should be, How CLOB handles the character data differently than a BLOB?

like image 619
CourseTriangle Avatar asked Oct 28 '22 16:10

CourseTriangle


1 Answers

I want to know how BLOB treats the character type data.

It doesn't treat it as character type data, it only see it as a stream of bytes - it doesn't know or care what it represents.

From the documentation:

The BLOB data type stores unstructured binary large objects. BLOB objects can be thought of as bitstreams with no character set semantics.


Does clob stores the conding information along with it and uses it while retrieving the data ?

Not explicitly, but the data is stored in the database character set, as with VARCHAR2 data. From the documentation again:

The CLOB data type stores single-byte and multibyte character data. Both fixed-width and variable-width character sets are supported, and both use the database character set.

You might also have noticed that the dbms_lob package has procedures to convert between CLOB and BLOB data types. For both of those you have to specify the character set to use. So if you choose to store character data as a BLOB you have to know the character set when converting it to a BLOB, but perhaps more crucially you have to know the character set to be able convert it back. You can do it, but it doesn't mean you should. You have no way to validate the BLOB data until you come to try to convert it to a string.

As @APC alluded to, this is similar to storing a date as a string - you lose the advantages and type-safety using the correct data type would give you, and instead add extra pain, uncertainty and overhead for no benefit.

The question isn't really what advantages CLOBs have over BLOBs for storing character data; the question is really the reverse: what advantages do BLOBs have over CLOBs for storing character data? And the answer is usually that there are none.

@Boneist mentions the recommendation to store JSON as BLOBs, and there is more about that here.

(The only other reasons I can think of off-hand are that you have to store data from multiple source character sets and want to preserve them exactly as you received them. But then either you are only storing them and will never examine or manipulate the data from within the database itself, and will only return them to some external application untouched; in which case you don't care about the character set - so you're handling purely binary data and shouldn't be thinking of it as character data at all, any more than you'd care that an image you're storing is PNG vs. JPG or whatever. Or you will need to work with the data and so will have to record which character set each BLOB object represents, so you can convert as needed.)

like image 175
Alex Poole Avatar answered Nov 01 '22 15:11

Alex Poole