I am trying to better understand the design choices involved in defining table columns in Cassandra, and in particular when the blob type is a good choice.
I realized I didn't really know when to choose blob as a data type because I wasn't sure what a blob really is (or what the acronym stands for). So I read the following documentation for the blob data type:
http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/blob_r.html
Blob
Cassandra 1.2.3 still supports blobs as string constants for input (to allow smoother transition to blob constant). Blobs as strings are now deprecated and will not be supported in the near future. If you were using strings as blobs, update your client code to switch to blob constants. A blob constant is a hexadecimal number defined by 0[xX](hex)+ where hex is a hexadecimal character, such as [0-9a-fA-F]. For example, 0xcafe.
Blob conversion functions
A number of functions convert the native types into binary data (blob). For every <native-type> nonblob type supported by CQL3, the typeAsBlob function takes an argument of type type and returns it as a blob. Conversely, the blobAsType function takes a 64-bit blob argument and converts it to a bigint value. For example, bigintAsBlob(3) is 0x0000000000000003 and blobAsBigint(0x0000000000000003) is 3.
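To make the quoted description concrete, here is a minimal Python sketch that mimics what a blob constant and the bigint conversion functions represent. It is an illustration only, not Cassandra's implementation: the function names are made up for this example, and the sketch assumes that a bigint is encoded as 8 bytes, big-endian, two's complement, and that a blob constant has an even number of hex digits.

```python
import re

# Pattern for a blob constant: "0x" or "0X" followed by hex digits.
# Requiring whole bytes (pairs of digits) is an assumption of this sketch.
BLOB_CONSTANT = re.compile(r"0[xX]((?:[0-9a-fA-F]{2})+)")


def parse_blob_constant(literal: str) -> bytes:
    """Turn a literal such as '0xcafe' into the raw bytes it denotes."""
    match = BLOB_CONSTANT.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a blob constant: {literal!r}")
    return bytes.fromhex(match.group(1))


def format_blob(blob: bytes) -> str:
    """Render raw bytes the way cqlsh displays a blob (0x + lowercase hex)."""
    return "0x" + blob.hex()


def bigint_as_blob(value: int) -> bytes:
    """Emulate bigintAsBlob: 64-bit signed integer -> 8 big-endian bytes."""
    return value.to_bytes(8, byteorder="big", signed=True)


def blob_as_bigint(blob: bytes) -> int:
    """Emulate blobAsBigint: exactly 8 bytes -> 64-bit signed integer."""
    if len(blob) != 8:
        raise ValueError("a bigint blob must be exactly 8 bytes (64 bits)")
    return int.from_bytes(blob, byteorder="big", signed=True)


print(format_blob(bigint_as_blob(3)))  # 0x0000000000000003
print(blob_as_bigint(parse_blob_constant("0x0000000000000003")))  # 3
```

In other words, a blob is just a sequence of bytes; the hexadecimal form is only how those bytes are written in CQL and displayed by cqlsh.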
What I got out of it is that a blob is just a long hexadecimal/binary value. However, I don't really see when I would use it as a column type for a table, or how it is better or worse than other types. Going through some of its properties might also be a good way to figure out which situations blobs are good for.
Blobs (Binary Large OBjectS) are the solution for when your data doesn't fit into the standard types provided by C*. For example, say you wanted to make a forum where users were allowed to upload files of any type. To store these in C* you would use a blob column (or possibly several blob columns, since you don't want individual cells to become too large).
Another example might be a table where users are allowed to have a current photo; this photo could be added as a blob and stored along with the rest of the user information.
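For the file-upload example, one way to keep individual cells small is to split the file into fixed-size pieces on the client and store each piece in its own blob cell. The sketch below only shows the splitting and reassembly; the function names and the 1 MiB chunk size are assumptions made for illustration, not values recommended by the Cassandra documentation.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical chunk size; pick a value that suits your cluster.
CHUNK_SIZE = 1024 * 1024


def split_into_chunks(
    payload: bytes, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk_bytes) pairs covering the whole payload."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for index, offset in enumerate(range(0, len(payload), chunk_size)):
        yield index, payload[offset:offset + chunk_size]


def join_chunks(chunks: Iterable[Tuple[int, bytes]]) -> bytes:
    """Reassemble the payload from chunks, in any order, by chunk index."""
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(data for _, data in ordered)


payload = bytes(range(256)) * 10           # 2560 bytes of sample data
chunks = list(split_into_chunks(payload, chunk_size=1000))
print([len(data) for _, data in chunks])   # [1000, 1000, 560]
print(join_chunks(chunks) == payload)      # True
```

Each (chunk_index, chunk_bytes) pair could then be written as one row, for example with the file identifier as the partition key, the chunk index as a clustering column, and the chunk bytes in a blob column.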
According to the 3.x documentation, the blob type is suitable for storing a small image or short string.
In my case I used it to store a hashed value: a hash function returns binary data, and storing it as binary is the best option in terms of table data size. (Converting it to a string and storing it as text would also be fine if size is not a concern.)
The results below show my test on a local machine (inserting 1 million records): the sizes are 52,626,907 bytes (binary) and 72,879,839 bytes (base64-encoded data stored as text).
CREATE TABLE IF NOT EXISTS testks.bin_data (
bin_data blob,
PRIMARY KEY(bin_data)
);
CREATE TABLE IF NOT EXISTS testks.base64_data (
base64_data text,
PRIMARY KEY(base64_data)
);
cqlsh> select * from testks.base64_data limit 10;
base64_data
------------------------------
W0umEPMzL5O81v+tTZZPKZEWpkI=
bGUzPm4zRvcqK1ogwTvPNPNImvk=
Nsr0GKx6LjXaiZSwATU38Ffo7fA=
A6lBV69DbFz/UFWbxolb+dlLcLc=
R919DvcyqBUup+NrpRyRvzJD+/E=
63LEynDKE5RoEDd1M0VAnPPUtIg=
FPkOW9+iPytFfhjdvoqAzbBfcXo=
uHvtEpVIkKivS130djPO2f34WSM=
fzEVf6a5zk/2UEIU8r8bZDHDuEg=
fiV4iKgjuIjcAUmwGmNiy9Y8xzA=
(10 rows)
cqlsh> select * from testks.bin_data limit 10;
bin_data
--------------------------------------------
0xb2af015062e9aba22be2ab0719ddd235a5c0710f
0xb1348fa7353e44a49a3363822457403761a02ba8
0x4b3ecfe764cbb0ba1e86965576d584e6e616b03e
0x4825ef7efb86bbfd8318fa0b0ac80eaa2ece9ced
0x37bdad7db721d040dcc0b399f6f81c7fd2b5cea6
0x3de4ca634e3a053a1b0ede56641396141a75c965
0x596ec12d9d9afeb5b1b0bb42e42ad01b84302811
0xbf51709a8d1a449e1eea09ef8a45bdd2f732e8ec
0x67dcb3b6e58d8a13fcdc6cf0b5c1e7f71b416df6
0x7e6537033037cc5c028bc7c03781882504bdbd65
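The size difference follows from the encodings themselves. The values shown above are 20 bytes long (40 hex digits, or 28 base64 characters), which is the length of a SHA-1 digest; the sketch below assumes SHA-1 for that reason, although the hash function actually used in the test is not stated. The function name is made up for this example.

```python
import base64
import hashlib


def hash_encodings(message: bytes):
    """Return a SHA-1 digest as raw bytes, base64 text, and hex text."""
    digest = hashlib.sha1(message).digest()
    as_base64 = base64.b64encode(digest).decode("ascii")
    as_hex = digest.hex()
    return digest, as_base64, as_hex


raw, b64, hexed = hash_encodings(b"example")
print(len(raw))    # 20 bytes when stored as a blob
print(len(b64))    # 28 characters when stored as base64 text
print(len(hexed))  # 40 characters when stored as hex text
```

Per value, the base64 text is 40% longer than the raw bytes, and hex text would be twice as long. The on-disk totals also include per-row overhead, so they do not scale by exactly the same factor.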