Storing media files in Cassandra

Tags: cassandra

I tried to store audio/video files in the database.

Is Cassandra able to do that? If yes, how do we store media files in Cassandra?

How about storing both the metadata and the original audio files in Cassandra?

nsk asked Nov 07 '17 15:11



1 Answer

Yes, Cassandra can definitely store files in its database, as "blobs" (strings of bytes).

However, it is not ideal for this use case:

First, you are limited in blob size. The hard limit is 2 GB, so large videos are out of the question. Worse, the documentation from DataStax (the commercial company behind Cassandra's development) suggests that even 1 MB (!) is too large; see https://docs.datastax.com/en/cql/3.1/cql/cql_reference/blob_r.html

One of the reasons why huge blobs are a problem is that Cassandra offers no API for fetching parts of them. You need to read (and write) a blob in one CQL operation, so the whole value has to be held in memory at once on both the server and the client. So if you want to store large files in Cassandra, you'll probably want to split them up into many small blobs, not one large blob.
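To make the "many small blobs" idea concrete, here is a minimal Python sketch of the splitting logic only. It does not talk to Cassandra. The table layout in the comment, the 512 KB chunk size, and all function names are my own assumptions for illustration, not something Cassandra prescribes.

```python
from typing import Iterable, Iterator, Tuple

# Assumed chunk size, chosen to stay below the ~1 MB guidance mentioned above.
CHUNK_SIZE = 512 * 1024

# Hypothetical table the chunks could be written to, one row per chunk:
#   CREATE TABLE file_chunks (
#       file_id uuid, chunk_index int, data blob,
#       PRIMARY KEY (file_id, chunk_index));


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_index, chunk_bytes) pairs covering data in order."""
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        yield index, data[offset:offset + chunk_size]


def chunk_indexes_for_range(start: int, length: int, chunk_size: int = CHUNK_SIZE) -> range:
    """Return the chunk indexes needed to serve bytes [start, start + length)."""
    if length <= 0:
        return range(0)
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    return range(first, last + 1)


def reassemble(chunks: Iterable[Tuple[int, bytes]]) -> bytes:
    """Rebuild the original bytes from (chunk_index, chunk_bytes) pairs in any order."""
    return b"".join(chunk for _, chunk in sorted(chunks))
```

With a layout like this, a reader that only needs part of a file (for example, seeking within a video) can fetch just the rows whose chunk_index falls in the range returned by chunk_indexes_for_range, instead of the whole file.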

The next problem is that some of Cassandra's implementation is inefficient when the database contains files (even if split up into a bunch of smaller blobs). One of the problems is the compaction algorithm, which ends up copying all the data over and over (a logarithmic number of times) on disk. An implementation optimized for storing files would keep the file data and the metadata separately, and only "compact" the metadata. Unfortunately, neither Cassandra nor Scylla implements such a file format yet.

All in all, you're probably better off storing your metadata in Cassandra and the actual file content in a separate object store.
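As a rough illustration of that split, the sketch below keeps a small metadata record per file and stores the bytes elsewhere under a key recorded in the metadata. Everything here is hypothetical: the two dicts are in-memory stand-ins for a Cassandra table and an object store, and the field names and key format are assumptions, not any real API.

```python
import hashlib
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MediaMetadata:
    """Small record that would live in Cassandra (hypothetical schema)."""
    file_id: uuid.UUID
    filename: str
    content_type: str
    size_bytes: int
    sha256: str
    object_key: str  # where the bytes live in the object store


# In-memory stand-ins for the two systems (assumptions for this sketch).
object_store: Dict[str, bytes] = {}
metadata_table: Dict[uuid.UUID, MediaMetadata] = {}


def save_media(filename: str, content_type: str, content: bytes) -> MediaMetadata:
    """Write the bytes to the object store, then record the metadata."""
    file_id = uuid.uuid4()
    object_key = f"media/{file_id}"
    object_store[object_key] = content
    meta = MediaMetadata(
        file_id=file_id,
        filename=filename,
        content_type=content_type,
        size_bytes=len(content),
        sha256=hashlib.sha256(content).hexdigest(),
        object_key=object_key,
    )
    metadata_table[file_id] = meta
    return meta


def load_media(file_id: uuid.UUID) -> bytes:
    """Look up the metadata, fetch the bytes, and verify the checksum."""
    meta = metadata_table[file_id]
    content = object_store[meta.object_key]
    if hashlib.sha256(content).hexdigest() != meta.sha256:
        raise ValueError("stored content does not match recorded checksum")
    return content
```

Writing the content first and the metadata second means a failed upload leaves at most an orphaned object, never a metadata row that points at nothing.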

Nadav Har'El answered Sep 18 '22 15:09