After reading over my other question, Using a Relational Database for Schema-Less Data, I began to wonder if a filesystem is more appropriate than a relational database for storing and querying schemaless data.
Rather than just building a file system on top of MySQL, why not just save the data directly to the filesystem? Indexing needs to be figured out, but modern filesystems are very stable, have great features like replication, snapshot and backup facilities, and are flexible at storing schema-less data.
However, I can't find any examples of someone using a filesystem instead of a database.
Where can I find more resources on how to implement a schemaless (or "document-oriented") database as a layer on top of a filesystem? Is anyone using a modern filesystem as a schemaless database?
A database is generally used for storing related, structured data, with well defined data formats, in an efficient manner for insert, update and/or retrieval (depending on application). On the other hand, a file system is a more unstructured data store for storing arbitrary, probably unrelated data.
1. MongoDB. This open-source database powers many web and mobile applications. It allows for single-shard transactions with ACID guarantees.
Yes, a filesystem can be viewed as a special case of a NoSQL-like database system. It has some limitations that should be considered in any design decision:
pros: simple, intuitive.
things to think about:
richness of metadata - what types of data does it store, how does it let you query them, can you have hierarchical or multivalued attributes
speed of querying metadata - not all filesystems are particularly well optimized for querying anything other than size and dates.
inability to join queries (though that's pretty much common to NoSQL)
inefficient storage usage (unless the file system performs block suballocation, you'll typically blow 4-16K per item stored regardless of size)
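To put a rough number on the storage point, here is a small back-of-the-envelope sketch. It assumes a filesystem that allocates space in whole blocks with no suballocation and ignores inode and directory overhead; the function names and the 4096-byte default are illustrative assumptions, not measurements of any particular filesystem.

```python
def allocated_bytes(size, block_size=4096):
    """Bytes occupied by a file of `size` bytes when space is handed out
    in whole blocks (assumption: no block suballocation or inline data)."""
    if size == 0:
        return 0
    blocks = -(-size // block_size)  # ceiling division
    return blocks * block_size


def wasted_bytes(sizes, block_size=4096):
    """Total slack space across a collection of files with the given sizes."""
    return sum(allocated_bytes(s, block_size) - s for s in sizes)


if __name__ == "__main__":
    # Hypothetical workload: 1000 small documents of 200 bytes each.
    sizes = [200] * 1000
    print("payload bytes:", sum(sizes))
    print("wasted bytes: ", wasted_bytes(sizes))
```

With many tiny documents the slack can dwarf the payload, which is why packing several small records into one file is sometimes worth considering.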
I had the same idea more than 15 years ago, when hosting costs and hardware limitations were very different from today.
My main motivation was to design a cheap and simple solution able to withstand traffic spikes. Another goal was to improve the security of the applications by taking SQL attack vectors out of the equation.
I ended up with a simple document-oriented database, more like a wrapper around FS functions.
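For readers wondering what such a wrapper might look like, here is a minimal sketch in Python. It is not the code described in this answer: the `FileStore` class, its method names, and the one-JSON-file-per-document layout are all assumptions made for illustration. Writes go to a temporary file that is then renamed over the target, so a reader should never see a half-written document.

```python
import json
import os
import tempfile
from pathlib import Path


class FileStore:
    """Hypothetical minimal document store: one JSON file per document."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, collection, doc_id):
        # Reject ids that could escape the collection directory.
        # (Collection names are assumed to come from trusted code.)
        if not doc_id or "/" in doc_id or "\\" in doc_id or doc_id.startswith("."):
            raise ValueError(f"invalid document id: {doc_id!r}")
        return self.root / collection / f"{doc_id}.json"

    def put(self, collection, doc_id, document):
        path = self._path(collection, doc_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temp file in the same directory, then rename it over
        # the target so the document is replaced in a single step.
        fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                json.dump(document, tmp)
            os.replace(tmp_name, path)
        except BaseException:
            os.unlink(tmp_name)
            raise

    def get(self, collection, doc_id):
        try:
            with open(self._path(collection, doc_id), encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return None

    def delete(self, collection, doc_id):
        try:
            os.remove(self._path(collection, doc_id))
            return True
        except FileNotFoundError:
            return False

    def ids(self, collection):
        # Listing a collection is a directory scan; there is no index here.
        directory = self.root / collection
        if not directory.is_dir():
            return []
        return sorted(p.stem for p in directory.glob("*.json"))
```

A call such as `store.put("posts", "hello", {"title": "Hi"})` followed by `store.get("posts", "hello")` returns the same dictionary. Two writers updating the same document still overwrite each other (last rename wins), which matches the "few administrators, many readers" use case rather than a write-heavy one.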
What started as a personal project out of curiosity proved to be very rewarding in the long run. I will try to list both pros and cons.
PROS:
CONS:
My conclusion is that using the file system as a database works best for applications where the content is maintained by a limited number of administrators and concurrent writes are rarely a concern, but where you want as many cheap reads as possible. In those scenarios this idea can be quite a money saver.
Disclaimer: Please don't judge me too hard :) I'm a programmer with an old mindset of being more a creator than a user of out-of-the-box solutions. I lived through the times when programmers were doing a lot from scratch to fit their needs, including... operating systems. I believe personal experiments (including reinventing the wheel) are good learning opportunities for anybody.