
How is a Document based DB so fast?

I just want to understand this better. What I have learned over the years is that a document-based solution is slow and requires a lot of I/O. For example, in a PHP project it is generally said to be much better to use a memory cache like Redis, Memcached, or APC, because they are memory-based, than to cache data to an actual FILE.

Now all these NoSQL DBs have arrived, and I read about how they are so much faster than MySQL and others, and they are document-based. Can someone help me understand this theory? If each record is a Document (FILE), then how is it so good on performance? I recently read about a guy who was using Redis in a project and said he switched to MongoDB and is getting better results than he did with Redis. (I realize I am comparing a cache to a DB, but that is not the real question. I want to know how a document-based solution is faster than non-document-based solutions.)

CodeDevelopr asked Jan 11 '12 09:01


2 Answers

Document-based doesn't necessarily mean that the data is stored entirely on the file system. Some parts, such as an index, can still be held in memory.

Document-based only means that the database stores data in packages (like sheets of paper, where every sheet is a dataset and you can write freely on it) instead of in a very specific structure like a table.

http://en.wikipedia.org/wiki/Document-oriented_database
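
To make the "sheets of paper" idea concrete, here is a minimal sketch in Python. It is not how any particular database is implemented; the sample documents, the `_id` field, and the `find_by_name` helper are all made up for illustration. It only shows that documents in one collection do not have to share a structure, and that a small index held in memory can answer a lookup without touching every document.

```python
# Two "documents" in the same collection. They do not share a fixed schema:
# the second one has fields the first one lacks, including a nested object.
documents = [
    {"_id": 1, "name": "Alice", "email": "alice@example.com"},
    {"_id": 2, "name": "Bob", "tags": ["admin", "ops"], "address": {"city": "Berlin"}},
]

# Hypothetical in-memory structures: a primary lookup by id and a
# secondary index on "name". Both live in RAM, not in per-record files.
by_id = {doc["_id"]: doc for doc in documents}
index_by_name = {doc["name"]: doc["_id"] for doc in documents}

def find_by_name(name):
    """Resolve a name through the in-memory index, then fetch the document."""
    doc_id = index_by_name.get(name)
    return by_id.get(doc_id)

print(find_by_name("Bob"))
```

A real document database adds persistence, caching, and much more on top, so treat this only as a picture of the data model.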

Ah, and why they can be faster than Redis:
Let's say you need to store some non-uniform information in a set (i.e. not every dataset looks the same and you have different data types in one set). In Redis you essentially store key-value pairs, so you need to link them back together into a set in your own code/implementation. In a document database this is handled for you by the database, in a (probably) much more optimized way :)
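
A rough sketch of that difference, using plain Python dicts as stand-ins (these are not the real Redis or MongoDB client APIs, and the key naming scheme and function names are invented for this example). In the flat key-value version the application splits a record across several keys and has to reassemble it; in the document version the record is stored and read back as one unit.

```python
# Hypothetical stand-ins: plain dicts, not real database clients.

def kv_save_user(kv, user_id, name, emails):
    """Flat key-value style: one string per key, linked by a naming scheme."""
    kv[f"user:{user_id}:name"] = name
    kv[f"user:{user_id}:email_count"] = str(len(emails))
    for i, email in enumerate(emails):
        kv[f"user:{user_id}:email:{i}"] = email

def kv_load_user(kv, user_id):
    """The application has to stitch the record back together itself."""
    count = int(kv[f"user:{user_id}:email_count"])
    return {
        "name": kv[f"user:{user_id}:name"],
        "emails": [kv[f"user:{user_id}:email:{i}"] for i in range(count)],
    }

def doc_save_user(docs, user_id, user):
    """Document style: the whole nested record is stored under one id."""
    docs[user_id] = user

def doc_load_user(docs, user_id):
    return docs[user_id]

kv_store, doc_store = {}, {}
user = {"name": "Alice", "emails": ["a@example.com", "alice@example.org"]}

kv_save_user(kv_store, 1, user["name"], user["emails"])
doc_save_user(doc_store, 1, user)

print(len(kv_store), "keys vs", len(doc_store), "document")
```

This says nothing about raw speed on its own. It only shows where the linking work ends up: in your code, or inside the database.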

bardiir answered Oct 17 '22 14:10


NoSQL terminology is prone to misunderstandings, because some of the concepts use names that mean something different from their traditional meaning:

  • Document-based doesn't (necessarily) mean that the datastore writes each record to a file. It means that records in the datastore do not have to conform to a predefined schema of fields of a certain data type. Think of a "document" as something like XML, JSON, or friends.
  • The performance wins of (most) NoSQL datastores come at a price: typically, the well-understood ACID guarantees are traded for a looser consistency model.
  • The power of relational SQL databases comes in large part from the fact that almost any query can be written against an existing schema. This is not always true with NoSQL datastores: in the most extreme case, access to a record is possible only via a record ID.
  • Most NoSQL datastores will scale much better than a typical relational database. They are the answer to the question "What do we have to sacrifice from a well-understood relational DB to overcome the scaling limits?"
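
The point about access via record ID can be pictured with a small Python sketch. The `records` data and both helper functions are hypothetical, and a dict stands in for the datastore. Fetching by ID is a direct lookup, while asking a question about any other field means scanning every record yourself (or maintaining your own index), which is exactly the work a relational query planner would otherwise do for you.

```python
# A dict keyed by record ID, standing in for an ID-only datastore.
records = {
    "u1": {"name": "Alice", "city": "Berlin"},
    "u2": {"name": "Bob", "city": "Paris"},
    "u3": {"name": "Carol", "city": "Berlin"},
}

def get_by_id(store, record_id):
    """Direct access: the only operation an ID-only store offers."""
    return store.get(record_id)

def find_by_field(store, field, value):
    """Anything else is a full scan done by the application."""
    return sorted(rid for rid, rec in store.items() if rec.get(field) == value)

print(get_by_id(records, "u2"))
print(find_by_field(records, "city", "Berlin"))
```
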
Eugen Rieck answered Oct 17 '22 15:10