Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When should I consider using a in memory database and what are the issue to look out for?

I was just think that now it is common to have enough RAM on your database server to cache your complete database why are the specialist in memory database (e.g TimesTen, see also Wikipedia page) that were all the rage a few years ago not being used more?

It seems to be that as time go on, none disk based databases are being used less, e.g most applications are now built on conventional rational databases. I would have expected the opposite as RAM is getting close to being free for a lot of servers.

I am asking this, as I just read up on the stack-overflow-architecture and the page says

This is significant because Stack Overflow's database is almost completely in RAM and the joins still exact too high a cost.

But I don’t think this would be a problem if “pointers” and “collections” were used instead of the normal btree. Btree are a very clever to get round limits on disk access speed, e.g they trade CPU useage to reduce disk usage. However we now have so match ram.

But we still need database, as doing your own

  • Locking
  • Deadlock detection
  • Transaction logging
  • Recovering
  • Etc

Is very hard.

@S.Lott, Given we all spend so long choosing indexes, avoiding joins and investigating database performance problems. There must be a better way. A few years ago we were told the “in memory databases” was the better way. So before I jump into using one etc, I wish to know why other people are not using them more.

(I am unlikely to use TimesTen myself, as it is high priced ($41,500.00 / Processor) and I don’t like talking to Oracle sales people - I rather spend my time writing code.)

See also:

  • Alternative to the TimesTen in memory database
  • Has anyone published a detailed comparison between different in-memory RDBMSs?

Update:

I asked this question a LONG time ago, these days Microsoft SQL Server have "In-Memory OLTP" that is a memory-optimized database engine integrated into the SQL Server engine. It is not cheap, but seems to be very fast for some workloads.

like image 539
Ian Ringrose Avatar asked Oct 20 '09 10:10

Ian Ringrose


People also ask

When should you use in-memory database?

The main use case for in-memory databases is when real-time data is needed. With its very low latency, RAM can provide near-instantaneous access to the needed data. Because of the potential data losses, in-memory databases without a persistence mechanism should not be used for mission-critical applications.

Why would you use an in-memory database?

An in-memory database stores data close to the CPU to accelerate processing and provide you with real-time insights, without compromising performance. An in-memory database stores data close to the CPU to accelerate processing and provide you with real-time insights, without compromising performance.


1 Answers

Nobody really answered the question "When should I consider using a in memory database and what are the issue to look out for?" so I'll give it a go.

You should consider an in-memory database if: 1. The target system has data to manage, but no persistent media 2. The performance requirement simply cannot be met with a persistent database

For #1, think of the TV Guide in your set-top box (STB). Low-end STB (i.e. those with no DVR capability) have no persistent storage, and need no persistent storage. But the database for a 400-channel, 14-day TV Guide is non-trivial. There's a performance requirement here, too, because data arrives from the transponder carousel at a high speed and it's a case of 'capture it or wait until the carousel comes around again'. But there's no need for persistence. We've all seen this; when you lose power at your home, when it comes back on the TV Guide says "will be available shortly" because it's provisioning itself from the transponder or cable head-end. Network routers share the same characteristics: no persistent storage, need to be fast, and the database can be provisioned from an external source (peer routers on the network, in this case, to repopulate the routing table).

There are endless examples of #2: Real-time targetting in military systems, high-frequency trading systems, and more.

Regarding the second part of the question, "issue to watch out for": There are many.

Make sure you're evaluating a true in-memory database if you need the performance that only an in-memory database can deliver. Caching a persistent database is not the same. Throwing a persistent database in a RAM-drive is not the same. Using an in-memory database that inherently does transaction logging (like TimesTen) is not the same (even if you log to /dev/null).

Make sure you're evaluating a database system, and not merely a cache (e.g. memcache). A database system will have support for transactions with the ACID properties, multiple indexing options, support concurrent access, and more.

About ACID: in-memory database systems do not lack the 'D' (durability). It simply has to be taken in context. Transactions in a persistent database are durable only so long as the media it's stored on is durable. The same thing is true for in-memory databases. In either case, if you care about durability, you better have a backup.

like image 58
Steven Graves Avatar answered Oct 18 '22 15:10

Steven Graves