Is there something like Redis DB, but not limited with RAM size? [closed]

Tags: database, redis, nosql, bigdata

I'm looking for a database matching these criteria:

May be non-persistent
Almost all keys need to be updated once every 3-6 hours (100M+ keys, about 100 GB in total)
Ability to quickly select data by key (or primary key)
This needs to be a DBMS (so LevelDB doesn't fit)
While data is being written, the DB cluster must still be able to serve queries (individual nodes may be blocked, though)
Not in-memory – our dataset will exceed the RAM limits
Horizontal scaling and replication
Support for a full rewrite of all data (MongoDB doesn't reclaim space after data is deleted)
C# and Java support

Here's how I would work with such a database: We've got an analytics cluster that produces 100M records (50 GB) of data every 4-6 hours. Each record has the form "key - array[20]". This data needs to be served to users through a front-end system at a rate of 1-10k requests per second. On average, only ~15% of the data is ever requested; the rest is rewritten 4-6 hours later, when the next data set is generated.
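To make the load concrete, here is a rough back-of-the-envelope calculation based only on the figures above. It assumes decimal gigabytes, a full rewrite spread evenly over the 4-6 hour window, and records of roughly uniform size; none of these assumptions are stated in the question, so treat the results as order-of-magnitude estimates.

```python
# Back-of-the-envelope sizing from the figures in the question.
RECORDS = 100_000_000          # records per generated data set
DATASET_BYTES = 50 * 10**9     # 50 GB per data set (decimal gigabytes assumed)
HOT_FRACTION = 0.15            # share of records that is actually requested


def load_profile(window_hours: float) -> dict:
    """Average rates needed to finish a full rewrite within the window."""
    seconds = window_hours * 3600
    return {
        "writes_per_sec": RECORDS / seconds,
        "mb_per_sec": DATASET_BYTES / seconds / 10**6,
    }


avg_record_bytes = DATASET_BYTES / RECORDS           # 500 bytes per record
# Hot set size, assuming requested records are no larger than average.
hot_set_gb = DATASET_BYTES * HOT_FRACTION / 10**9    # about 7.5 GB

for hours in (4, 6):
    profile = load_profile(hours)
    print(f"{hours} h window: {profile['writes_per_sec']:,.0f} writes/s, "
          f"{profile['mb_per_sec']:.2f} MB/s")
print(f"average record: {avg_record_bytes:.0f} bytes, hot set: {hot_set_gb:.1f} GB")
```

Under these assumptions the sustained write load (roughly 4,600 to 6,900 writes per second) is of the same order as the read load of 1-10k requests per second, and the part of the data that is actually read is far smaller than the full data set.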

What I tried:

MongoDB. Data storage overhead, high defragmentation costs.
Redis. Looks perfect, but it's limited by RAM and our data exceeds it.

So the question is: is there anything like Redis, but not limited by RAM size?


asked Aug 26 '13 15:08

Andrey

2 Answers

Yes, there are two alternatives to Redis that are not limited by RAM size and remain compatible with the Redis protocol:

Ardb (C++), replication (master-slave/master-master): https://github.com/yinqiwen/ardb

A redis-protocol compatible persistent storage server, support LevelDB/KyotoCabinet/LMDB as storage engine.

Edis (Erlang): https://github.com/cbd/edis

Edis is a protocol-compatible Server replacement for Redis, written in Erlang. Edis's goal is to be a drop-in replacement for Redis when persistence is more important than holding the dataset in-memory. Edis (currently) uses Google's leveldb as a backend.
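"Protocol-compatible" means these servers accept the same wire format (RESP) that Redis clients already send, so an existing Redis client library should be able to talk to them. The sketch below encodes commands by hand to show what goes over the wire. The helper names `encode_command` and `send_command` are made up for this illustration, and the host and port passed to `send_command` depend on your own deployment; in practice you would use a regular Redis client library rather than raw sockets.

```python
import socket


def encode_command(*args) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def send_command(host: str, port: int, *args) -> bytes:
    """Send one command and return the first chunk of the raw reply.

    host and port are placeholders for wherever a Redis-protocol
    server happens to be listening.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(encode_command(*args))
        return sock.recv(4096)


# The same bytes would be sent to Redis or to any protocol-compatible server.
print(encode_command("GET", "user:42"))  # b'*2\r\n$3\r\nGET\r\n$7\r\nuser:42\r\n'
```

Which commands a given server actually implements is a separate question from speaking the protocol, so it is worth checking each project's documentation for the commands you rely on.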

And for completeness here is another data-structures database:

Hyperdex (Strings, Integers, Floats, Lists, Sets, Maps): http://hyperdex.org/doc/latest/DataTypes/#chap:data-types

HyperDex is:

Fast: HyperDex has lower latency, higher throughput, and lower variance than other key-value stores.

Scalable: HyperDex scales as more machines are added to the system.

Consistent: HyperDex guarantees linearizability for key-based operations. Thus, a read always returns the latest value inserted into the system. Not just “eventually,” but immediately and always.

Fault Tolerant: HyperDex automatically replicates data on multiple machines so that concurrent failures, up to an application-determined limit, will not cause data loss.

Searchable: HyperDex enables efficient lookups of secondary data attributes.

Easy-to-Use: HyperDex provides APIs for a variety of scripting and native languages.

Self-Maintaining: A HyperDex is self-maintaining and requires little user maintenance.


answered Sep 21 '22 18:09

FGRibreau

Yes: SSDB (https://github.com/ideawu/ssdb). Its API is very similar to Redis: http://www.ideawu.com/ssdb/docs/php/

SSDB supports the hash and zset data types. It uses LevelDB as its storage engine: most data is stored on disk, and RAM is used as a cache. Our SSDB instance holds 300 GB of data and uses only 800 MB of RAM.
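The "disk for storage, RAM for cache" idea can be illustrated with a small sketch. This is not SSDB's implementation: Python's standard `dbm` module stands in for LevelDB, and the class name `CachedDiskStore` and its `cache_size` parameter are invented for the example.

```python
import dbm
import os
import tempfile
from collections import OrderedDict


class CachedDiskStore:
    """Key-value store that keeps all data on disk and a bounded LRU cache in RAM."""

    def __init__(self, path: str, cache_size: int = 1000):
        self._db = dbm.open(path, "c")   # on-disk storage (stand-in for LevelDB)
        self._cache = OrderedDict()      # in-memory LRU cache
        self._cache_size = cache_size

    def set(self, key: bytes, value: bytes) -> None:
        self._db[key] = value            # every write goes to disk
        self._remember(key, value)

    def get(self, key: bytes):
        if key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        value = self._db.get(key)         # cache miss: read from disk
        if value is not None:
            self._remember(key, value)
        return value

    def _remember(self, key: bytes, value: bytes) -> None:
        self._cache[key] = value
        self._cache.move_to_end(key)
        while len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)  # evict the least recently used entry

    def cached_keys(self) -> list:
        return list(self._cache)

    def close(self) -> None:
        self._db.close()


with tempfile.TemporaryDirectory() as tmp:
    store = CachedDiskStore(os.path.join(tmp, "demo"), cache_size=2)
    for i in range(5):
        store.set(b"key:%d" % i, b"value-%d" % i)
    print(store.get(b"key:0"))     # no longer cached, so it is read back from disk
    print(store.cached_keys())     # only two keys are held in RAM at any time
    store.close()
```

With an access pattern like the one in the question, where only a small share of keys is read, a cache much smaller than the data set can still serve most reads from memory.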

answered Sep 18 '22 18:09

ideawu
