
Has anyone worked with Aerospike? How does it compare to MongoDB? [closed]

Can anyone say if Aerospike is as good as they claim it to be? I'm a bit skeptical since it's a commercial enterprise. As far as I understand they just released an open source version, but the claims on their website could still be exaggerated.

I'm especially interested in how Aerospike compares to MongoDB.

Ole Spaarmann asked Aug 08 '14 17:08




2 Answers

Speed

Aerospike is faster. Almost any system will be quick with low load or simple data access but Aerospike has stayed consistently fast by optimizing for in-memory and SSD-based storage options. MongoDB is fast when it has lots of RAM for caching but is otherwise slow and has low write performance.

Reliability

Aerospike is very stable, although with simpler data access. MongoDB has historically been problematic with persisting data and failover but is much better now. Because Aerospike has better performance and easier management, it leads to fewer potential problems when scaling.

Setup/Configuration

Clustering with Aerospike is much easier to set up since all nodes are the same and the client drivers handle connections and failover automatically. MongoDB can be easier if you're setting up a single server as it runs on more platforms natively and you can start it without any configuration.

MongoDB has two major ways of clustering, replica sets (for availability) and sharding (for scalability). We had 5 shards and each shard had a replica-set of 3 servers. That's 15 servers to hold data. Then we had 3 config servers that maintained the cluster configuration and had to add 2 arbiter processes after our first major outage to deal with properly escalating a slave to master. That's a lot of moving pieces and also makes it incredibly hard to change your layout in the future.
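As a rough illustration of how quickly the pieces add up, here is a small tally of the layout described above. The variable names are made up for this sketch; the numbers are the ones quoted in the paragraph.

```python
# Tally of processes in the sharded MongoDB layout described above.
shards = 5
replicas_per_shard = 3   # each shard is a replica set of 3 servers
config_servers = 3
arbiters = 2             # added after the first major outage

data_servers = shards * replicas_per_shard
total_processes = data_servers + config_servers + arbiters

print(f"data servers: {data_servers}")
print(f"total processes to manage: {total_processes}")
```

Adding one more shard in this layout means three more data servers, which is part of why changing the layout later is hard.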

In contrast, Aerospike took much less effort but requires more configuration, most of which cannot be changed once the cluster has started, whereas with MongoDB you can create and alter databases anytime.

Aerospike does have the ability to sync multiple clusters (which is complicated to set up) so you can have different active datacenters replicating data and accepting writes, something that MongoDB doesn't really support at all.

Data Access

MongoDB has database/collection/document where each document is just JSON. Aerospike has namespace/set/record where each record is a collection of key-value "bins", which can then have nested key/value structures. Namespaces are pre-configured and are not dynamic, and names for properties are limited to 14 characters which is annoying to work with.
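The difference between the two models can be sketched in plain Python. This does not use either database's client library: the `Record` class, its `put_bin` method, and the sample data are hypothetical stand-ins, and the only figure taken from the text above is the 14-character limit on bin names.

```python
from dataclasses import dataclass, field

MAX_BIN_NAME_LEN = 14  # limit quoted in the paragraph above


@dataclass
class Record:
    """Hypothetical stand-in for an Aerospike record: a key plus named bins."""
    namespace: str
    set_name: str
    key: str
    bins: dict = field(default_factory=dict)

    def put_bin(self, name, value):
        # Reject bin names longer than the limit, as the real database would.
        if len(name) > MAX_BIN_NAME_LEN:
            raise ValueError(f"bin name {name!r} exceeds {MAX_BIN_NAME_LEN} characters")
        self.bins[name] = value


# MongoDB-style document: one JSON-like dict stored under database/collection.
document = {"_id": "user:42", "name": "Ada", "preferences": {"theme": "dark"}}

# Aerospike-style record: the same data split into bins under namespace/set.
record = Record("test", "users", "user:42")
record.put_bin("name", "Ada")
record.put_bin("preferences", {"theme": "dark"})  # bins can hold nested structures

try:
    record.put_bin("notification_settings", {})  # 21 characters, too long
    too_long_rejected = False
except ValueError:
    too_long_rejected = True

print(record.bins, too_long_rejected)
```

In practice the name limit tends to force abbreviations such as `notif_settings`, which is the annoyance mentioned above.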

Both have secondary indexes, although MongoDB lets you query immediately by anything while Aerospike requires index setup or custom scripting. Both have built-in aggregation frameworks. Aerospike clients support Lua scripting but MongoDB supports map-reduce and custom JavaScript functions.

It really depends on what your application needs, but MongoDB wins in flexibility, easier querying and fewer restrictions.

Cost

Both are now open-source and free. Both have enterprise versions with extra features, but licensing is expensive if you have lots of data. Aerospike might be cheaper since it requires fewer machines for the same performance.

Overall

For most scenarios, I would recommend Aerospike. The document-store semantics and flexibility of MongoDB are great but scaling and maintaining it as a distributed database is painful. Aerospike is fast and reliable and can run with fewer nodes that are easier to scale.


January 2016: MongoDB has released MongoDB Cloud Manager which is a paid SaaS service that can provision and manage your clusters. This solves a lot of the trouble with configuring Mongo.

March 2017: Both databases have come a long way. Aerospike now has faster replication and more flexible config settings without restarting the whole cluster. MongoDB has new schema enforcement, better performance and even supports joins along with MongoDB Atlas managed service to take away all the scaling issues.


I now highly recommend ScyllaDB which is a Cassandra compatible open-source database with incredible performance, multi-datacenter replication, and no limits on usage.

Mani Gandham answered Nov 25 '22 15:11



I have used Aerospike, MongoDB and Redis and have tested many other NoSQL databases. I would say Aerospike is very good at what it does but it is different than MongoDB. Everything depends on what you are planning on using a database for. I can give you an example of what I am using my different databases for. I can also go over the differences between them and discuss the benefits of Aerospike.

MongoDB

I am using MongoDB as a SQL alternative. In my MongoDB database I have many different fields. Oftentimes the fields change and I need to query on various fields ad hoc. It is a very unstructured database and MongoDB is amazing at that. I have also used MongoDB as a standard key-value store. It performs well but I have had MongoDB perform sub-optimally at both transaction scale and database size scale. Admittedly, the database might have been optimized a little better but I find it very hard to find documentation on configuring MongoDB correctly in different situations.

Redis

Redis is a pure key-value store. Redis' biggest problem is that it is purely in-memory (it will use disk as a backup but you cannot store more information than you have memory available). It is extremely fast for what it is used for. I personally use it for a small transactional database: I do very simple operations on keys, like counting how many times an event happened for a certain user. I also do quick in-memory lookups that I need mapped to different values. Redis is a great tool for a small dataset and it is extremely fast. Configuration is very easy as well.
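The per-user event counting mentioned above can be sketched without a running server. `CounterStore` below is a hypothetical in-memory stand-in whose `incr` mirrors the semantics of the Redis INCR command (a missing key counts as 0 and the new value is returned); the `events:<user>:<event>` key layout is an assumption for illustration, not something from this answer.

```python
class CounterStore:
    """In-memory stand-in for a key-value store with an INCR-style operation."""

    def __init__(self):
        self._data = {}

    def incr(self, key):
        # Like Redis INCR: treat a missing key as 0, add 1, return the new value.
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]

    def get(self, key):
        # Simplification: returns 0 for a missing key instead of a nil value.
        return self._data.get(key, 0)


def event_key(user_id, event):
    """Build one counter key per (user, event) pair. Hypothetical key layout."""
    return f"events:{user_id}:{event}"


store = CounterStore()
for _ in range(3):
    store.incr(event_key(17, "login"))
store.incr(event_key(17, "purchase"))

print(store.get(event_key(17, "login")))
print(store.get(event_key(17, "purchase")))
```

With a real Redis client the same pattern would be one INCR call per event, which is why this kind of workload stays fast as long as the keys fit in memory.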

Aerospike

I personally use Aerospike to replace Redis when it's time to scale. From my understanding, it can be used for more. Like Redis, Aerospike is a key-value store. I believe the open source edition also supports secondary indexes, which Redis does not (I have not used secondary indexes in production but have done a little testing on them).

Aerospike's best feature is its ability to scale. The biggest problem I needed to solve when looking into Aerospike was scaling my system to handle large data sets while remaining extremely fast. The project I use Aerospike for has very stringent requirements on speed. I usually make 3-4 database lookups plus other processing and need to have sub-50ms transaction times. A few look-ups are on data sets which are 300GB+. I could not find a solution to hold this data and make it accessible in a reasonable amount of time. Redis obviously won't work unless I had a machine which had 300GB+ of RAM. MongoDB started to perform extremely poorly at a size much lower than 300GB. So I gave Aerospike a shot, and it was able to handle everything very well. The best thing about Aerospike: as my data set has grown I have not had to do much more than standing up a new box when needed. The speed has stayed consistent.
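To see why consistent lookup speed matters under a budget like this, here is a back-of-the-envelope calculation. The 50 ms limit and the 4 lookups come from the paragraph above; the 10 ms for other processing and the assumption that lookups run one after another are illustrative guesses, and the function name is made up for this sketch.

```python
def per_lookup_budget_ms(total_budget_ms, lookups, other_processing_ms):
    """Time available for each database lookup, assuming sequential lookups."""
    remaining = total_budget_ms - other_processing_ms
    if remaining <= 0:
        raise ValueError("no time left for lookups")
    return remaining / lookups


# 50 ms and 4 lookups are from the text; 10 ms of other processing is assumed.
budget = per_lookup_budget_ms(total_budget_ms=50, lookups=4, other_processing_ms=10)
print(f"{budget:.1f} ms per lookup")
```

Under these assumptions each lookup has about 10 ms, so a store whose latency drifts upward as the data set grows would break the budget quickly.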

I also find Aerospike's documentation very good. It isn't too hard to configure and it's pretty easy to find answers for any issue that comes up.

Conclusion

So, is Aerospike as good as they claim? Personally, I have seen nothing less than what has been claimed. I haven't had to scale to 1 million TPS but I do believe that would be possible with enough hardware. I also believe the numbers showing a speed difference between Aerospike and MongoDB. Aerospike is a much more "configured" and "planned out" database than MongoDB. Because of this Aerospike will be much faster at scale than MongoDB. It only has to worry about a single index (or, in the case of secondary indexes, a few hundred), unlike MongoDB, where indexes can change dynamically. The question you really need to be asking is what you are trying to accomplish with your database. Then look into which database will fit your needs best. If you need a scalable, fast, key-value store database I would say Aerospike is probably the best out there.

Let me know if you have any specific questions or need anything clarified. I would probably be able to help you out.

Eumcoz answered Nov 25 '22 17:11
