
MongoDB on EC2 server or AWS SimpleDB?

Which scenario makes more sense: hosting several EC2 instances with MongoDB installed, or using the Amazon SimpleDB web service instead?

With several EC2 instances running MongoDB, I have the problem of setting up the instances myself.

With SimpleDB, I have the problem of locking myself into Amazon's data structure, right?

What differences are there development-wise? Shouldn't I be able to just switch the DAO of my service layer to write to either MongoDB or AWS SimpleDB?

Sebastian Hoitz asked Aug 02 '10 20:08




1 Answer

SimpleDB has some scalability limitations. You can only scale it by sharding, and you have to do that sharding manually. It has higher latency than MongoDB or Cassandra, it has a throughput limit, and it is priced higher than the other options.
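To make "you have to shard" concrete, here is a minimal sketch of client-side sharding, assuming the data is split across several SimpleDB domains. The domain names and the `domain_for` helper are made up for illustration; they are not part of any AWS library, and the actual SimpleDB calls are left out.

```python
import hashlib

# Hypothetical domain names. With manual sharding you create these yourself
# and your own code decides which one each item goes to.
DOMAINS = ["users_0", "users_1", "users_2", "users_3"]


def domain_for(item_name: str, domains=DOMAINS) -> str:
    """Pick a domain for an item by hashing its name.

    md5 is used only as a stable hash here (Python's built-in hash() is
    randomized per process for strings), not for security.
    """
    digest = hashlib.md5(item_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(domains)
    return domains[index]


if __name__ == "__main__":
    for name in ("alice", "bob", "carol"):
        print(name, "->", domain_for(name))
```

Every read and write has to go through the same function so an item is always looked up in the domain it was stored in. With a plain modulo like this, changing the number of domains moves most items to a different domain, which is a large part of why manual sharding is painful.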

If you need richer query options, have a high read rate, and don't have that much data, MongoDB is the better choice. For durability, however, you need at least two MongoDB server instances running as master/slave; otherwise you can lose the last minute of your data. Scaling is manual. It's much faster than SimpleDB. Auto-sharding is implemented in version 1.6.

Cassandra has weak query options but is as durable as PostgreSQL. It is as fast as MongoDB, and faster on larger data sizes. Write operations are faster than read operations on Cassandra. It can scale automatically by firing up EC2 instances, but you have to modify the config files a bit (if I remember correctly). If you have terabytes of data, Cassandra is your best bet. There is no need to shard your data; it was designed to be distributed from day one. You can keep any number of copies of all your data, and if some servers die it will automatically return results from the live ones and redistribute the dead server's data to the others. It's highly fault tolerant. You can add any number of instances, and it's much easier to scale than the other options. It has strong .NET and Java client options, which offer connection pooling, load balancing, marking of dead servers, and so on.

Another option for big data is Hadoop, but it's not as real-time as the others; you can use Hadoop for data warehousing. Neither Cassandra nor MongoDB has transactions, so if you need transactions, PostgreSQL is a better fit. Another option is Amazon RDS, but its performance is bad and its price is high. If you want to use a relational database or SimpleDB, you may also need data caching (e.g. memcached).

For web apps, if your data is small I recommend MongoDB; if it is large, Cassandra is better. You don't need a caching layer with MongoDB or Cassandra, as they are already fast. I don't recommend SimpleDB; it also locks you into Amazon, as you said.

If you are using C#, Java or Scala, you can write an interface for your data access layer and implement it for MongoDB, MySQL, Cassandra or anything else. It's simpler in dynamic languages (e.g. Ruby, Python, PHP). You can write a provider for two of them if you want, and you can switch the storage, maybe even at runtime, with only a configuration change; all of that is possible. Development with MongoDB, Cassandra and SimpleDB is easier than with a relational database, and they are schema-free, though it also depends on the client library/connector you're using. The simplest one is MongoDB. There's only one index per table in Cassandra, so you have to manage other indexes yourself, but as far as I know secondary indexes will be possible with the 0.7 release of Cassandra. You can also start with any of them and replace it in the future if you have to.
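As a rough illustration of the interface idea, here is a minimal Python sketch. `UserStore`, `InMemoryUserStore`, `BACKENDS` and `make_store` are made-up names for this example; a real MongoDB, Cassandra or SimpleDB implementation would subclass the same interface using whatever client library you pick, and would be registered under its own key.

```python
from abc import ABC, abstractmethod
from typing import Optional


class UserStore(ABC):
    """Hypothetical DAO interface; the service layer only depends on this."""

    @abstractmethod
    def save(self, user_id: str, data: dict) -> None:
        ...

    @abstractmethod
    def load(self, user_id: str) -> Optional[dict]:
        ...


class InMemoryUserStore(UserStore):
    """Stand-in backend so the sketch runs without any database."""

    def __init__(self) -> None:
        self._items = {}

    def save(self, user_id: str, data: dict) -> None:
        self._items[user_id] = dict(data)  # store a copy

    def load(self, user_id: str) -> Optional[dict]:
        item = self._items.get(user_id)
        return dict(item) if item is not None else None


# A MongoDB- or SimpleDB-backed class would be added to this mapping.
BACKENDS = {"memory": InMemoryUserStore}


def make_store(config: dict) -> UserStore:
    """Choose the backend from configuration only."""
    name = config.get("backend", "memory")
    if name not in BACKENDS:
        raise ValueError(f"unknown storage backend: {name!r}")
    return BACKENDS[name]()


if __name__ == "__main__":
    store = make_store({"backend": "memory"})
    store.save("42", {"name": "Ada"})
    print(store.load("42"))
```

Switching storage then means changing the `backend` value in the configuration, as long as the interface sticks to operations every backend can support. Anything that leans on one store's query features will leak through the abstraction.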

sirmak answered Oct 06 '22 13:10