
Alternative to Amazon S3 for the data center?

I'm looking for a service similar to Amazon S3: a simple service to store and retrieve arbitrary data (and metadata), but one that runs locally in your own data center. Strictly speaking, I'm not sure whether you would call this a CDN or a lightweight CMS.

It must be horizontally scalable (both for storage and bandwidth) and fault tolerant. It must also support REST, preferably WS too, with a pluggable authentication and authorization system. Something built with Java EE would be preferable for more convenient integration and extensibility, but this is just a personal preference, not a requirement.

Suggestions?

asked Apr 27 '09 03:04 by jnorris




3 Answers

Here are a few open source solutions I have come across that deserve further research:

  1. Apache Sling (JCR-based CMS (JSR 170, JSR 283), RESTful interface).
  2. Apache Hadoop (Java-based distributed data store, MapReduce functionality).
  3. HBase (built on top of Hadoop, providing Google Bigtable-like capabilities).
  4. CouchDB (Erlang-based key/value DB with Map/Reduce functionality, RESTful interface).
  5. Dynomite (Erlang-based Amazon Dynamo clone).
  6. Voldemort (distributed key-value storage system).
  7. Cassandra (highly scalable, eventually consistent, distributed, structured key-value store).
  8. MongoDB (highly scalable, JSON document-based storage).
answered Sep 29 '22 19:09 by jnorris


The Walrus project from Eucalyptus is mostly S3 API compatible:

http://open.eucalyptus.com/wiki/EucalyptusStorage_v1.4
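
For anyone unsure what "S3 API compatible" buys you, here is a rough sketch of the S3 REST style for storing an object with metadata: a PUT to `/bucket/key`, with user metadata carried in `x-amz-meta-*` headers. The host, bucket, key, and helper name are all made up for illustration, and the request is only built, not sent. Request signing is left out entirely, so treat this as the shape of the call and not a working client for Walrus or any other service.

```python
from urllib.request import Request

# Hypothetical endpoint; substitute the address of your own deployment.
ENDPOINT = "http://storage.example.internal"


def build_put_object(bucket, key, body, metadata):
    """Build (but do not send) an S3-style PUT request for one object.

    Authentication/signing is deliberately omitted in this sketch.
    """
    headers = {"Content-Type": "application/octet-stream"}
    # S3 carries user-defined metadata in headers prefixed with x-amz-meta-.
    for name, value in metadata.items():
        headers["x-amz-meta-" + name] = value
    url = f"{ENDPOINT}/{bucket}/{key}"
    return Request(url, data=body, headers=headers, method="PUT")


# Illustrative bucket, key, and metadata values.
req = build_put_object("reports", "2009/04/summary.txt", b"hello", {"owner": "alice"})
print(req.get_method(), req.full_url)
```

Retrieval in the same style would be a GET on the same URL, with the metadata coming back as response headers.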

answered Sep 29 '22 18:09 by tbond


Park Place is an S3 clone written in Ruby.

answered Sep 29 '22 20:09 by 1800 INFORMATION