What is the real technical difference between Amazon S3 and Amazon SimpleDB? From my limited understanding, both can be used to store and retrieve key/value pairs. So in what scenario would I need SimpleDB instead of S3? How is SimpleDB actually architected to deliver better performance (is it backed by a file system, a database, or something else)?
Additional thought: I know that SimpleDB is about storing key/value pairs. Is there any information on how it generates the hash index from the key for a lookup?
Amazon describes the differences between S3 and SimpleDB on the SimpleDB overview page. According to that page, S3 objects are kept on slower storage and SimpleDB data on faster storage, which is why storing the same amount of data costs more in SimpleDB than in S3.
If you're simply storing large binary objects as values and you're storing metadata on your own or can easily derive the key for the object you desire, then you'd probably just use S3.
SimpleDB would be used over S3 in the case where you want to store multiple key-value pairs associated with an item and want to retain the ability to find items based on any of the key-value pairs.
S3 allows you to store key-value metadata along with the object, but to find an object based on the metadata, you'd have to retrieve the metadata for each object in your bucket on your own and then decide which item (or items) you want to get. That can be slow and costly for large buckets. Additionally, there is a limit to the amount of metadata you can store and retrieve if you're using the REST API.
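To make the difference concrete, here is a minimal in-memory sketch. It does not call AWS at all: the `bucket` dict, the function names, and the inverted index are illustrative assumptions, and SimpleDB's real index structure is not described here. It only contrasts scanning every object's metadata with looking up a prebuilt index.

```python
from collections import defaultdict

# Toy stand-in for an S3 bucket: object key -> (metadata dict, body bytes).
bucket = {
    "photos/1.jpg": ({"owner": "alice", "type": "jpeg"}, b"..."),
    "photos/2.png": ({"owner": "bob", "type": "png"}, b"..."),
    "photos/3.jpg": ({"owner": "alice", "type": "jpeg"}, b"..."),
}

def find_by_scanning(bucket, name, value):
    """S3-style lookup: inspect every object's metadata, one by one."""
    return sorted(key for key, (meta, _body) in bucket.items()
                  if meta.get(name) == value)

def build_index(bucket):
    """SimpleDB-style idea: map each (attribute, value) pair to item names."""
    index = defaultdict(set)
    for key, (meta, _body) in bucket.items():
        for name, value in meta.items():
            index[(name, value)].add(key)
    return index

def find_by_index(index, name, value):
    """Answer the same question with a single dictionary lookup."""
    return sorted(index.get((name, value), ()))
```

Both functions return the same items, but the scan touches every object on every query, while the index pays that cost once when it is built.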
SimpleDB is more flexible with the stored metadata. The key-value pairs are indexed, so querying can be quick. You can add and modify key-value pairs that are already in SimpleDB, whereas updating metadata in S3 means rewriting the object (for example, deleting and recreating it). However, there is a 1024-byte limit on the size of each key and each value, and an overall limit on the amount of data in a domain (SimpleDB's analogue of a bucket). All of SimpleDB's limits are listed in the SDB Developer Guide.
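If you want to catch the size limit before sending data, a small client-side check along these lines might help. This is only a sketch: `oversized_attributes` is a hypothetical helper, and it assumes the 1024-byte limit is measured on the UTF-8 encoded key and value, so check the Developer Guide for the exact rule.

```python
MAX_BYTES = 1024  # per-key and per-value limit quoted above (assumed UTF-8 bytes)

def oversized_attributes(attributes, limit=MAX_BYTES):
    """Return the names of attributes whose key or value exceeds the limit."""
    bad = []
    for name, value in attributes.items():
        name_size = len(name.encode("utf-8"))
        value_size = len(str(value).encode("utf-8"))
        if name_size > limit or value_size > limit:
            bad.append(name)
    return sorted(bad)
```

Anything this reports is a candidate for moving into S3 and keeping only a pointer in SimpleDB.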
If you're storing large objects with a lot of key-value metadata, you'd probably use a hybrid approach. Amazon's overview page suggests storing metadata in SimpleDB with one of the key-value pairs being a pointer to S3 for the object's data.
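The hybrid layout can be sketched roughly as follows. Again this is an in-memory illustration, not AWS code: `blob_store` stands in for an S3 bucket and `metadata_store` for a SimpleDB domain, and the names `save`, `load_matching`, `s3_bucket`, `s3_key`, and `example-bucket` are all assumptions made up for the example.

```python
blob_store = {}      # stands in for an S3 bucket: key -> bytes
metadata_store = {}  # stands in for a SimpleDB domain: item name -> attributes

def save(item_name, body, attributes, bucket_name="example-bucket"):
    """Put the large body in the blob store and the metadata next to a pointer."""
    s3_key = f"objects/{item_name}"
    blob_store[s3_key] = body
    record = dict(attributes)
    record["s3_bucket"] = bucket_name  # pointer attributes (hypothetical names)
    record["s3_key"] = s3_key
    metadata_store[item_name] = record
    return record

def load_matching(name, value):
    """Find items by metadata, then follow each pointer to fetch the body.

    A real SimpleDB domain would answer the metadata query from its index;
    this toy version simply loops over the dict.
    """
    return {item: blob_store[attrs["s3_key"]]
            for item, attrs in metadata_store.items()
            if attrs.get(name) == value}
```

The point of the design is that queries only ever touch the small metadata records, and the large object is fetched from S3 once you know which one you want.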