 

maximum size of attributes on AWS SimpleDB

I am in the process of building a mobile application (iPhone/Android) and want to store the application data in Amazon's SimpleDB, because we do not want to host our own server to provide these services. I've been going through the documentation, and the maximum size of an attribute value is 1024 bytes.

In our case we need to store anywhere from 1024 bytes up to 10K of text data.

I was hoping to find out how other projects use SimpleDB when they have larger storage needs like ours. I read that one could store pointers in SimpleDB to files that are kept in S3. I am not sure whether that is a good solution.

I am not sure SimpleDB is the right solution at all. Could anyone comment on what they have done, or suggest a different way to think about this problem?

Peter Delaney asked Jun 11 '09 12:06





1 Answer

There are ways to store your 10k of text data, but whether they are acceptable depends on what else you need to store and how you plan to use it.

If you need to store arbitrarily large data (especially binary data) then the S3 file pointer can be attractive. The value that SimpleDB adds in this scenario is the ability to run queries against the file metadata that you store in SimpleDB.
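As a rough sketch of the pointer idea, the SimpleDB item would hold only small, queryable metadata while the body lives in S3. The attribute names (`s3_bucket`, `s3_key`, `size_bytes`, `sha256`) and the helper `build_pointer_item` are hypothetical, chosen for illustration; the actual upload and put calls are left out because they depend on the client library you use.

```python
import hashlib

MAX_VALUE_BYTES = 1024  # per-value limit quoted in the question


def build_pointer_item(bucket, key, body):
    """Build the metadata attributes for a SimpleDB item that points at an S3 object.

    `body` is the raw bytes that would be uploaded to S3 separately.
    All values are strings, and the size is zero-padded so that string
    comparison orders it the same way as numeric comparison.
    """
    item = {
        "s3_bucket": bucket,
        "s3_key": key,
        "size_bytes": f"{len(body):010d}",
        "sha256": hashlib.sha256(body).hexdigest(),
    }
    for name, value in item.items():
        if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            raise ValueError(f"attribute {name!r} exceeds {MAX_VALUE_BYTES} bytes")
    return item


item = build_pointer_item("my-bucket", "notes/42.txt", b"hello")
```

Queries then run against the metadata only, and the client fetches the object from S3 using the stored bucket and key.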

For text data limited to 10k I would recommend storing it directly in SimpleDB. It will easily fit in a single item, but you'll have to spread it across multiple attribute values. There are basically two ways to do this, each with some drawbacks.

One way is more flexible and search friendly but requires you to modify your data. You split your data into chunks of about 1000 bytes and store each chunk as one value of a multi-valued attribute. There is no ordering imposed on the values of a multi-valued attribute, so you have to prepend each chunk with a number for ordering (e.g. 01).
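A minimal sketch of that chunking, assuming a two-digit zero-padded prefix and a 1000-byte budget per chunk (which leaves room for the prefix under the 1024-byte limit). The function names `split_into_chunks` and `join_chunks` are made up for this example; writing the values to SimpleDB is left to whatever client you use.

```python
def split_into_chunks(text, max_bytes=1000, prefix_width=2):
    """Split text into chunks of at most max_bytes (UTF-8), each prefixed
    with a zero-padded sequence number such as '01', '02', ..."""
    chunks, current, size = [], [], 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        # Start a new chunk rather than split a multi-byte character.
        if current and size + n > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        chunks.append("".join(current))
    if len(chunks) >= 10 ** prefix_width:
        raise ValueError("too many chunks for the chosen prefix width")
    return [f"{i:0{prefix_width}d}{c}" for i, c in enumerate(chunks, start=1)]


def join_chunks(values, prefix_width=2):
    """Reassemble the text from prefixed chunks returned in any order."""
    ordered = sorted(values, key=lambda v: int(v[:prefix_width]))
    return "".join(v[prefix_width:] for v in ordered)


text = "abc" * 1500                    # 4500 bytes of sample text
values = split_into_chunks(text)       # values for one multi-valued attribute
restored = join_chunks(reversed(values))
```

A side effect worth noting: because each value carries a unique prefix, two chunks with identical text still end up as distinct values.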

Having all the text stored under one attribute makes queries easy, since the predicate needs only a single attribute name. Each item can hold a different amount of text, anywhere from 1k to 200+k, and it all gets handled the same way. But you do have to be aware that the prepended sequence numbers can produce false positives in your queries (e.g. if you search for 01, every item will match).
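The false-positive problem can be illustrated locally. The sketch below does not call SimpleDB; `server_side_match` is only a stand-in for a substring predicate evaluated against the stored values (prefix included), and `client_side_match` is a hypothetical post-filter that re-checks the hit against the text with the two-digit prefix stripped.

```python
def server_side_match(values, term):
    """Stand-in for a substring predicate run against the stored values,
    which still contain the ordering prefix."""
    return any(term in v for v in values)


def client_side_match(values, term, prefix_width=2):
    """Re-check a hit against the text only, ignoring the ordering prefix."""
    return any(term in v[prefix_width:] for v in values)


stored = ["01hello world", "02goodbye"]

hit = server_side_match(stored, "01")        # matches because of the prefix
confirmed = client_side_match(stored, "01")  # no match in the actual text
```

Re-checking on the client costs an extra pass over the returned items, but it only runs on items the query already matched.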

The second way to store the text within SimpleDB does not require you to place arbitrary ordering data within your text chunks. You do the ordering by placing each text chunk in a differently named attribute. For example, you could use the attribute names desc01, desc02, ... desc10, and place each chunk in the appropriate attribute. You can still do full text search with both methods, but searches will be slower with this one because you need to specify many predicates and SimpleDB ends up searching through a separate index for each attribute.
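A sketch of the named-attribute layout, using the desc01 ... desc10 names from the example above. The helper names are invented for illustration, the 1000-character slice assumes single-byte (ASCII) text, and the query string assumes SimpleDB's Select syntax with `like '%term%'` predicates; treat it as a shape to adapt rather than a tested query. It does not escape `%` inside the search term.

```python
def to_named_attributes(text, base="desc", max_chars=1000, width=2):
    """Map consecutive slices of text to attributes desc01, desc02, ..."""
    pieces = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    return {f"{base}{n:0{width}d}": p for n, p in enumerate(pieces, start=1)}


def from_named_attributes(attrs, base="desc"):
    """Reassemble the text; zero-padded names sort in chunk order."""
    names = sorted(k for k in attrs if k.startswith(base))
    return "".join(attrs[k] for k in names)


def build_search_query(domain, term, base="desc", count=10, width=2):
    """Build one 'like' predicate per chunk attribute, joined with 'or'."""
    escaped = term.replace("'", "''")  # single quotes are doubled inside literals
    predicates = [
        f"{base}{n:0{width}d} like '%{escaped}%'" for n in range(1, count + 1)
    ]
    return f"select * from {domain} where " + " or ".join(predicates)


attrs = to_named_attributes("x" * 2500)
query = build_search_query("notes", "cat", count=2)
```

The query grows by one predicate per chunk attribute, which is the cost described above. With either layout, a search term that straddles a chunk boundary will not match any single value.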

It is easy to think of this type of workaround as a hack, because with databases we are used to having this sort of low-level detail handled for us inside the database. SimpleDB is specifically designed to push this sort of thing out of the database and into the client as a means of providing availability as a first-class feature.

If you found out that a relational database was splitting your text into 1k chunks to store on disk as an implementation detail, it wouldn't seem like a hack. The problem is that, in the current state of SimpleDB clients, you have to implement a lot of this data formatting yourself. Ideally this would be handled for you by a smart client; there just aren't any smart clients freely available yet.

Mocky answered Oct 19 '22 08:10
