How do I bulk upload to S3?

Question

I recently refactored some of my code to load rows into a database using 'load data', and it works great. However, for each record I must also upload 2 files to S3, which wipes out the speed gain I had obtained: I was able to process 600+ of these documents/second, and they are now trickling in at 1/second because of S3.

What are your workarounds for this? Looking at the API I see that it is mostly RESTful, so I'm not sure what to do. Maybe I should just store all of this in the database. The text files are usually no more than 1.5 KB (the other file we store is an XML representation of the text).

I already cache these files in HTTP requests to my web server as they are used quite a lot.

By the way, our current implementation uses Java. I have not yet tried threads, but that might be an option.

Recommendations?

Adam Hughes · Accepted Answer

You can use the [putObjects][1] function of JetS3t to upload multiple files at once.

Alternatively, you could use a background thread to upload to S3 from a queue, and add files to the queue from the code that loads the data into the database.
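A minimal sketch of the queue-based approach, under stated assumptions: `Uploader` is a hypothetical stand-in for whatever call actually puts one object into S3 (it is not a JetS3t or AWS type), and `BackgroundS3Uploader`, `enqueue`, and `failureCount` are illustrative names. The executor's bounded work queue plays the role of the queue; when it fills up, the submitting thread performs the upload itself, which keeps memory use bounded if the database loader outruns S3.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the real S3 put call; not part of any library. */
interface Uploader {
    void upload(String key, byte[] data) throws Exception;
}

/** Uploads files on background threads so the database loader is not blocked. */
final class BackgroundS3Uploader implements AutoCloseable {
    private final ThreadPoolExecutor pool;
    private final Uploader uploader;
    private final AtomicInteger failures = new AtomicInteger();

    BackgroundS3Uploader(Uploader uploader, int threads, int queueCapacity) {
        this.uploader = uploader;
        // Bounded queue; CallerRunsPolicy makes the producer do the upload
        // itself when the queue is full, which acts as simple backpressure.
        this.pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Queues one file for upload and returns without waiting for S3. */
    void enqueue(String key, byte[] data) {
        pool.execute(() -> {
            try {
                uploader.upload(key, data);
            } catch (Exception e) {
                // Only counted here; a real system would likely retry or log.
                failures.incrementAndGet();
            }
        });
    }

    int failureCount() {
        return failures.get();
    }

    /** Stops accepting work and waits for queued uploads to finish. */
    @Override
    public void close() {
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
```

The loading code would call `enqueue` twice per record (once for the text file, once for the XML file) and call `close()` after the last record. The thread count and queue capacity are tuning knobs to measure rather than fixed recommendations.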

[1]: http://jets3t.s3.amazonaws.com/api/org/jets3t/service/multithread/S3ServiceMulti.html#putObjects(org.jets3t.service.model.S3Bucket, org.jets3t.service.model.S3Object[])

Tags: performance, amazon-s3, upload, bulk

eyberg
