 

Locking an s3 object best practice?

I have an S3 bucket containing quite a few S3 objects that multiple EC2 instances can pull from (when scaling horizontally). Each EC2 will pull an object one at a time, process it, and move it to another bucket.

Currently, to make sure the same object isn't processed by multiple EC2 instances, my Java app "locks" an object by renaming it, appending a "locked" extension to its S3 object key. The problem is that a "rename" in S3 is actually a copy followed by a delete, so for large files the "rename" can take several minutes to complete, which makes the locking ineffective.

Does anyone have a best practice for accomplishing what I'm trying to do?

I considered using SQS, but that "solution" has its own set of problems: ordering is not guaranteed, messages may be delivered more than once, and more than one EC2 instance can receive the same message.

I'm wondering if setting a "locked" header would be a quicker "locking" process.

Todd asked Oct 26 '15 14:10





1 Answer

Object tags can help here, because changing a tag does not create a new copy of the object. A tag is a key/value pair associated with an object, so you can mark an object as "locked" with object-level tagging instead of renaming it.
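A minimal sketch of the tag-based locking idea. The tag store here is an in-memory map standing in for S3 object tags, so the example runs without AWS; against real S3 you would back the two operations with `GetObjectTagging` / `PutObjectTagging` from the AWS SDK. One caveat to be aware of: a read-then-write of a tag is not atomic on S3, so two instances racing on the same object could still both "acquire" the lock. Tags make the lock marker cheap (no copy of the object data), not strictly safe.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: mark an object as in-use via a "locked" tag instead of renaming it.
// The nested map simulates per-object S3 tag sets; class and method names
// here are illustrative, not part of any AWS API.
class TagLock {
    private final Map<String, Map<String, String>> tagStore = new ConcurrentHashMap<>();

    // Returns true if we set the "locked" tag and may process the object.
    // On real S3 this would be a GetObjectTagging check followed by
    // PutObjectTagging (and would not be atomic, unlike putIfAbsent here).
    public boolean tryLock(String objectKey, String ownerId) {
        Map<String, String> tags =
                tagStore.computeIfAbsent(objectKey, k -> new ConcurrentHashMap<>());
        return tags.putIfAbsent("locked", ownerId) == null;
    }

    // Clear the "locked" tag, e.g. after moving the object to the other bucket.
    public void unlock(String objectKey) {
        Map<String, String> tags = tagStore.get(objectKey);
        if (tags != null) tags.remove("locked");
    }

    public static void main(String[] args) {
        TagLock lock = new TagLock();
        System.out.println(lock.tryLock("incoming/file1.csv", "ec2-a")); // true
        System.out.println(lock.tryLock("incoming/file1.csv", "ec2-b")); // false: already locked
        lock.unlock("incoming/file1.csv");
        System.out.println(lock.tryLock("incoming/file1.csv", "ec2-b")); // true
    }
}
```

Because tagging a large object completes in milliseconds rather than the minutes a copy-based "rename" takes, the window where two instances can grab the same object shrinks dramatically, even though it does not close entirely.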

Tejaskumar answered Sep 27 '22 03:09
