I have a system in which I get a lot of messages. Each message has a unique ID, but it can also receives updates during its lifetime. As the time between the message sending and handling can be very long (weeks), they are stored in S3. For each message only the last version is needed. My problem is that occasionally two messages of the same id arrive together, but they have two versions (older and newer).
Is there a way for S3 to have a conditional PutObject request where I can declare "put this object unless I have a newer version in S3"?
I need an atomic operation here
That's not the use-case for S3, which is eventually-consistent. Some ideas:
You could try to partition your messages - all messages that start with A-L go to one box, M-Z go to another box. Then each box locally checks that there are no duplicates.
Your best bet is probably some kind of database. Depending on your use case, you could use a regular SQL database, or maybe a simple RAM-only database like Redis. Write to multiple Redis DBs at once to avoid SPOF.
There is SWF which can make a unique processing queue for each item, but that would probably mean more HTTP requests than just checking in S3.
David's idea about turning on versioning is interesting. You could have a daemon that periodically trims off the old versions. When reading, you would have to do "read repair" where you search the versions looking for the newest object.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With