In the context of an application I am developing, I need to store a fairly large amount of data for a limited amount of time. Think of the scenario in terms of object buckets: for a certain limited time, objects sit in some buckets, then they move to others, and so on. There is essentially an n-to-m relation between objects (on the order of millions) and buckets (thousands to perhaps tens of thousands).
It is important for this storage layer to be persistent so that, in case of an app or server failure, I can recreate the last state.
What would the best option be for implementing such temporary storage?
Thanks
If an object can only belong to one bucket at a time, then this should fit well with Postgres: you'd just need a buckets table with a set of unique bucket identifiers, and the objects table would have a currentbucket column indicating the bucket to which each object currently belongs.
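A minimal sketch of that schema (table and column names are illustrative, assuming integer surrogate keys and a generic payload column):

```sql
-- Buckets are just a set of unique identifiers.
CREATE TABLE buckets (
    bucket_id  bigint PRIMARY KEY
);

-- Each object references the single bucket it currently occupies.
CREATE TABLE objects (
    object_id      bigint PRIMARY KEY,
    payload        jsonb,  -- placeholder for whatever the object's data is
    currentbucket  bigint NOT NULL REFERENCES buckets (bucket_id)
);

-- Moving an object is a single UPDATE of currentbucket;
-- this index makes "list all objects in bucket X" fast.
CREATE INDEX objects_currentbucket_idx ON objects (currentbucket);
```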
If an object can belong to an unbounded number of buckets, then you can still use Postgres, but you'll need to remove the currentbucket column from the objects table and instead have a bucketobjectjoin table with a column for the bucket identifier and a column for the object identifier (sketched below). Since you're already using Postgres, I'd recommend implementing it this way as a first pass.

If you're not happy with the performance, you can then cache the bucketobjectjoin table in Redis: as a set of bucket identifiers keyed by object identifier, and/or as a set of object identifiers keyed by bucket identifier. You're only storing the objects' keys (not the full objects) in Redis, so memory shouldn't be an issue, and you can have a background task periodically sync Redis with Postgres's bucketobjectjoin table in case the Redis server crashes.
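A sketch of the join-table variant, building on the tables above (again, names are illustrative; the composite primary key enforces uniqueness and covers one lookup direction for free):

```sql
-- Replaces the currentbucket column: one row per (bucket, object) membership.
CREATE TABLE bucketobjectjoin (
    bucket_id  bigint NOT NULL REFERENCES buckets (bucket_id),
    object_id  bigint NOT NULL REFERENCES objects (object_id),
    PRIMARY KEY (bucket_id, object_id)  -- fast "objects in bucket" lookups
);

-- The reverse direction ("which buckets contain this object")
-- needs its own index.
CREATE INDEX bucketobjectjoin_object_idx ON bucketobjectjoin (object_id);
```

If you do layer Redis on top, the natural mapping is one Redis set per row direction, e.g. a key like bucket:42 holding object identifiers (maintained with SADD/SREM and read with SMEMBERS), and/or object:1001 holding bucket identifiers.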
As a full-blown NoSQL approach, you could use Cassandra to store the full objects; Cassandra supports sets like Redis does, but doesn't have Redis's memory restrictions.
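For completeness, here's roughly what that could look like in CQL (Cassandra's query language; names are again illustrative). One caveat: Cassandra's set collections are meant to stay small per row, so this models "buckets per object" rather than "objects per bucket", and very large memberships are usually modeled with a clustering column instead.

```sql
-- CQL sketch: each object row carries its full payload
-- plus the set of buckets it currently belongs to.
CREATE TABLE objects (
    object_id  bigint PRIMARY KEY,
    payload    blob,
    buckets    set<bigint>
);

-- Membership changes are single-column set updates:
-- UPDATE objects SET buckets = buckets + {42} WHERE object_id = 1001;
-- UPDATE objects SET buckets = buckets - {42} WHERE object_id = 1001;
```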