I am running a Flask server that loads data into a MongoDB database. Because there is a large amount of data and the load takes a long time, I want to do it via a background job.
I am using Redis as the message broker and Python-rq to implement the job queues. All the code runs on Heroku.
As I understand it, python-rq uses pickle to serialise the function to be executed, together with its parameters, and stores the result, along with some other values, in a Redis hash.
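Here is a minimal sketch of what my setup looks like (the module path, function name, and connection URL are illustrative, not my real ones):

```python
import redis
from rq import Queue

# Illustrative: load_data lives in a module the worker can import.
from myapp.tasks import load_data

conn = redis.from_url("redis://localhost:6379")  # illustrative URL
q = Queue(connection=conn)

records = fetch_records()  # illustrative: the ~50MB list of key/value pairs

# RQ pickles the function reference *and* its arguments and stores them
# in a Redis hash, so the whole ~50MB payload lands in Redis memory here.
job = q.enqueue(load_data, records)
```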
Since the parameters contain the information to be saved to the database, the payload is quite large (~50MB), and when it is serialised and saved to Redis, it not only takes a noticeable amount of time but also consumes a large amount of memory. Redis plans on Heroku cost $30 per month for only 100MB. In fact, I very often get OOM errors like:
```
OOM command not allowed when used memory > 'maxmemory'.
```
I have two questions:

1. Is there a way to avoid serialising the full payload into Redis, for example by passing the job a reference to the data rather than the data itself?
2. If not, is there a better architecture for loading this much data in a background job?
Your thoughts on the best solution are much appreciated!
Since you mentioned in your comment that your task input is a large list of key/value pairs, I'm going to recommend the following:

1. Don't push the items themselves through the queue. Stage the payload once in storage that both your web dyno and your worker dyno can reach; since you already have MongoDB, GridFS (or a staging collection) is a natural place for it.
2. Enqueue a job whose only argument is a small reference to the staged payload, such as its ObjectId.
3. In the worker, fetch the payload by that reference and bulk-insert it into the target collection (see the sketches below).
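Here's a minimal sketch of the enqueueing side under those assumptions. GridFS as the staging area is my suggestion, and the module path, database name, and connection URLs are hypothetical:

```python
import json

import gridfs
import redis
from pymongo import MongoClient
from rq import Queue

# Hypothetical connection URLs; on Heroku these come from config vars.
db = MongoClient("mongodb://localhost:27017").mydb
fs = gridfs.GridFS(db)
q = Queue(connection=redis.from_url("redis://localhost:6379"))

def enqueue_load(records):
    # Stage the ~50MB payload once in MongoDB (GridFS), not in Redis.
    file_id = fs.put(json.dumps(records).encode("utf-8"))
    # The pickled job now carries only a small ObjectId, so each job
    # costs Redis a few hundred bytes instead of tens of megabytes.
    q.enqueue("myapp.tasks.load_data", file_id)
```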
Using the method above, you'll be able to:

- keep each job's Redis footprint down to a few hundred bytes instead of ~50MB;
- stay comfortably within the 100MB limit of your Redis plan, avoiding the OOM errors;
- skip pickling and unpickling 50MB of data for every job.
For use cases like yours, this will be MUCH faster and carry much less overhead than sending these items through your queueing system.
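And here is the matching worker side of the sketch, again with hypothetical names, showing the bulk insert that makes this fast:

```python
# myapp/tasks.py (hypothetical module) -- runs on the worker dyno.
import json

import gridfs
from pymongo import MongoClient

def load_data(file_id):
    db = MongoClient("mongodb://localhost:27017").mydb  # hypothetical URL
    fs = gridfs.GridFS(db)

    # Fetch the staged payload by reference and decode it.
    records = json.loads(fs.get(file_id).read())

    # One bulk insert_many is far cheaper than inserting items one at a
    # time (assumes each record is a dict, i.e. a valid MongoDB document).
    db.items.insert_many(records)

    # Free the staging space once the load has succeeded.
    fs.delete(file_id)
```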
Hope this helps!