I would like to create a distributed setup for running a large number of small, simple REST web queries in a production environment. For each batch of 5-10 related queries executed from a node, I will generate a very small amount of derived data, which will need to be stored in a standard relational database (such as PostgreSQL).
What platforms are built for this type of problem? The nature of the work, the data sizes, and the job counts seem to contradict the mindset of Hadoop. I have also seen more grid-based architectures mentioned, such as Condor and Sun Grid Engine. I'm not sure whether these platforms offer any error recovery, though (for example, checking whether a job succeeded).
What I would really like is a FIFO-type queue that I could add jobs to, with the end result being that my database gets updated.
Any suggestions on the best tool for the job?
Have you looked at Celery?
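To make the pattern from the question concrete (a FIFO queue of jobs, a retry when a job fails, and a database row at the end), here is a minimal single-process sketch using only the Python standard library. It is not Celery's API; a task queue such as Celery provides the same shape across machines with a message broker in the middle. The names `run_jobs` and `http_fetch`, the `derived` table, and the "total bytes received" derived value are all assumptions made for illustration, and `sqlite3` stands in for PostgreSQL so the example runs without a server.

```python
import queue
import sqlite3
import threading
import urllib.request


def http_fetch(url):
    """Hypothetical fetcher: return the raw response body for one REST query."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()


def run_jobs(jobs, fetch=http_fetch, db_path=":memory:", workers=4, max_attempts=3):
    """Drain a FIFO queue of (job_id, [urls]) jobs and store one row per job."""
    todo = queue.Queue()     # FIFO job queue
    results = queue.Queue()  # finished rows, written by the main thread only
    for job_id, urls in jobs:
        todo.put((job_id, urls, 1))

    def worker():
        while True:
            item = todo.get()
            if item is None:  # sentinel: no more work
                todo.task_done()
                return
            job_id, urls, attempt = item
            try:
                # Placeholder "derived data": total bytes across the related queries.
                derived = sum(len(fetch(u)) for u in urls)
                results.put((job_id, derived, "ok"))
            except Exception:
                if attempt < max_attempts:
                    # Requeue before task_done() so todo.join() keeps waiting.
                    todo.put((job_id, urls, attempt + 1))
                else:
                    results.put((job_id, None, "failed"))
            finally:
                todo.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    todo.join()               # all jobs have succeeded or exhausted their retries
    for _ in threads:
        todo.put(None)
    for t in threads:
        t.join()

    rows = []
    while not results.empty():
        rows.append(results.get())

    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS derived "
        "(job_id TEXT PRIMARY KEY, value INTEGER, status TEXT)"
    )
    with conn:  # commit all rows in one transaction
        conn.executemany("INSERT OR REPLACE INTO derived VALUES (?, ?, ?)", rows)
    return conn
```

Recording a `failed` status instead of silently dropping a job addresses the "checking if a job succeeds" concern: anything that did not complete can be found with a simple query and resubmitted.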