I'm after a little advice around using Cron jobs with PHP. My scenario is this:
I have a website with a large membership. Users have one or several URLs associated with their account. At midnight (or some other set time) I'd like to call a script which will query the websites for each user and update the database with the information it finds. Think of it as a sort of screen scraper service.
My question is about the load on the server. I'll be testing this new feature on a shared server, but ultimately I will be moving to a dedicated server.
So if the roughly 5,000 members have 2 URLs each, that's 10,000 websites it would query. What do people think is the best way to do this? Have a cron job that runs the first 500 members, then 10 minutes later runs the next 500, and so on? Or is there some magic I've not heard of which might help?
Thanks for any tips!
cron is a great tool for kicking off basic scheduled tasks like this. However, as you've surmised, it scales poorly once a single run has to work through thousands of URLs. Look into job processing tools, like the open-source (and multi-language) Gearman:
http://gearman.org/
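This should be a more robust system for the task at hand: the cron script only has to queue the jobs, and one or more worker processes pull them off the queue and do the fetching.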
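To make the idea concrete, here is a minimal sketch of how the midnight job could hand its work to Gearman instead of fetching everything itself. It assumes the PECL gearman extension is installed and a gearmand server is listening on 127.0.0.1:4730. The job name fetch_url, the payload shape and the helper functions buildPayload(), queueJobs() and runWorker() are made up for illustration, and the database update is left as a comment because your schema isn't shown.

```php
<?php
// Sketch only. Assumes the PECL "gearman" extension and a gearmand server on
// 127.0.0.1:4730. The job name "fetch_url" and the payload shape are hypothetical.

/** Encode one unit of work (one URL belonging to one user) as a JSON string. */
function buildPayload(int $userId, string $url): string
{
    return json_encode(['user_id' => $userId, 'url' => $url]);
}

/**
 * Producer: call this from the script that cron starts at midnight.
 * $urlsByUser maps a user id to that user's list of URLs.
 * Returns the number of jobs queued.
 */
function queueJobs(array $urlsByUser): int
{
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730);

    $queued = 0;
    foreach ($urlsByUser as $userId => $urls) {
        foreach ($urls as $url) {
            // doBackground() returns straight away; a worker picks the job up later.
            $client->doBackground('fetch_url', buildPayload((int) $userId, $url));
            $queued++;
        }
    }
    return $queued;
}

/** Worker: run one or more of these as long-lived command-line processes. */
function runWorker(): void
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);

    $worker->addFunction('fetch_url', function (GearmanJob $job) {
        $data = json_decode($job->workload(), true);

        // Give up on slow sites after 10 seconds so one bad URL cannot stall the worker.
        $context = stream_context_create(['http' => ['timeout' => 10]]);
        $html = @file_get_contents($data['url'], false, $context);

        if ($html === false) {
            return; // Fetch failed; log it or queue a retry here.
        }
        // Parse $html and update the database row for $data['user_id'] here.
    });

    // Process jobs one at a time until work() reports a failure.
    while ($worker->work()) {
    }
}
```

The cron entry stays a single line that runs the producer once a night. How fast the 10,000 URLs get processed then depends on how many worker processes you start, so you can run one or two on the shared server and more once you move to the dedicated box, without changing the cron schedule.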