I have a Django view which receives part of its data from an external website, which I parse using urllib2/BeautifulSoup.
This operation is rather expensive, so I cache it using the low-level cache API for ~5 minutes. However, each user who accesses the site after the cached data expires hits a significant delay of a few seconds while I go to the external site to parse the new data.
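The pattern in play is roughly the following, sketched here with a minimal dict-backed cache standing in for Django's `cache` object (the key name `"external_data"` and the `fetch_and_parse` callable are illustrative, not from my actual code):

```python
import time

class SimpleCache:
    """Tiny stand-in for django.core.cache.cache (illustration only)."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        value, expires = self._store.get(key, (None, 0))
        return value if time.time() < expires else None

    def set(self, key, value, timeout):
        self._store[key] = (value, time.time() + timeout)

cache = SimpleCache()

def get_external_data(fetch_and_parse):
    data = cache.get("external_data")
    if data is None:
        # Cache miss: the unlucky request that lands here pays the
        # multi-second urllib2/BeautifulSoup round trip.
        data = fetch_and_parse()
        cache.set("external_data", data, 300)  # ~5 minutes
    return data
```

The delay I am describing is exactly the cache-miss branch above: whichever request arrives first after expiry does the slow fetch inline.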
Is there any way to load the new data lazily so that no user will ever get that kind of delay? Or is this unavoidable?
Please note that I am on a shared hosting server, so keep that in mind with your answers.
EDIT: thanks for the help so far. However, I'm still unsure how to accomplish this with the Python script I will be calling. A basic test I did shows that the Django cache is not global: if I access it from an external script, it does not see the cache data used by the framework. Suggestions?
Another EDIT: come to think of it, this is probably because I am still using the local-memory cache backend. I suspect that if I move the cache to memcached, the DB, or whatever, this will be solved.
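For reference, switching to a shared backend would be a settings change along these lines (the server address is an assumption, and the exact backend path varies by Django version):

```python
# settings.py -- replace the per-process local-memory cache with
# memcached so the Django app and the external script share one store.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
        "LOCATION": "127.0.0.1:11211",
    }
}
```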
So you want to schedule something to run at a regular interval? At the cost of some CPU time, you can use this simple app.
Alternatively, if your host allows it, a cron job that runs every 5 minutes is:
*/5 * * * * /path/to/project/refresh_cache.py
Web hosts provide different ways of setting these up. For cPanel, use the Cron Manager; for Google App Engine, use cron.yaml. In all cases, you'll need to set up the Django environment in refresh_cache.py first.
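A sketch of what refresh_cache.py could look like, written in modern Python 3 with the stdlib `html.parser` standing in for BeautifulSoup so the example stays self-contained; the settings module `myproject.settings`, the URL, and the cache key are all placeholders you'd replace with your own:

```python
# refresh_cache.py -- pre-emptively refresh the cached external data.
import os
import urllib.request
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Minimal stand-in for the BeautifulSoup parsing step:
    extracts the contents of the page's <title> tag."""
    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def parse_page(html):
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()

def refresh():
    # Point Django at the project's settings BEFORE touching the cache;
    # this is the "set up the environment" step mentioned above.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
    import django
    django.setup()
    from django.core.cache import cache

    html = urllib.request.urlopen("http://example.com/").read().decode()
    # Cache for 10 minutes -- longer than the 5-minute cron interval,
    # so the entry never expires between refreshes.
    cache.set("external_data", parse_page(html), 600)

# cron entry point: a real copy of this script would end by calling refresh()
```

Note this only works with a shared cache backend (memcached, DB, file); with the default local-memory backend, the script's cache and the web processes' cache are separate.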
By the way, fetching in response to a user's request is lazy caching; this is pre-emptive caching. And don't forget to set the cache timeout longer than the refresh interval, so the entry never expires before it is recreated!