I'm just getting started with Google App Engine, so I'm still learning how to configure everything. I wrote a script called parsexml.py that I want to run every 10 minutes or so. This file is in my main directory, alongside main.py, app.yaml, etc. As I understand it, I need to create a new file, cron.yaml, which looks like this:
cron:
- description: scrape xml
  url: /
  schedule: every 10 minutes
I'm not sure what I need to put in the url field. I'm also not sure if anything else is needed. Do I need to change my app.yaml file at all? Where do I specify the name of my parsexml.py file?
For comparison, this is standard Unix crontab syntax. App Engine's cron.yaml does not use it; it expects schedules written like "every 10 minutes".
*/5 * * * * runs a job every 5 minutes.
0 * * * * runs a job every hour.
* * * * * is all wildcards, so the job runs every minute of every hour of every day of every month, on every day of the week.
Brian,
You'll need to update both your app.yaml and cron.yaml files. In each of these, you'll need to specify the path where the script will run.
app.yaml:
handlers:
- url: /path/to/cron
  script: parsexml.py
Or, if you already have a catch-all handler, you won't need to change it. For example:
handlers:
- url: /.*
  script: parsexml.py
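To see why a catch-all handler already covers the cron path, here is a rough sketch of how the handler list gets matched. It assumes the url patterns are regular expressions matched against the whole request path, tried top to bottom with the first match winning. The names HANDLERS and pick_script are made up for illustration, and the main.py catch-all entry is hypothetical; this is a standalone Python 3 simulation, not App Engine's actual dispatch code.

```python
import re

# Hypothetical handler list in the same order it would appear in app.yaml:
# (url pattern, script). The main.py catch-all is an assumption for this sketch.
HANDLERS = [
    ("/path/to/cron", "parsexml.py"),
    ("/.*", "main.py"),
]


def pick_script(path, handlers=HANDLERS):
    """Return the script of the first handler whose pattern matches the full path."""
    for pattern, script in handlers:
        if re.fullmatch(pattern, path):
            return script
    return None  # no handler matched


if __name__ == "__main__":
    print(pick_script("/path/to/cron"))
    print(pick_script("/anything/else"))
```

With only the `/.*` entry in the list, `/path/to/cron` would still match, which is why the catch-all case needs no extra handler.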
cron.yaml:
cron:
- description: scrape xml
  url: /path/to/cron
  schedule: every 10 minutes
As in the documentation, in parsexml.py you'll need to specify a handler for /path/to/cron and register it with a WSGI handler (or you could use CGI):
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class ParseXMLHandler(webapp.RequestHandler):
    def get(self):
        # do something (for example, run the XML parsing code here)
        pass

# Map the cron URL to the handler above.
application = webapp.WSGIApplication([('/path/to/cron', ParseXMLHandler)],
                                     debug=True)

if __name__ == '__main__':
    run_wsgi_app(application)
Note: If you are using the Python 2.7 runtime, you will want to specify script: parsexml.application where application is a global WSGI variable for handling requests.
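To make "a global WSGI variable" a bit more concrete, here is a minimal sketch of a module-level WSGI callable named application. It uses plain WSGI with no framework so it can be run anywhere; it is not the webapp code from above, and run_parse is a hypothetical stand-in for whatever parsexml.py actually does.

```python
def run_parse():
    """Hypothetical placeholder for the real XML parsing work."""
    return "parsed"


def application(environ, start_response):
    """Module-level WSGI callable: the kind of object `parsexml.application` names."""
    is_cron_path = environ.get("PATH_INFO") == "/path/to/cron"
    is_get = environ.get("REQUEST_METHOD") == "GET"

    if is_cron_path and is_get:
        body = (run_parse() + "\n").encode("utf-8")
        status = "200 OK"
    else:
        body = b"not found\n"
        status = "404 Not Found"

    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

The point is only that application lives at module level, so a `module.variable` reference in app.yaml can find it; the webapp.WSGIApplication object in the answer above plays the same role.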