
Run a python script on schedule on Google App Engine

I'm looking for a good Samaritan who can provide a very basic skeleton for running a Python script on Google App Engine. I have read the documentation and checked related SO questions, but I'm lost with the WebApp format. All I want to do is run one Python script that accepts arguments, or several Python scripts, 6 times a week to check a website for changes and then post them to Firestore.

I understand the cron format and most of the configuration files. I'm stuck on how to arrange the files for my project and on how the URLs work.

All I'm asking for is a very basic sample of how to run the Python scripts. This is by far the best resource I have found, but I can't really understand what is going on in this code from that site:

`#!/usr/bin/python
# -*- coding: utf-8 -*- 
from __future__ import unicode_literals   
from google.appengine.ext import webapp 
from google.appengine.ext.webapp.util import run_wsgi_app 
from google.appengine.ext import db   
import feedparser  
import time   

class Item(db.Model): 
    title = db.StringProperty(required=False)
    link = db.StringProperty(required=False)
    date = db.StringProperty(required=False)


class Scrawler(webapp.RequestHandler):
    
    def get(self):
        self.read_feed()      
        self.response.out.write(self.print_items())
        
    def read_feed(self):
        
        feeds = feedparser.parse( "http://www.techrepublic.com/search?t=14&o=1&mode=rss" )
        
        for feed in feeds[ "items" ]:
            query = Item.gql("WHERE link = :1", feed[ "link" ])
            if(query.count() == 0):
                item = Item()
                item.title = feed[ "title" ]
                item.link = feed[ "link" ]
                item.date = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(time.time()))
                item.put()
    
    def print_items(self):
        s = "All items:<br>"
        for item in Item.all():
            s += item.date + " - <a href='" + item.link + "'>" + item.title + "</a><br>"
        return s


application = webapp.WSGIApplication([('/', Scrawler)], debug=True)


def main():
    run_wsgi_app(application)


if __name__ == "__main__":
    main() `

This is the Python script I tried to run, for testing only, using Python 3.7:

import sys
from datetime import datetime

import firebase_admin
from firebase_admin import firestore

app = firebase_admin.initialize_app()
db = firestore.client()


def hello_firestore(user_name):
    db.collection('firestore_test').document('test').set({
        'time': str(datetime.now()),
        'user_name': user_name
    })


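if __name__ == "__main__":
    try:
        user_name = sys.argv[1]
    except IndexError:
        print('Error with the argument', file=sys.stderr)
        sys.exit(1)  # stop here: without an argument there is nothing to write
    try:
        hello_firestore(user_name)
    except Exception:
        print('Error accessing the database', file=sys.stderr)
        sys.exit(1)  # a non-zero exit status signals the failure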

From what I understand, I have to use Flask or something similar to make it work, but I don't really understand how it works. All I'm asking for is a small sample and a brief explanation; from there I'll put two and two together.

Best Regards

Guanaco Devs asked Jan 17 '19 22:01


2 Answers

Finally my kids will love me again. It turns out I was looking at the wrong GCP resource. What @Dan_Cornilescu pointed out might be a way to do it, but the easiest way is "Cloud Functions" in conjunction with "Cloud Scheduler", which I found by mere chance.

This article was the very first one that mentioned it. At the time I passed on it because the author again uses a web app to illustrate the case, and with my needs and my lack of technical jargon I just couldn't follow it. But it really is as simple as it was supposed to be. In your Google Cloud Console:

  1. Go to the Functions section
  2. Choose "Cloud Pub/Sub" as the trigger
  3. Add/choose a topic
  4. Select your runtime (Python 3.7, of course)
  5. Select the function to execute
  6. Create
  7. Make sure you fill in the "requirements.txt" file on the next tab
  8. Go to the Cloud Scheduler section of GCP and create a job (cron job)
  9. Choose "Pub/Sub" as the target
  10. Enter the topic you chose for your function
  11. If you want to send arguments to your function, use the payload for that purpose.

To pass one or more arguments to your Python function, put them in the payload and read them with the following line from the initial function template:

pubsub_message = base64.b64decode(event['data']).decode('utf-8')

You can then use pubsub_message as an argument for your Python functions.
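To make the payload handling concrete, here is a minimal sketch of what the function file could look like. It assumes the Pub/Sub-triggered Python 3.7 signature `(event, context)`; the names `decode_payload` and `scheduled_entry` are made up for illustration, and `hello_firestore` is a stand-in for the real Firestore write from the question, so treat it as a starting point rather than the way Cloud Functions must be written.

```python
import base64


def decode_payload(event):
    """Return the scheduler payload as text, or '' if no payload was sent."""
    data = event.get("data")
    if not data:
        return ""
    # Pub/Sub delivers the payload base64-encoded under the 'data' key.
    return base64.b64decode(data).decode("utf-8")


def hello_firestore(user_name):
    # Hypothetical stand-in: the real function would write to Firestore here.
    print("would write user_name={!r}".format(user_name))
    return user_name


def scheduled_entry(event, context):
    """Entry point (assumed name) to type into the 'function to execute' field."""
    user_name = decode_payload(event)
    return hello_firestore(user_name)
```

With this layout the payload typed into the Cloud Scheduler job plays the role that `sys.argv[1]` played in the original script. If several arguments are needed, one option is to send them as a single JSON string and parse it after decoding.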

And that's all, folks. Easy, super easy. In the end I think it is just the same as GAE without the visual page, which is just what I needed. I knew there had to be a better way.

EDIT: The article I mention here describes how to use gcloud to upload your function(s) directly from your computer.


Guanaco Devs answered Sep 22 '22 18:09


The answer I mentioned still applies - you won't be able to run your scripts in a standalone manner on GAE cron, simply because the cron service is really just a set of scheduled GET requests. You may be able to achieve the same end result, but by:

  • installing a skeleton app
  • breaking down your scripts into code that you'd stuff into the app's handlers, with arguments passed in the request's query strings
  • configuring the cron service to build and trigger those requests
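As a rough illustration of the three points above, here is a sketch of a `main.py` that uses only the standard library. It assumes the Python 3 runtime serves a WSGI callable named `app` from `main.py`, that cron requests arrive as GET requests carrying the `X-Appengine-Cron: true` header, and that the URL path `/tasks/run`, the `user_name` parameter and the `run_job` helper are all hypothetical names chosen for the example. Flask would work just as well; plain WSGI simply keeps the sketch dependency-free.

```python
# main.py: a minimal sketch, not a definitive App Engine layout.
from urllib.parse import parse_qs


def run_job(user_name):
    """Hypothetical stand-in for the script's real work (e.g. a Firestore write)."""
    return "ran job for {}".format(user_name)


def app(environ, start_response):
    """WSGI entry point: one URL per scheduled job, arguments in the query string."""
    if environ.get("PATH_INFO", "") != "/tasks/run":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

    # Assumption: only App Engine's cron service can set this header.
    if environ.get("HTTP_X_APPENGINE_CRON") != "true":
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"cron only"]

    # What used to be sys.argv[1] now arrives as ?user_name=...
    query = parse_qs(environ.get("QUERY_STRING", ""))
    user_name = query.get("user_name", ["anonymous"])[0]

    body = run_job(user_name)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body.encode("utf-8")]
```

The matching `cron.yaml` entry would then point its `url` at something like `/tasks/run?user_name=ana` with the desired schedule, so each scheduled request ends up calling `run_job` with that argument.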

You can find a Python 3 skeleton in Quickstart for Python 3 in the App Engine Standard Environment

Alternatively you could, of course, use an IaaS service instead of GAE, like Google Compute Engine, where you could run your scripts directly, with a traditional cron service.

Dan Cornilescu answered Sep 22 '22 18:09