Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

caching mysql query returned by Python script

I have this python '2.6.4' script:

#!/usr/bin/python
import MySQLdb
import csv
import pprint

db = MySQLdb.connect(host="localhost", # The Host
                     user="username", # username
                      passwd="password", # password
                      db="dbname") # name of the data base

cursor = db.cursor() 

cursor.execute("SELECT name, id, city, storeid FROM Products;")

StudentsData = cursor.fetchall()
pprint.pprint(StudentsData)

this query return millions of rows, and it takes 3 minutes to execute, and it run 20 times a day, and everyuser has to wait three minutes to fetch the output.

I was thinking about caching it, but I don't know how to do that in python, will that be a good idea, if so how can I do it, using the obove code.

please help me if there's a better way to do this?

like image 630
mongotop Avatar asked Feb 16 '23 23:02

mongotop


1 Answers

The decision of whether to cache is entirely up to you. Will stale data be a problem?

The simplest caching scheme I can think of is something like the following using Python's pickle module:

import MySQLdb
import csv
import pprint
import time

MAX_CACHE_AGE = 60*20  # 20 Minutes
CACHE_FILENAME = 'results.cache'

with open(CACHE_FILENAME, 'r') as cache:
    cached = pickle.load(cache)

if(time.time() > cached['timestamp'] + MAX_CACHE_AGE):
    # Cache too old, run query
    db = MySQLdb.connect(host="localhost", # The Host
                     user="username", # username
                     passwd="password", # password
                     db="dbname") # name of the data base

    cursor = db.cursor() 
    cursor.execute("SELECT name, id, city, storeid FROM Products;")
    StudentsData = cursor.fetchall()

    # Update cache file
    data = {'results': StudentsData, 'timestamp':time.time()}
    with open(CACHE_FILENAME, 'w') as cache:
        pickle.dump(data, cache)

else:
    # Cached data is fresh enough, use that
    StudentsData = cached['results']

pprint.pprint(StudentsData)

You'll need to initialize the results.cache file once manually.


EDIT

The with syntax is a context manager and was introduced in Python 2.5.

with open(CACHE_FILENAME, 'r') as cache:
    cached = pickle.load(cache)

Can be rewritten as

cached = open(CACHE_FILENAME, 'r')
cached = pickle.load(cache)
cached.close()

EDIT2

After a long discussion in chat, the following works:

import MySQLdb
import csv
import pprint
import time
import pickle

MAX_CACHE_AGE = 60*20  # 20 Minutes
CACHE_FILENAME = 'results.cache'

regen = False
try:
    with open(CACHE_FILENAME, 'r') as cache:
        cached = pickle.load(cache)

    if(time.time() > cached['timestamp'] + MAX_CACHE_AGE):
        print("Cache too old: regenerating cache")
        regen = True
    else:
        print("Cached data is fresh enough: loading results from cache")

except IOError:
    print("Error opening %s: regenerating cache" % CACHE_FILENAME)
    regen = True

if(regen):
    # Cache too old, run query
    db = MySQLdb.connect(host="localhost", # The Host
                     user="username", # username
                     passwd="password", # password
                     db="dbname") # name of the data base

    cursor = db.cursor()
    cursor.execute("SELECT name, id, city, storeid FROM Products;")
    StudentsData = cursor.fetchall()
    cursor.close()

    # Update cache file
    data = {'results': StudentsData, 'timestamp':time.time()}
    with open(CACHE_FILENAME, 'w') as cache:
        pickle.dump(data, cache)

else:
    # Cached data is fresh enough, use that
    StudentsData = cached['results']


print StudentsData
like image 129
jedwards Avatar answered Mar 02 '23 00:03

jedwards