I have designed a trading application that handles customers stocks investment portfolio.
I am using two datastore kinds:
db.Model python modules:
class Stocks (db.Model):
stockname = db.StringProperty(multiline=True)
dailyPercentChange=db.FloatProperty(default=1.0)
class UserTransactions (db.Model):
buyer = db.UserProperty()
value=db.FloatProperty()
stockref = db.ReferenceProperty(Stocks)
Once an hour I need to update the database: update the daily percent change in Stocks and then update the value of all entities in UserTransactions that refer to that stock.
The following python module iterates over all the stocks, update the dailyPercentChange property, and invoke a task to go over all UserTransactions entities which refer to the stock and update their value:
Stocks.py
# Iterate over all stocks in datastore
for stock in Stocks.all():
# update daily percent change in datastore
db.run_in_transaction(updateStockTxn, stock.key())
# create a task to update all user transactions entities referring to this stock
taskqueue.add(url='/task', params={'stock_key': str(stock.key(), 'value' : self.request.get ('some_val_for_stock') })
def updateStockTxn(stock_key):
#fetch the stock again - necessary to avoid concurrency updates
stock = db.get(stock_key)
stock.dailyPercentChange= data.get('some_val_for_stock') # I get this value from outside
... some more calculations here ...
stock.put()
Task.py (/task)
# Amount of transaction per task
amountPerCall=10
stock=db.get(self.request.get("stock_key"))
# Get all user transactions which point to current stock
user_transaction_query=stock.usertransactions_set
cursor=self.request.get("cursor")
if cursor:
user_transaction_query.with_cursor(cursor)
# Spawn another task if more than 10 transactions are in datastore
transactions = user_transaction_query.fetch(amountPerCall)
if len(transactions)==amountPerCall:
taskqueue.add(url='/task', params={'stock_key': str(stock.key(), 'value' : self.request.get ('some_val_for_stock'), 'cursor': user_transaction_query.cursor() })
# Iterate over all transaction pointing to stock and update their value
for transaction in transactions:
db.run_in_transaction(updateUserTransactionTxn, transaction.key())
def updateUserTransactionTxn(transaction_key):
#fetch the transaction again - necessary to avoid concurrency updates
transaction = db.get(transaction_key)
transaction.value= transaction.value* self.request.get ('some_val_for_stock')
db.put(transaction)
The problem:
Currently the system works great, but the problem is that it is not scaling well… I have around 100 Stocks with 300 User Transactions, and I run the update every hour. In the dashboard, I see that the task.py takes around 65% of the CPU (Stock.py takes around 20%-30%) and I am using almost all of the 6.5 free CPU hours given to me by app engine. I have no problem to enable billing and pay for additional CPU, but the problem is the scaling of the system… Using 6.5 CPU hours for 100 stocks is very poor.
I was wondering, given the requirements of the system as mentioned above, if there is a better and more efficient implementation (or just a small change that can help with the current implemntation) than the one presented here.
Thanks!!
Joel
There are several obvious improvements to be made:
Queue object's .add method, documented here. This is more efficient than adding tasks individually..fetch and cursors, rather than iterating; iterating fetches in small batches of 20 entities.UserTransaction entities after they're originally written? If so, you can skip the transaction and update them in batches.Finally, I'd suggest an overall refactoring: instead of starting a new task for each stock, run the outer loop inside a task, with the timer mentioned above. When you chain the next task, use cursors to pass the current state and pick up where you left off.
The only other thing to consider is if there's some way you can restructure your data to avoid the need for so many updates. Can you, for instance, make the UserTransaction entities reference some value in the Stock entities, so that you can calculate their actual value at runtime, and you only need to update the single Stock entity with the change?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With