I currently have a structure that is a dict: each value is a list that contains numeric values. Each of these numeric lists contain what (to borrow a SQL idiom) you could call a primary key containing the first three values which are: a year, a player identifier, and a team identifier. This is the key for the dict.
So you can get a unique row by passing the a value in for the year, player ID, and team ID like so:
statline = stats[(2001, 'SEA', 'suzukic01')]
Which yields something like
[305, 20, 444, 330, 45]
I'd like to alter this data structure to be quickly summed by either of these three keys: so you could easily slice the totals for a given index in the numeric lists by passing in ONE of year, player ID, and team ID, and then the index. I want to be able to do something like
hr_total = stats[year=2001, idx=3]
Where that idx of 3 corresponds to the third column in the numeric list(s) that would be retrieved.
Any ideas?
Read up on Data Warehousing. Any book.
Read up on Star Schema Design. Any book. Seriously.
You have several dimensions: Year, Player, Team.
You have one fact: score
You want to have a structure like this.
You then want to create a set of dimension indexes like this.
years = collections.defaultdict( list )
players = collections.defaultdict( list )
teams = collections.defaultdict( list )
Your fact table can be this a collections.namedtuple
. You can use something like this.
class ScoreFact( object ):
def __init__( self, year, player, team, score ):
self.year= year
self.player= player
self.team= team
self.score= score
years[self.year].append( self )
players[self.player].append( self )
teams[self.team].append( self )
Now you can find all items in a given dimension value. It's a simple list attached to a dimension value.
years['2001'] are all scores for the given year.
players['SEA'] are all scores for the given player.
etc. You can simply use sum()
to add them up. A multi-dimensional query is something like this.
[ x for x in players['SEA'] if x.year == '2001' ]
Put your data into SQLite, and use its relational engine to do the work. You can create an in-memory database and not even have to touch the disk.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With