Small Tables in Python?

Let's say I don't have more than one or two dozen objects with different properties, such as the following:

UID, Name, Value, Color, Type, Location

I want to be able to call up all objects with Location = "Boston", or Type = "Primary". Classic database query type stuff.

Most table solutions (pytables, *sql) are really overkill for such a small set of data. Should I simply iterate over all the objects and create a separate dictionary for each data column (adding values to dictionaries as I add new objects)?

This would create dicts like this:

{'Boston' : [234, 654, 234], 'Chicago' : [324, 765, 342] } - where those three-digit entries represent things like UIDs.

As you can see, querying this would be a bit of a pain.

Any thoughts of an alternative?

akoumjian asked Sep 24 '09 14:09


3 Answers

For small relational problems I love using Python's built-in sets.

For the example of location = 'Boston' OR type = 'Primary', if you had this data:

users = {
   1: dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
   2: dict(Name="Mr. Bar", Location="New York", Type="Primary"),
   3: dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
   #...
}

You can do the WHERE ... OR ... query like this:

set1 = set(u for u in users if users[u]['Location'] == 'Boston')
set2 = set(u for u in users if users[u]['Type'] == 'Primary')
result = set1.union(set2)

Or with just one expression:

result = set(u for u in users if users[u]['Location'] == 'Boston'
                              or users[u]['Type'] == 'Primary')
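
The same idea extends to AND and NOT queries through set intersection and difference. The following is only a sketch that reuses the sample data above; the variable names (in_boston, secondary, and so on) are illustrative and not part of the original answer:

```python
users = {
    1: dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
    2: dict(Name="Mr. Bar", Location="New York", Type="Primary"),
    3: dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
}

# One set of user IDs per condition.
in_boston = {u for u in users if users[u]['Location'] == 'Boston'}
secondary = {u for u in users if users[u]['Type'] == 'Secondary'}

both = in_boston & secondary          # WHERE Location = 'Boston' AND Type = 'Secondary'
either = in_boston | secondary        # WHERE Location = 'Boston' OR Type = 'Secondary'
not_boston = set(users) - in_boston   # WHERE Location <> 'Boston'
```

Each condition is evaluated once and the sets can then be combined freely, which keeps compound queries readable for a couple of dozen objects.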

You can also use functional tools such as filter (itertools.ifilter on Python 2, the lazy built-in filter on Python 3) to create fairly efficient queries of the data. For example if you want to do something similar to a GROUP BY city:

cities = ('Boston', 'New York', 'Chicago')
# Map each city to the list of user IDs located there.
cities_users = {city: list(filter(lambda u: users[u]['Location'] == city, users))
                for city in cities}

You could also build indexes manually (a dict mapping each Location to the matching user IDs) to speed things up. If this becomes too slow or unwieldy then I would probably switch to sqlite, which has been included in the Python standard library since 2.5.
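
A manual index of that kind might look roughly like the sketch below. The helper name build_index is hypothetical (it is not from the original answer), and the index has to be rebuilt or updated whenever users changes:

```python
from collections import defaultdict

users = {
    1: dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
    2: dict(Name="Mr. Bar", Location="New York", Type="Primary"),
    3: dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
}

def build_index(records, column):
    """Map each value found in `column` to the set of IDs that carry it."""
    index = defaultdict(set)
    for uid, record in records.items():
        index[record[column]].add(uid)
    return dict(index)  # plain dict, so lookups of unknown values add nothing

by_location = build_index(users, 'Location')
by_type = build_index(users, 'Type')

# Location = 'Boston' OR Type = 'Primary', answered from the indexes alone.
result = by_location.get('Boston', set()) | by_type.get('Primary', set())
```

Storing sets rather than lists of IDs means the indexes plug straight into the union and intersection queries shown earlier.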

Steven Kryskalla answered Oct 31 '22 21:10


I do not think sqlite would be "overkill" -- it comes with standard Python since 2.5, so no need to install stuff, and it can make and handle databases in either memory or local disk files. Really, how could it be simpler...? If you want everything in-memory including the initial values, and want to use dicts to express those initial values, for example...:

import sqlite3

db = sqlite3.connect(':memory:')
db.execute('Create table Users (Name, Location, Type)')
db.executemany('Insert into Users values(:Name, :Location, :Type)', [
   dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
   dict(Name="Mr. Bar", Location="New York", Type="Primary"),
   dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
   ])
db.commit()
db.row_factory = sqlite3.Row

and now your in-memory tiny "db" is ready to go. It's no harder to make a DB in a disk file and/or read the initial values from a text file, a CSV, and so forth, of course.
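
As a rough illustration of the CSV case, the sketch below loads the same three rows with csv.DictReader. The CSV text is made-up sample data, and io.StringIO stands in for a real file so that the example is self-contained:

```python
import csv
import io
import sqlite3

# Stand-in for a real file; on disk you would use open(path, newline='').
csv_text = """Name,Location,Type
Mr. Foo,Boston,Secondary
Mr. Bar,New York,Primary
Mr. Quux,Chicago,Secondary
"""

db = sqlite3.connect(':memory:')  # pass a filename instead for an on-disk DB
db.row_factory = sqlite3.Row
db.execute('CREATE TABLE Users (Name, Location, Type)')

# DictReader yields one dict per row, which matches the named placeholders.
reader = csv.DictReader(io.StringIO(csv_text))
db.executemany('INSERT INTO Users VALUES (:Name, :Location, :Type)', reader)
db.commit()

rows = db.execute('SELECT * FROM Users WHERE Location = ? OR Type = ?',
                  ('Boston', 'Primary'))
names = [r['Name'] for r in rows]
```

Because the header row supplies the dictionary keys, the column names in the CSV must match the placeholder names in the INSERT statement.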

Querying is especially flexible, easy and sweet, e.g., you can mix string insertion and parameter substitution at will (though parameter substitution is the safe choice for any value that comes from user input)...:

def where(w, *a):
  c = db.cursor()
  c.execute('Select * From Users where %s' % w, *a)
  return c.fetchall()

print([r["Name"] for r in where("Type='Secondary'")])

emits ['Mr. Foo', 'Mr. Quux'] (on Python 2, [u'Mr. Foo', u'Mr. Quux']), just like the more elegant but equivalent

print([r["Name"] for r in where('Type=?', ["Secondary"])])

and your desired query's just:

print([r["Name"] for r in where("Location='Boston' or Type='Primary'")])

etc. Seriously -- what's not to like?

Alex Martelli answered Oct 31 '22 21:10


If it's really a small amount of data, I'd not bother with an index and probably just write a helper function:

users = [
   dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
   dict(Name="Mr. Bar", Location="New York", Type="Primary"),
   dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
   ]

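def search(dictlist, **kwargs):
   # Keep only the dicts that match every keyword argument;
   # a dict that lacks one of the keys does not match.
   def match(d):
      for k, v in kwargs.items():
         try:
            if d[k] != v:
               return False
         except KeyError:
            return False
      return True

   return [d for d in dictlist if match(d)]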

Which will allow nice looking queries like this:

result = search(users, Type="Secondary")
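
Several keyword arguments combine as an AND, and a key that no object has simply yields an empty result. The sketch below shows this with a compact variant of the helper written with all(); the variant and the Colour key are illustrative assumptions, not part of the original answer:

```python
users = [
    dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
    dict(Name="Mr. Bar", Location="New York", Type="Primary"),
    dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
]

def search(dictlist, **kwargs):
    """Return the dicts that contain every given key with the given value."""
    return [d for d in dictlist
            if all(k in d and d[k] == v for k, v in kwargs.items())]

# Both conditions must hold (AND).
secondary_in_boston = search(users, Type="Secondary", Location="Boston")

# No object has a Colour key, so nothing matches.
no_match = search(users, Colour="Red")
```

An OR query would need either two calls whose results are merged or one of the set-based approaches from the other answers.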

Kenan Banks answered Oct 31 '22 21:10