Dictionary where keys are pairs of integers in Python

How is it possible in Python to create a dictionary whose keys are pairs of integers?

For example, if I do this:

mydict=dict()
mydict[ [1,2] ] = 'xxx'

I get the error TypeError: unhashable type: 'list'

So I came up with two different solutions: strings or tuples as keys.

The first solution converts the pair of integers into a string representation:

mydict=dict()
mydict[ str(1)+" "+str(2) ] = 'xxx'

while the second solution involves tuples:

mydict=dict()
mydict[ tuple([1,2]) ] = 'xxx'

From some experiments I've found that the tuple solution is slower than the string one. Is there a faster, more efficient way to simply use two integers as a key?

linello asked Nov 14 '12 08:11


1 Answer

You should probably use a tuple, which can be hashed:

mydict = {}
mydict[(1, 2)] = 'xxx'
# or more concisely (@JamesHenstridge):
mydict[1,2] = 'xxx'
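As a rough illustration of why the tuple works where the list does not: dictionary keys must be hashable, and a tuple of integers is. The name `grid` and the sample values below are made up for the example.

```python
# Hypothetical example data; only the key types matter here.
grid = {}
grid[1, 2] = 'xxx'      # the comma builds a tuple, so this key is (1, 2)
grid[(3, 4)] = 'yyy'

# A list cannot be hashed, so it is rejected as a key.
try:
    grid[[1, 2]] = 'zzz'
    list_key_ok = True
except TypeError:
    list_key_ok = False

# An existing list can be converted with tuple() at lookup time.
pair = [1, 2]
value = grid[tuple(pair)]

# Tuple keys unpack naturally when iterating over the dict.
pair_sums = {a + b for (a, b) in grid}
```

Writing the key as a literal `(1, 2)` avoids building a temporary list first, which `tuple([1, 2])` does.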

If that is actually too slow (don't optimise unnecessarily), then given a known upper bound maxB for the second integer (so that 0 <= b < maxB), you can combine the pair into a single integer index:

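def index(a, b, maxB):
    # Unique for each pair as long as 0 <= b < maxB
    return a*maxB + b

maxB = 1000  # example value: exclusive upper bound on the second integer
mydict[index(1, 2, maxB)] = 'xxx'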

But be aware that the function call itself could easily slow it down further, so you can inline the computation, at the cost of readability and of making it easier to introduce bugs if the expression is copy-pasted elsewhere:

mydict[1*maxB + 2] = 'xxx'  # maxB: exclusive upper bound on the second integer
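A minimal sketch of the index idea with the bound made explicit, in case it helps to see where it can go wrong. The names `MAX_B`, `encode` and `decode` and the value 1000 are assumptions for illustration, not part of the answer above.

```python
MAX_B = 1000  # assumed exclusive upper bound on the second integer

def encode(a, b, max_b=MAX_B):
    """Map the pair (a, b) to a single integer; requires 0 <= b < max_b."""
    if not 0 <= b < max_b:
        raise ValueError("b must satisfy 0 <= b < max_b")
    return a * max_b + b

def decode(key, max_b=MAX_B):
    """Recover (a, b) from a key produced by encode()."""
    return divmod(key, max_b)

mydict = {}
mydict[encode(1, 2)] = 'xxx'

# Without the range check, distinct pairs can collide:
# (0, 1000) and (1, 0) both give 1000 when max_b is 1000.
collision = (0 * MAX_B + 1000) == (1 * MAX_B + 0)
```

The range check costs time on every insertion, so it works against the speed motivation; it is there only to make the constraint visible.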

Incidentally, there is an SO question on read speeds of dictionaries with tuple keys:

Python tuples as keys slow?

A tiny bit of profiling showed the inline index to be marginally (<5%) faster than the tuple, and both to be about twice as fast as the index function call. Under PyPy, I would expect the index version (inline or not) to be faster.
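One way to repeat this kind of measurement yourself is a small `timeit` harness along these lines. It is only a sketch: the grid size `N`, the bound `MAX_B` and the function names are arbitrary choices, and the numbers will vary with the interpreter and machine.

```python
import timeit

MAX_B = 1000  # assumed exclusive upper bound on the second integer
N = 100       # small grid so the sketch runs quickly

def index(a, b, max_b):
    return a * max_b + b

def fill_tuple():
    d = {}
    for a in range(N):
        for b in range(N):
            d[a, b] = 'xxx'
    return d

def fill_index_call():
    d = {}
    for a in range(N):
        for b in range(N):
            d[index(a, b, MAX_B)] = 'xxx'
    return d

def fill_index_inline():
    d = {}
    for a in range(N):
        for b in range(N):
            d[a * MAX_B + b] = 'xxx'
    return d

def fill_string():
    d = {}
    for a in range(N):
        for b in range(N):
            d[str(a) + " " + str(b)] = 'xxx'
    return d

if __name__ == "__main__":
    for fn in (fill_tuple, fill_index_call, fill_index_inline, fill_string):
        # Take the best of a few repeats to reduce noise.
        best = min(timeit.repeat(fn, number=3, repeat=3))
        print("%-18s %.4f s" % (fn.__name__, best))
```

Measure the real workload rather than trusting a micro-benchmark like this one, since key construction is often a small part of the total cost.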

On a side note: if you are worried about insertion speed into a dict, you may be using the wrong data structure, or perhaps doing more work than necessary. For example, parsing a CSV file into fields for every line and storing the values in a dict as data[line, field] may be unnecessary if you can make the line parsing lazy and only parse the lines that you actually pull data out of. That is, don't do data = parseAll(somecsv); print(data[7, 'date']) when you can do dataLines = somecsv.readlines(); print(getField(dataLines[7], 'date')).
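The eager and lazy approaches could look roughly like this. The sample data and the helpers `parse_all` and `get_field` are hypothetical stand-ins for the `parseAll` and `getField` mentioned above, and the lazy version assumes no field contains an embedded newline.

```python
import csv
import io

# Made-up sample data standing in for a real CSV file.
SAMPLE = "id,date,amount\n1,2012-11-14,10\n2,2012-11-15,20\n"

def parse_all(text):
    """Eager: parse every line and store each value under (line, field)."""
    data = {}
    reader = csv.DictReader(io.StringIO(text))
    for line_no, row in enumerate(reader):
        for field, value in row.items():
            data[line_no, field] = value
    return data

def get_field(lines, header, line_no, field):
    """Lazy: parse only the requested line, then pick one column."""
    row = next(csv.reader([lines[line_no]]))
    return row[header.index(field)]

lines = SAMPLE.splitlines()
header = next(csv.reader([lines[0]]))
body = lines[1:]

eager = parse_all(SAMPLE)[1, 'date']       # parses the whole file first
lazy = get_field(body, header, 1, 'date')  # parses a single line
```

Both return the same value; the lazy version simply skips the work for lines that are never read.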

Phil H answered Sep 19 '22 13:09