Python: for loop in index assignment

Question

While working through the awesome book "Programming Collective Intelligence", by Toby Segaran, I've encountered some techniques in index assignments I'm not entirely familiar with.

Take this for example:

createkey='_'.join(sorted([str(wi) for wi in wordids]))

or:

normalizedscores = dict([(u,float(l)/maxscore) for (u,l) in linkscores.items()])

All the nested tuples in the indexes have me a bit confused. What is actually being assigned to these variables? I assume the .join one comes out as a string, but what about the latter? If someone could explain the mechanics of these loops I'd really appreciate it. I assume these are pretty common techniques, but being new to Python, I suppose to ask is a moment's shame. Thanks!

Tim Pietzcker · Accepted Answer

[str(wi) for wi in wordids]

is a list comprehension.

a = [str(wi) for wi in wordids]

is the same as

a = []
for wi in wordids:
    a.append(str(wi))

So

createkey='_'.join(sorted([str(wi) for wi in wordids]))

creates a list of strings from each item in wordids, then sorts that list and joins it into a big string using _ as a separator.
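To make the three stages visible, here is a minimal sketch that splits the one-liner into separate steps. The values in wordids are made up for illustration; the book's data will look different.

```python
# Hypothetical word IDs, chosen only for illustration.
wordids = [3, 1, 2]

# Step 1: the list comprehension converts every ID to a string.
as_strings = [str(wi) for wi in wordids]

# Step 2: sorted() returns a new, sorted list and leaves as_strings alone.
in_order = sorted(as_strings)

# Step 3: join the sorted strings, putting an underscore between them.
createkey = '_'.join(in_order)

print(as_strings)
print(in_order)
print(createkey)
```

The intermediate names as_strings and in_order are not in the original code; they only exist here so each stage can be printed.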

As agf rightly noted, you can also use a generator expression, which looks just like a list comprehension but with parentheses instead of brackets. This avoids construction of a list if you don't need it later (except for iterating over it). And if you already have parentheses there like in this case with sorted(...) you can simply remove the brackets.

However, in this special case you won't get a performance benefit (in fact, it was roughly 4 to 8 % slower in my timings) because sorted() needs to build a list anyway, but it looks a bit nicer:

createkey='_'.join(sorted(str(wi) for wi in wordids))
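If the difference between the two forms is unclear, the following sketch may help. It uses made-up word IDs and shows that a generator hands out its items on demand and can only be walked once, while both forms give the same key when passed to sorted().

```python
# Hypothetical word IDs, for illustration only.
wordids = [3, 1, 2]

# Brackets build the whole list immediately.
as_list = [str(wi) for wi in wordids]

# Parentheses build a generator that yields items one at a time, on demand.
as_gen = (str(wi) for wi in wordids)

first = next(as_gen)   # pulls just the first item
rest = list(as_gen)    # consumes whatever is left

# A generator can only be walked once; it is exhausted now.
leftover = list(as_gen)

# Passed straight to sorted(), both forms produce the same key.
key_from_list = '_'.join(sorted([str(wi) for wi in wordids]))
key_from_gen = '_'.join(sorted(str(wi) for wi in wordids))
print(key_from_list == key_from_gen)
```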

normalizedscores = dict([(u,float(l)/maxscore) for (u,l) in linkscores.items()])

iterates through the items of the dictionary linkscores, where each item is a key/value pair. It creates a list of (key, value/maxscore) tuples and then turns that list back into a dictionary.

However, since Python 2.7, you could also use dict comprehensions:

normalizedscores = {u:float(l)/maxscore for (u,l) in linkscores.items()}
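As a quick check that the two spellings are interchangeable, here is a small sketch. The contents of linkscores are invented, and computing maxscore as the largest value is an assumption made for the example, not something the question states.

```python
# Hypothetical scores; the real linkscores comes from the book's code.
linkscores = {'a': 2, 'b': 4, 'c': 8}

# Assumption for this example: maxscore is the largest value.
maxscore = max(linkscores.values())

# Older style: build a list of (key, value) tuples, then hand it to dict().
via_list = dict([(u, float(l) / maxscore) for (u, l) in linkscores.items()])

# Dict comprehension (Python 2.7 and later): same result, no intermediate list.
via_comp = {u: float(l) / maxscore for (u, l) in linkscores.items()}

print(via_list)
print(via_list == via_comp)
```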

Here's some timing data:

Python 3.2.2

>>> import timeit
>>> timeit.timeit(stmt="a = '_'.join(sorted([str(x) for x in n]))", setup="import random; n = [random.randint(0,1000) for i in range(100)]")
61.37724242267409
>>> timeit.timeit(stmt="a = '_'.join(sorted(str(x) for x in n))", setup="import random; n = [random.randint(0,1000) for i in range(100)]")
66.01814811313774

Python 2.7.2

>>> import timeit
>>> timeit.timeit(stmt="a = '_'.join(sorted([str(x) for x in n]))", setup="import random; n = [random.randint(0,1000) for i in range(100)]")
58.01728623923137
>>> timeit.timeit(stmt="a = '_'.join(sorted(str(x) for x in n))", setup="import random; n = [random.randint(0,1000) for i in range(100)]")
60.58927580777687
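The calls above use timeit's default of one million repetitions, which is why each takes about a minute. If you just want to compare the two forms on your own machine, a shorter run along these lines should be enough. The repetition count of 2000 is an arbitrary choice, and the absolute numbers will differ from the ones shown above.

```python
import timeit

# Same setup as in the timings above: 100 random integers.
setup = "import random; n = [random.randint(0, 1000) for i in range(100)]"

# number=2000 is an arbitrary, much smaller repetition count than the default.
t_list = timeit.timeit(
    stmt="a = '_'.join(sorted([str(x) for x in n]))", setup=setup, number=2000)
t_gen = timeit.timeit(
    stmt="a = '_'.join(sorted(str(x) for x in n))", setup=setup, number=2000)

print("list comprehension: ", t_list)
print("generator expression:", t_gen)
```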

NPE · Answer

Let's take the first one:

str(wi) for wi in wordids takes each element in wordids and converts it to string.
sorted(...) sorts them (lexicographically).
'_'.join(...) merges the sorted word ids into a single string with underscores between entries.
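The word "lexicographically" matters here: because the ids have already been turned into strings, they are compared character by character, not by numeric value. A small sketch with made-up ids shows the difference, and also that sorting makes the key independent of the order the ids arrive in.

```python
# Hypothetical word IDs, for illustration only.
wordids = [9, 10, 100]

# Sorting the strings: '10' < '100' < '9', because '1' sorts before '9'.
as_text = sorted(str(wi) for wi in wordids)

# Sorting the integers themselves gives numeric order instead.
as_numbers = sorted(wordids)

# Because of the sort, the same ids in a different order give the same key.
key_a = '_'.join(sorted(str(wi) for wi in [9, 10, 100]))
key_b = '_'.join(sorted(str(wi) for wi in [100, 9, 10]))

print(as_text)
print(as_numbers)
print(key_a, key_a == key_b)
```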

Now the second one:

normalizedscores = dict([(u,float(1)/maxscore) for (u,l) in linkscores.items()])

linkscores is a dictionary (or a dictionary-like object).
for (u,l) in linkscores.items() iterates over all entries in the dictionary, for each entry assigning the key and the value to u and l.
(u,float(1)/maxscore) is a tuple, the first element of which is u and the second element is 1/maxscore (to me, this looks like it might be a typo: float(l)/maxscore would make more sense -- note the lowercase letter el in place of one).
dict(...) constructs a dictionary from the list of tuples, where the first element of each tuple is taken as the key and the second is taken as the value.

In short, it makes a copy of the dictionary, preserving the keys and dividing each value by maxscore.
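To illustrate both the "copy" behaviour and why the digit one looks like a typo, here is a sketch with invented data. The page names and the value of maxscore are assumptions for the example.

```python
# Hypothetical data; maxscore is simply set by hand for this example.
linkscores = {'page1': 3, 'page2': 6}
maxscore = 6

# With the letter el: each value is divided by maxscore.
with_el = dict([(u, float(l) / maxscore) for (u, l) in linkscores.items()])

# With the digit one: l is ignored, so every key gets the same value.
with_one = dict([(u, float(1) / maxscore) for (u, l) in linkscores.items()])

print(with_el)
print(with_one)

# Either way a new dictionary is built; the original is left untouched.
print(linkscores)
```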

Tags:

python

dictionary

indexing

variable-assignment

DeaconDesperado
