I'm learning python and came across this example of a simulation of a model I've seen before. One of the functions looked unnecessarily long, so I thought it would be good practice to try to make it more efficient. My attempt, while requiring less code, is about 1/60th as fast. Yes, I made it 60 times worse.
My question is, where have I gone wrong? I've tried timing individual parts of the function and don't see where the bottleneck is.
Here's the original function. It's for a model where people live on a grid and their happiness depends on whether they're the same race as most of their neighbors. (It's Schelling's segregation model.) So we give it an x,y coordinate for a person and determine their happiness by checking the race of each of their neighbors.
def is_unhappy(self, x, y):
race = self.agents[(x,y)]
count_similar = 0
count_different = 0
if x > 0 and y > 0 and (x-1, y-1) not in self.empty_houses:
if self.agents[(x-1, y-1)] == race:
count_similar += 1
else:
count_different += 1
if y > 0 and (x,y-1) not in self.empty_houses:
if self.agents[(x,y-1)] == race:
count_similar += 1
else:
count_different += 1
if x < (self.width-1) and y > 0 and (x+1,y-1) not in self.empty_houses:
if self.agents[(x+1,y-1)] == race:
count_similar += 1
else:
count_different += 1
if x > 0 and (x-1,y) not in self.empty_houses:
if self.agents[(x-1,y)] == race:
count_similar += 1
else:
count_different += 1
if x < (self.width-1) and (x+1,y) not in self.empty_houses:
if self.agents[(x+1,y)] == race:
count_similar += 1
else:
count_different += 1
if x > 0 and y < (self.height-1) and (x-1,y+1) not in self.empty_houses:
if self.agents[(x-1,y+1)] == race:
count_similar += 1
else:
count_different += 1
if x > 0 and y < (self.height-1) and (x,y+1) not in self.empty_houses:
if self.agents[(x,y+1)] == race:
count_similar += 1
else:
count_different += 1
if x < (self.width-1) and y < (self.height-1) and (x+1,y+1) not in self.empty_houses:
if self.agents[(x+1,y+1)] == race:
count_similar += 1
else:
count_different += 1
if (count_similar+count_different) == 0:
return False
else:
return float(count_similar)/(count_similar+count_different) < self.similarity_threshold
And here is my code, which, as I said, is MUCH slower. I wanted to avoid having all the if statements above by just creating a list of "offsets" to add to each person's coordinates to determine the locations of possible neighbors, check if that is a valid position, and then check the neighbor's race.
def is_unhappy2(self, x, y):
thisRace = self.agents[(x,y)]
count_same = 0
count_other = 0
for xo, yo in list(itertools.product([-1,0,1],[-1,0,1])):
if xo==0 and yo==0:
# do nothing for case of no offset
next
else:
# check if there's a neighbor at the offset of (xo, yo)
neighbor = tuple(np.add( (x,y), (xo,yo) ))
if neighbor in self.agents.keys():
if self.agents[neighbor] == thisRace:
count_same += 1
else:
count_other += 1
if count_same+count_other == 0:
return False
else:
return float(count_same) / (count_same + count_other) < self.similarity threshold
(The rest of the code that creates the class is on the site where the example comes from.)
Here are the timing results:
%timeit s.is_unhappy2(49,42)
100 loops, best of 3: 5.99 ms per loop
%timeit s.is_unhappy(49,42)
10000 loops, best of 3: 103 µs per loop
I'm hoping someone with python knowledge can see immediately what I'm doing wrong without having to get into the nitty-gritty of the rest of the code. Can you see why my code is so much worse than the original?
Don't use np.add
, just use neighbor = (x+xo, y+yo)
. That should make it much faster (10x faster in my little test).
You can also...
if neighbor in self.agents:
without the .keys()
list
xo or yo
and not have an empty if-blockResult:
for xo, yo in itertools.product([-1,0,1],[-1,0,1]):
if xo or yo:
neighbor = self.agents.get((x+xo, y+yo))
if neighbor is not None:
if neighbor == thisRace:
count_same += 1
else:
count_other += 1
And you can add
self.neighbor_deltas = tuple(set(itertools.product([-1,0,1],[-1,0,1])) - {(0, 0)})
to the class initializer and then your function can just use those pre-computed deltas:
for xo, yo in self.neighbor_deltas:
neighbor = self.agents.get((x+xo, y+yo))
if neighbor is not None:
if neighbor == thisRace:
count_same += 1
else:
count_other += 1
Congrats for deciding to improve that author's ridiculously repetitive code, btw.
The culprit looks to be this line:
neighbor = tuple(np.add( (x,y), (xo,yo) ))
Changing it to this shows a huge speedup:
neighbor = (x + xo, y + yo)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With