I have a very simple Python routine that involves cycling through a list of roughly 20,000 latitude/longitude coordinates and calculating the distance of each point to a reference point.
def compute_nearest_points( lat, lon, nPoints=5 ):
    """Find the nearest N points, given the input coordinates."""
    points = session.query(PointIndex).all()
    oldNearest = []
    newNearest = []
    for n in xrange(nPoints):
        # Fill both lists with placeholder entries
        oldNearest.append(PointDistance(None, None, None, 99999.0, 99999.0))
        newNearest.append(PointDistance(None, None, None, 99999.0, 99999.0))

    # This is almost certainly an inappropriate use of deepcopy
    #  but how SHOULD I be doing this?!?!
    for point in points:
        distance = compute_spherical_law_of_cosines( lat, lon, point.avg_lat, point.avg_lon )
        k = 0
        for p in oldNearest:
            if distance < p.distance:
                newNearest[k] = PointDistance(
                    point.point, point.kana, point.english, point.avg_lat, point.avg_lon, distance=distance
                )
                break
            else:
                newNearest[k] = deepcopy(oldNearest[k])
                k += 1
        for j in range(k, nPoints-1):
            newNearest[j+1] = deepcopy(oldNearest[j])
        oldNearest = deepcopy(newNearest)

    # We're done, now print the result
    for point in oldNearest:
        print point.point, point.english, point.distance
    return
I initially wrote this in C, using the exact same approach, and it works fine there: it is basically instantaneous for nPoints <= 100. So I decided to port it to Python because I wanted to use SQLAlchemy to do some other stuff.
I first ported it without the deepcopy statements that now pepper the method, and this caused the results to be 'odd', or partially incorrect, because some of the points were just getting copied as references (I guess? I think?) -- but it was still pretty nearly as fast as the C version.
Now, with the deepcopy calls added, the routine does its job correctly, but it has incurred an extreme performance penalty and takes several seconds to do the same job.
This seems like a pretty common job, but I'm clearly not doing it the Pythonic way. How should I be doing this so that I still get the correct results but don't have to include deepcopy everywhere?
EDIT:
I've hit on a much simpler and faster solution:

def compute_nearest_points2( lat, lon, nPoints=5 ):
    """Find the nearest N points, given the input coordinates."""
    points = session.query(PointIndex).all()
    nearest = []
    for point in points:
        distance = compute_spherical_law_of_cosines( lat, lon, point.avg_lat, point.avg_lon )
        nearest.append(
            PointDistance(
                point.point, point.kana, point.english, point.avg_lat, point.avg_lon, distance=distance
            )
        )
    nearest_points = sorted(nearest, key=lambda point: point.distance)[:nPoints]
    for item in nearest_points:
        print item.point, item.english, item.distance
    return
So basically I'm just making a complete copy of each input point and appending a new value -- the distance from the reference point. Then I just apply sorted to the resulting list, specifying that the sort key should be the distance property of the PointDistance object.
This is much faster than using deepcopy, although I confess I don't really understand why. I guess it comes down to the efficient C implementation of Python's sorted?
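As a side note, the lambda key could also be written with operator.attrgetter; a minimal equivalent sketch, using the same nearest list as above:

from operator import attrgetter

# same selection as before, with attrgetter in place of the lambda
nearest_points = sorted(nearest, key=attrgetter('distance'))[:nPoints]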
Okay, simplest things first: deepcopy is slow in general, since it has to do a lot of internal bookkeeping to copy pathological cases (like objects containing themselves) in a sane way. See, for instance, this page, or take a look at the source code of deepcopy in copy.py, which is somewhere in your Python path.
sorted is fast, since it is implemented in C -- much faster than an equivalent sort written in Python.
Now, something more about Python's reference semantics, as you asked about in your comment. In Python, variables are references. When you say a = 1, think about it as having 1 as an object existing on its own, and a is just a tag attached to it. In some other languages, like C, variables are containers (not tags), and when you do a = 1, you actually put 1 into a. This does not hold for Python, where variables are references. This has some interesting consequences that you may also have stumbled upon:
>>> a = [] # construct a new list, attach a tag named "a" to it
>>> b = a # attach a tag named "b" to the object which is tagged by "a"
>>> a.append(1) # append 1 to the list tagged by "a"
>>> print b # print the list tagged by "b"
[1]
This behaviour is seen because lists are mutable objects: you can modify a list after you have created it, and the modification is seen when accessing the list through any of the variables that refer to it. The immutable equivalents of lists are tuples:
>>> a = () # construct a new tuple, attach a tag named "a" to it
>>> b = a # now "b" refers to the same empty tuple as "a"
>>> a += (1, 2) # appending some elements to the tuple
>>> print b
()
Here, a += (1, 2) creates a new tuple from the existing tuple referred to by a, plus another tuple (1, 2) that is constructed on-the-fly, and a is adjusted to point to the new tuple, while of course b still refers to the old one. The same happens with simple numeric additions like a = a + 2: in this case, the number originally pointed to by a is not mutated in any way; Python "constructs" a new number and moves a to point to it. So, in a nutshell: numbers, strings and tuples are immutable; lists, dicts and sets are mutable. User-defined classes are in general mutable, unless you explicitly ensure that the internal state cannot be mutated. And there's frozenset, which is an immutable set. Plus many others, of course :)
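To make the copying options concrete, here is a minimal sketch of the usual idioms (the nested list is just an illustration); a shallow copy duplicates only the outer list, while deepcopy duplicates nested mutable state as well:

from copy import copy, deepcopy

a = [1, 2, [3, 4]]

b = a            # no copy at all: "b" is just another tag on the same list
c = copy(a)      # shallow copy (same as a[:]): new outer list, shared inner list
d = deepcopy(a)  # deep copy: the inner list is duplicated too

a[2].append(5)
print b[2]  # [3, 4, 5] -- same object as a[2]
print c[2]  # [3, 4, 5] -- the shallow copy shares the inner list
print d[2]  # [3, 4]    -- the deep copy does not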
I don't know why your original code didn't work, but you probably hit a behaviour related to the code snippet I've shown with the lists, since your PointDistance class is also mutable by default. An alternative could be the namedtuple class from collections, which constructs a tuple-like object whose fields can also be accessed by name. For instance, you could have done this:
from collections import namedtuple
PointDistance = namedtuple("PointDistance", "point distance")
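A quick sketch of how instances of such a namedtuple behave (the field values here are made up for illustration):

pd = PointDistance(point="Yoyogi", distance=3.25)
print pd.point        # fields can be read by name...
print pd[1]           # ...or by index, like a plain tuple
# pd.distance = 1.0   # would raise AttributeError: instances are immutable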
The namedtuple call creates a PointDistance class for you that has two named fields: point and distance. In your main for loop, you can assign these appropriately. Since the point objects referred to by the point fields won't be modified during the course of your for loop, and distance is a number (which is, by definition, immutable), you should be safe doing it this way. But yes, in general, it seems like simply using sorted is faster, since sorted is implemented in C. You might also be lucky with the heapq module, which implements a heap data structure backed by an ordinary Python list, and therefore lets you find the top k elements easily without having to sort the others. However, since heapq is also implemented in Python, chances are that sorted works better unless you have a whole lot of points.
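For the record, a minimal sketch of the heapq variant (assuming the nearest list and nPoints from the second function in the question; heapq.nsmallest does a partial selection instead of a full sort):

import heapq

nearest_points = heapq.nsmallest(
    nPoints, nearest, key=lambda point: point.distance
)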
Finally, I'd like to add that I have never had to use deepcopy so far, so I guess there are ways to avoid it in most cases.
I understand this doesn't directly address your question (and I know this is an old question), but since there is some discussion about performance, it may be worthwhile to look at the append operation. You may want to consider "pre-allocating" the array. For example:
array = [None] * num_elements
for i in range(num_elements):
    array[i] = True

versus:

array = []
for i in range(num_elements):
    array.append(True)
A simple timeit run of these two methods shows a 25% improvement if you pre-allocate the array, even for moderate values of num_elements.
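A sketch of how such a comparison might be run with the timeit module (the element count and repeat count here are arbitrary, and the exact numbers will vary by machine and Python version):

import timeit

setup = "num_elements = 100000"

prealloc = """
array = [None] * num_elements
for i in range(num_elements):
    array[i] = True
"""

append = """
array = []
for i in range(num_elements):
    array.append(True)
"""

print timeit.timeit(prealloc, setup=setup, number=100)
print timeit.timeit(append, setup=setup, number=100)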