I need this function to be optimized as I am trying to make my OpenGL simulation run faster. I want to use Parakeet, but I can't quite understand in what way I would need to modify the code below in order to do so. Can you see what I should do?
def distanceMatrix(self,x,y,z):
" ""Computes distances between all particles and places the result in a matrix such that the ij th matrix entry corresponds to the distance between particle i and j"" "
xtemp = tile(x,(self.N,1))
dx = xtemp - xtemp.T
ytemp = tile(y,(self.N,1))
dy = ytemp - ytemp.T
ztemp = tile(z,(self.N,1))
dz = ztemp - ztemp.T
# Particles 'feel' each other across the periodic boundaries
if self.periodicX:
dx[dx>self.L/2]=dx[dx > self.L/2]-self.L
dx[dx<-self.L/2]=dx[dx < -self.L/2]+self.L
if self.periodicY:
dy[dy>self.L/2]=dy[dy>self.L/2]-self.L
dy[dy<-self.L/2]=dy[dy<-self.L/2]+self.L
if self.periodicZ:
dz[dz>self.L/2]=dz[dz>self.L/2]-self.L
dz[dz<-self.L/2]=dz[dz<-self.L/2]+self.L
# Total Distances
d = sqrt(dx**2+dy**2+dz**2)
# Mark zero entries with negative 1 to avoid divergences
d[d==0] = -1
return d, dx, dy, dz
From what I can tell, Parakeet should be able to use the above function without modifications - it only uses Numpy and math. But, I always get the following error when calling the function from the Parakeet jit wrapper:
AssertionError: Unsupported function: <bound method Particles.distanceMatrix of <particles.Particles instance at 0x04CD8E90>>
Parakeet is still young, its NumPy support is incomplete, and your code touches on several features that don't yet work.
1) You're wrapping a method, while Parakeet so far only knows how to deal with functions. The common workaround is to make a @jit wrapped helper function and have your method call into that with all of the required member data. The reason that methods don't work is that it's non-trivial to assign a meaningful type to 'self'. It's not impossible, but tricky enough that methods won't make their way into Parakeet until lower hanging fruit are plucked. Speaking of low-hanging fruit...
2) Boolean indexing. Not yet implemented but will be in the next release.
3) np.tile: Also doesn't work, will also probably be in the next release. If you want to see which builtins and NumPy library functions will work, take a look at Parakeet's mappings module.
I rewrote your code to be a little friendlier to Parakeet:
@jit
def parakeet_dist(x, y, z, L, periodicX, periodicY, periodicZ):
# perform all-pairs computations more explicitly
# instead of tile + broadcasting
def periodic_diff(x1, x2, periodic):
diff = x1 - x2
if periodic:
if diff > (L / 2): diff -= L
if diff < (-L/2): diff += L
return diff
dx = np.array([[periodic_diff(x1, x2, periodicX) for x1 in x] for x2 in x])
dy = np.array([[periodic_diff(y1, y2, periodicY) for y1 in y] for y2 in y])
dz = np.array([[periodic_diff(z1, z2, periodicZ) for z1 in z] for z2 in z])
d= np.sqrt(dx**2 + dy**2 + dz**2)
# since we can't yet use boolean indexing for masking out zero distances
# have to fall back on explicit loops instead
for i in xrange(len(x)):
for j in xrange(len(x)):
if d[i,j] == 0: d[i,j] = -1
return d, dx, dy, dz
On my machine this runs only ~3x faster than NumPy for N = 2000 (0.39s for NumPy vs. 0.14s for Parakeet). If I rewrite the array traversals to use loops more explicitly then the performance goes up to ~6x faster than NumPy (Parakeet runs in ~0.06s):
@jit
def loopy_dist(x, y, z, L, periodicX, periodicY, periodicZ):
N = len(x)
dx = np.zeros((N,N))
dy = np.zeros( (N,N) )
dz = np.zeros( (N,N) )
d = np.zeros( (N,N) )
def periodic_diff(x1, x2, periodic):
diff = x1 - x2
if periodic:
if diff > (L / 2): diff -= L
if diff < (-L/2): diff += L
return diff
for i in xrange(N):
for j in xrange(N):
dx[i,j] = periodic_diff(x[j], x[i], periodicX)
dy[i,j] = periodic_diff(y[j], y[i], periodicY)
dz[i,j] = periodic_diff(z[j], z[i], periodicZ)
d[i,j] = dx[i,j] ** 2 + dy[i,j] ** 2 + dz[i,j] ** 2
if d[i,j] == 0: d[i,j] = -1
else: d[i,j] = np.sqrt(d[i,j])
return d, dx, dy, dz
With a little creative rewriting you can also get the above code running in Numba, but it only goes ~1.5x faster than NumPy (0.25 seconds). The compile times were Parakeet w/ comprehensions: 1 second, Parakeet w/ loops: .5 seconds, Numba w/ loops: 0.9 seconds.
Hopefully the next few releases will enable more idiomatic use of NumPy library functions, but for now comprehensions or loops are often the way to go.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With