Optimizing Python distance calculation while accounting for periodic boundary conditions

Question

I have written a Python script to calculate the distance between two points in 3D space while accounting for periodic boundary conditions. The problem is that I need to do this calculation for many, many points and the calculation is quite slow. Here is my function.

def PBCdist(coord1,coord2,UC):
    dx = coord1[0] - coord2[0]
    if (abs(dx) > UC[0]*0.5):
       dx = UC[0] - dx
    dy = coord1[1] - coord2[1]
    if (abs(dy) > UC[1]*0.5):
       dy = UC[1] - dy
    dz = coord1[2] - coord2[2]
    if (abs(dz) > UC[2]*0.5):
       dz = UC[2] - dz
    dist = np.sqrt(dx**2 + dy**2 + dz**2)
    return dist

I then call the function as so

for i, coord2 in enumerate(coordlist):
  if (PBCdist(coord1,coord2,UC) < radius):
      do something with i

Recently I read that I can greatly increase performance by using list comprehension. The following works for the non-PBC case, but not for the PBC case

coord_indices = [i for i, y in enumerate([np.sqrt(np.sum((coord2-coord1)**2)) for coord2 in coordlist]) if y < radius]
for i in coord_indices:
   do something

Is there some way to do the equivalent of this for the PBC case? Is there an alternative that would work better?

Sven Marnach · Accepted Answer

You should write your distance() function in a way that you can vectorise the loop over the 5711 points. The following implementation accepts an array of points as either the x0 or x1 parameter:

def distance(x0, x1, dimensions):
    delta = numpy.abs(x0 - x1)
    delta = numpy.where(delta > 0.5 * dimensions, delta - dimensions, delta)
    return numpy.sqrt((delta ** 2).sum(axis=-1))

Example:

>>> dimensions = numpy.array([3.0, 4.0, 5.0])
>>> points = numpy.array([[2.7, 1.5, 4.3], [1.2, 0.3, 4.2]])
>>> distance(points, [1.5, 2.0, 2.5], dimensions)
array([ 2.22036033,  2.42280829])

The result is the array of distances between the points passed as second parameter to distance() and each point in points.

Lauritz V. Thaulow · Answer

import numpy as np

bounds = np.array([10, 10, 10])
a = np.array([[0, 3, 9], [1, 1, 1]])
b = np.array([[2, 9, 1], [5, 6, 7]])

min_dists = np.min(np.dstack(((a - b) % bounds, (b - a) % bounds)), axis = 2)
dists = np.sqrt(np.sum(min_dists ** 2, axis = 1))

Here a and b are lists of vectors you wish to calculate the distance between and bounds are the boundaries of the space (so here all three dimensions go from 0 to 10 and then wrap). It calculates the distances between a[0] and b[0], a[1] and b[1], and so on.

I'm sure numpy experts could do better, but this will probably be an order of magnitude faster than what you're doing, since most of the work is now done in C.

Optimizing Python distance calculation while accounting for periodic boundary conditions

Tags:

python

optimization

list-comprehension

johnjax

2 Answers

Sven Marnach

Lauritz V. Thaulow

Recent Activity

Donate For Us

Optimizing Python distance calculation while accounting for periodic boundary conditions

Tags:

python

optimization

list-comprehension

johnjax

2 Answers

Sven Marnach

Lauritz V. Thaulow

Related questions

Recent Activity

Donate For Us