I am doing a molecular dynamics simulation of an Argon liquid in Python. I have a stable version running, however it runs slowly for more than 100 atoms. I identified the bottleneck to be the following nested for loop. It's a force calculation put inside a function that is called from my main.py script:
def computeForce(currentPositions):
potentialEnergy = 0
force = zeros((NUMBER_PARTICLES,3))
for iParticle in range(0,NUMBER_PARTICLES-1):
for jParticle in range(iParticle + 1, NUMBER_PARTICLES):
distance = currentPositions[iParticle] - currentPositions[jParticle]
distance = distance - BOX_LENGTH * (distance/BOX_LENGTH).round()
#note: this is so much faster than scipy.dot()
distanceSquared = distance[0]*distance[0] + distance[1]*distance[1] + distance[2]*distance[2]
if distanceSquared < CUT_OFF_RADIUS_SQUARED:
r2i = 1. / distanceSquared
r6i = r2i*r2i*r2i
lennardJones = 48. * r2i * r6i * (r6i - 0.5)
force[iParticle] += lennardJones*distance
force[jParticle] -= lennardJones*distance
potentialEnergy += 4.* r6i * (r6i - 1.) - CUT_OFF_ENERGY
return(force,potentialEnergy)
the variables in CAPITAL letters are constant, defined in a config.py file. "currentPositions" is a 3 by number-of-particles matrix.
I have already implemented a version of the nested for loop with scipy.weave, which was inspired from this website: http://www.scipy.org/PerformancePython.
However, I don't like the loss of flexibility. I'm interested in "vectorizing" this for loop. I just don't really get how that works. Can anybody give me a clue or a good tutorial that teaches this?
Below is my vectorized version of your code. For a dataset with 1000 points my code is roughly 50 times faster then the original:
In [89]: xyz = 30 * np.random.uniform(size=(1000, 3))
In [90]: %timeit a0, b0 = computeForce(xyz)
1 loops, best of 3: 7.61 s per loop
In [91]: %timeit a, b = computeForceVector(xyz)
10 loops, best of 3: 139 ms per loop
The code:
from numpy import zeros
NUMBER_PARTICLES = 1000
BOX_LENGTH = 100
CUT_OFF_ENERGY = 1
CUT_OFF_RADIUS_SQUARED = 100
def computeForceVector(currentPositions):
potentialEnergy = 0
force = zeros((NUMBER_PARTICLES, 3))
for iParticle in range(0, NUMBER_PARTICLES - 1):
positionsJ = currentPositions[iParticle + 1:, :]
distance = currentPositions[iParticle, :] - positionsJ
distance = distance - BOX_LENGTH * (distance / BOX_LENGTH).round()
distanceSquared = (distance**2).sum(axis=1)
ind = distanceSquared < CUT_OFF_RADIUS_SQUARED
if ind.any():
r2i = 1. / distanceSquared[ind]
r6i = r2i * r2i * r2i
lennardJones = 48. * r2i * r6i * (r6i - 0.5)
ljdist = lennardJones[:, None] * distance[ind, :]
force[iParticle, :] += (ljdist).sum(axis=0)
force[iParticle+1:, :][ind, :] -= ljdist
potentialEnergy += (4.* r6i * (r6i - 1.) - CUT_OFF_ENERGY).sum()
return (force, potentialEnergy)
I've also checked that the code produces the same results
Writing something like a MD engine in pure python is going to be slow. I would take a look at either Numba (http://numba.pydata.org/) or Cython (http://cython.org/). On the Cython side, I've written a simple Langevin Dynamics engine using cython that might serve as an example to get you started:
https://bitbucket.org/joshua.adelman/pylangevin-integrator
Another option, which I like quite a bit, is to use OpenMM. There is a python wrapper that allows you to put together all of the pieces of an MD engine, implement custom forces, etc. It also has the ability to target GPU devices:
https://simtk.org/home/openmm
In general though, there are so many highly tuned MD codes available, that unless you are doing this for some sort of general educational purpose, it doesn't make sense to write your own from scratch. Some of the major codes, just to name a few:
just to make this post complete, I add my implementation in weaved in C code. note that you need to import weave and converters for this to work. moreover, weave only works with python 2.7 right now. thanks again for all the help! this runs another 10 times faster than the vectorized version.
from scipy import weave
from scipy.weave import converters
def computeForceC(currentPositions):
code = """
using namespace blitz;
Array<double,1> distance(3);
double distanceSquared, r2i, r6i, lennardJones;
double potentialEnergy = 0.;
for( int iParticle = 0; iParticle < (NUMBER_PARTICLES - 1); iParticle++){
for( int jParticle = iParticle + 1; jParticle < NUMBER_PARTICLES; jParticle++){
distance(0) = currentPositions(iParticle,0)-currentPositions(jParticle,0);
distance(0) = distance(0) - BOX_LENGTH * round(distance(0)/BOX_LENGTH);
distance(1) = currentPositions(iParticle,1)-currentPositions(jParticle,1);
distance(1) = distance(1) - BOX_LENGTH * round(distance(1)/BOX_LENGTH);
distance(2) = currentPositions(iParticle,2)-currentPositions(jParticle,2);
distance(2) = distance(2) - BOX_LENGTH * round(distance(2)/BOX_LENGTH);
distanceSquared = distance(0)*distance(0) + distance(1)*distance(1) + distance(2)*distance(2);
if(distanceSquared < CUT_OFF_RADIUS_SQUARED){
r2i = 1./distanceSquared;
r6i = r2i * r2i * r2i;
lennardJones = 48. * r2i * r6i * (r6i - 0.5);
force(iParticle,0) += lennardJones*distance(0);
force(iParticle,1) += lennardJones*distance(1);
force(iParticle,2) += lennardJones*distance(2);
force(jParticle,0) -= lennardJones*distance(0);
force(jParticle,1) -= lennardJones*distance(1);
force(jParticle,2) -= lennardJones*distance(2);
potentialEnergy += 4.* r6i * (r6i - 1.)-CUT_OFF_ENERGY;
}
}//end inner for loop
}//end outer for loop
return_val = potentialEnergy;
"""
#args that are passed into weave.inline and created inside computeForce
#potentialEnergy = 0.
force = zeros((NUMBER_PARTICLES,3))
#all args
arguments = ['currentPositions','force','NUMBER_PARTICLES','CUT_OFF_RADIUS_SQUARED','BOX_LENGTH','CUT_OFF_ENERGY']
#evaluate stuff in code
potentialEnergy = weave.inline(code,arguments,type_converters = converters.blitz,compiler = 'gcc')
return force, potentialEnergy
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With