I am working on some simulation work for my research and have run into a snag importing fortran into my python scripts. As background, I have worked with Python for some years now, and have only toyed around inside of Fortran when the need arose.
I have done some work in the past with Fortran implementing some simple OpenMP functionality. I am no expert in this, but I have gotten the basics going before.
I am now using f2py to create a library I can call on from my python script. When I try to compile openmp, it compiles correctly and runs to completion, but there is zero improvement in speed and looking at top I see that the CPU usage indicates that only one thread is running.
I've scoured the documentation for f2py (which is not very well documented) as well as done the normal web sleuthing for answers. I've included the Fortran code I am compiling as well as a simple python script that calls on it. I also am throwing in the compile command I am using.
Currently I cut down the simulation to 10^4 as a nice benchmark. On my system it takes 3 seconds to run. Ultimately I need to run a number of 10^6 particle simulations though, so I need to bring down the time a bit.
If anyone can point me in the direction of how to get my code working, it would be super appreciated. I can also try to include any detailed information about the system as needed.
Cheers, Rylkan
1) Compile
f2py -c --f90flags='-fopenmp' -lgomp -m calc_accel_jerk calc_accel_jerk.f90
2) Python script to call
import numpy as N
import calc_accel_jerk
# a is a (1e5,7) array with M,r,v information
a = N.load('../test.npy')
a = a[:1e4]
out = calc_accel_jerk.calc(a,a.shape[0])
print out[:10]
3) Fortran code
subroutine calc (input_array, nrow, output_array)
implicit none
!f2py threadsafe
include "omp_lib.h"
integer, intent(in) :: nrow
double precision, dimension(nrow,7), intent(in) :: input_array
double precision, dimension(nrow,2), intent(out) :: output_array
! Calculation parameters with set values
double precision,parameter :: psr_M=1.55*1.3267297e20
double precision,parameter :: G_Msun=1.3267297e20
double precision,parameter :: pc_to_m=3.08e16
! Vector declarations
integer :: irow
double precision :: vfac
double precision, dimension(nrow) :: drx,dry,drz,dvx,dvy,dvz,rmag,jfac,az,jz
! Break up the input array for faster access
double precision,dimension(nrow) :: input_M
double precision,dimension(nrow) :: input_rx
double precision,dimension(nrow) :: input_ry
double precision,dimension(nrow) :: input_rz
double precision,dimension(nrow) :: input_vx
double precision,dimension(nrow) :: input_vy
double precision,dimension(nrow) :: input_vz
input_M(:) = input_array(:,1)*G_Msun
input_rx(:) = input_array(:,2)*pc_to_m
input_ry(:) = input_array(:,3)*pc_to_m
input_rz(:) = input_array(:,4)*pc_to_m
input_vx(:) = input_array(:,5)*1000
input_vy(:) = input_array(:,6)*1000
input_vz(:) = input_array(:,7)*1000
!$OMP PARALLEL DO private(vfac,drx,dry,drz,dvx,dvy,dvz,rmag,jfac,az,jz) shared(output_array) NUM_THREADS(2)
DO irow = 1,nrow
! Get the i-th iteration
vfac = sqrt(input_M(irow)/psr_M)
drx = (input_rx-input_rx(irow))
dry = (input_ry-input_ry(irow))
drz = (input_rz-input_rz(irow))
dvx = (input_vx-input_vx(irow)*vfac)
dvy = (input_vy-input_vy(irow)*vfac)
dvz = (input_vz-input_vz(irow)*vfac)
rmag = sqrt(drx**2+dry**2+drz**2)
jfac = -3*drz/(drx**2+dry**2+drz**2)
! Calculate the acceleration and jerk
az = input_M*(drz/rmag**3)
jz = (input_M/rmag**3)*((dvx*drx*jfac)+(dvy*dry*jfac)+(dvz+dvz*drz*jfac))
! Remove bad index
az(irow) = 0
jz(irow) = 0
output_array(irow,1) = sum(az)
output_array(irow,2) = sum(jz)
END DO
!$OMP END PARALLEL DO
END subroutine calc
Here is a simple check to see, wether the OpenMP threads indeed are visible within the Fortran code:
module OTmod
!$ use omp_lib
implicit none
public :: get_threads
contains
function get_threads() result(nt)
integer :: nt
nt = 0
!$ nt = omp_get_max_threads()
end function get_threads
end module OTmod
Compilation:
> f2py -m OTfor --fcompiler=gfortran --f90flags='-fopenmp' -lgomp -c OTmod.f90
Execution:
> python
>>> from OTfor import otmod
>>> otmod.get_threads()
12
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With