I have a rather complicated physics-model (Thomas Hopfield model from https://www.researchgate.net/publication/243703523_Kinetics_of_Radiative_Recombination_at_Randomly_Distributed_Donors_and_Acceptors).
For the sake of later fitting it to experimental data, I first simply want to graphically plot it and am already running into difficulties. The model reacts very sensible to changes in the integration limit and I am not sure, whether the vectorization-function creates problems with the integration.
The model consists of two analytically not-solvable integrals from 0 to infinity, and I want to plot them using the scipy.integrate.quad function.
(example code below)
As expected, integrating to infinity does not work, so I integrated up to a set value (r_max). However, the resulting function is very sensible to changes in the r_max. When increasing it (e.g. from 250e-9 to 250e-8), the resulting function gets stretched. That poses a problem, as it seems unjustified to then take the highest computable r_max and assume it similar to infinity. (What is even weirder, is that there is a theoretical value for t=0, which is most closely produced using r_max = 250e-9 and not using higher values "closer to infinity".)
#### Script for numerically integrating and plotting the Thomas-Hopfield-Model
#-------------- IMPORTING LIBRARIES --------------
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import quad
#-------------- DEFINING MODEL FUNCTION ----------------
# The physical model has three parameters (W_0, R_0, n),
# in order to fit it onto experimental data later, two constants (A, B) have been added.
# it contains two integrals over the variable r from 0 to infinity, but due to overflow issues have here been replaced by integrals from 0 to a set value r_max.
# The overall model is a function of time t.
def W(r,W_0,R_0): # exponential decay function for the recombination rate of electron-hole pair
return W_0 * np.exp( (-2 * r) / R_0 )
def integral1(r,W_0,R_0,t): # First integral (over r) for the Thomas-Hopfield model
return W(r,W_0,R_0) * np.exp( -W(r,W_0,R_0) * t ) * r**2
def integral2(r,W_0,R_0,t): # second integral (over r) for the Thomas-Hopfield model (in the exponent)
return ( np.exp( -W(r,W_0,R_0) *t ) - 1 ) *r**2
def thomas_hopfield(t,W_0,R_0,n,A,B): # comined function for the Thomas-Hopfield model
r_max = 250e-7 #np.inf #UPPER INTEGRATION BOUND
int1 = quad( integral1, 0, r_max, args=(W_0,R_0,t) )[0]
int2 = quad( integral2, 0, r_max, args=(W_0,R_0,t) )[0]
lum_int = A*( (4 * np.pi * n * int1) * np.exp(4 * np.pi * n * int2) ) + B
return lum_int
thomas_hopfield_v = np.vectorize(thomas_hopfield) # this has been added, as trying to plot it without vectorization resulted in the error: "TypeError: only size-1 arrays can be converted to Python scalars"
#--------------- EXAMPLE VALUES FOR PARAMETERS ---------------:
W_0 = 1e6
R_0 = 1.2e-9
n = 3e-19
A = 1
B = 0
#---------------------------------------------------------------
#--------- SIDE QUEST: theoretical value for t=0 --------------
theoretical_value = n*np.pi*W_0*R_0**3 # literature theoretical value for t=0
thomas_hopfield_t0 = thomas_hopfield(0,W_0,R_0,n,A,B) # calculate the theoretical value for t=0
print('Theoretical value for t=0:', theoretical_value)
print('Calculated value for t=0:', thomas_hopfield_t0) # print the calculated value for t=0
#-----------------------------------------------------------------
#----------------------------------------------------------------
#--------------- GRAPHICAL REPRESENTATION-------------
t = np.linspace(0,100000000, 100000) # create time value array
y_model = thomas_hopfield_v(t,W_0,R_0,n,A,B) # apply the model on the time t
plt.plot(t, y_model, color='r',linestyle='-',linewidth=0.7, alpha=1, label='Thomas Hopfield Model',zorder=1)
plt.yscale('log')
plt.xscale('log')
plt.show()
As someone using scipy.integrate.quad for the first time, I am asking myself how quad is computing the result (in which order it takes in the array, computes the integrals and returns the values) and whether or not the vectorization np.vectorize() of the function is the (or a) problem. I had to include np.vectorize(), as I would otherwise receive "TypeError: only size-1 arrays can be converted to Python scalars", but it makes it almost impossible for me to follow and comprehend the individual steps of the programme.
Does someone have experience with that? It would already help me to know, whether the vectorisation creates the problems and / or how I could plot the function without using it. Thank you very much in advance!
As someone using
scipy.integrate.quadfor the first time, I am asking myself howquadis computing the result (in which order it takes in the array, computes the integrals and returns the values)
Imagine that you wanted to integrate the normal distribution's probability density function from -1000 to 1000. This works fine.
import scipy
import numpy as np
centered_at_zero = scipy.stats.norm(loc=0, scale=1)
result, eps = scipy.integrate.quad(centered_at_zero.pdf, -1000, 1000)
print("result", result, "error", eps)
Output:
result 1.0 error 1.820767988928941e-12
Now, let's try something similar. Let's try integrating the same distribution, but have the distribution centered at 100. Theoretically, this shouldn't change anything.
import scipy
import numpy as np
centered_at_100 = scipy.stats.norm(loc=100, scale=1)
result, eps = scipy.integrate.quad(centered_at_100.pdf, -1000, 1000)
print("result", result, "error", eps)
Output:
result 0.0 error 0.0
Why does this happen? quad() is evaluating the integrand at various points along the region of interest. Using these points, it compares a high-degree approximation of the integral of this area to a low degree approximation of the integral of this area.
If those two are the same, then it concludes that nothing interesting is happening in this region of the function. If the two approximations are different, then it subdivides the region further, and approximates each region separately. This allows it to spend lots of time exploring complex regions of a function, and little time where the function is not doing anything interesting. If none of the points it tries are near x=100, then it will incorrectly conclude that the function is constant in this region.
In order to fix this for this example, you need to tell quad() that something interesting is happening in the function at x=100, using the points argument.
import scipy
import numpy as np
centered_at_100 = scipy.stats.norm(loc=100, scale=1)
result, eps = scipy.integrate.quad(centered_at_100.pdf, -1000, 1000, points=[100])
print("result", result, "error", eps)
Output:
result 1.0 error 2.6169364753623105e-11
Returning to your example, we need to tell quad that something interesting is happening around 1e-8. Also, as mentioned in the comments, the default value of epsabs is much too large for this problem.
def thomas_hopfield(t,W_0,R_0,n,A,B): # comined function for the Thomas-Hopfield model
r_max = 1 #np.inf #UPPER INTEGRATION BOUND
points = np.array([1e-9, 1e-8, 1e-7])
int1 = quad( integral1, 0, r_max, args=(W_0,R_0,t), points=points, epsabs=0, epsrel=1.49e-8)[0]
int2 = quad( integral2, 0, r_max, args=(W_0,R_0,t), points=points, epsabs=0, epsrel=1.49e-8)[0]
lum_int = A*( (4 * np.pi * n * int1) * np.exp(4 * np.pi * n * int2) ) + B
return lum_int
The previous example is integrating from 0 to 1, but what if we want to integrate from 0 to infinity? Here we encounter a problem. The points argument is incompatible with integrating across infinite bounds, because the library that SciPy uses does not support it.
We could solve this in two ways.
def thomas_hopfield(t,W_0,R_0,n,A,B): # comined function for the Thomas-Hopfield model
r_max = 1 #np.inf #UPPER INTEGRATION BOUND
points = np.array([1e-9, 1e-8, 1e-7])
int1_finite = quad( integral1, 0, r_max, args=(W_0,R_0,t), points=points, epsabs=0, epsrel=1.49e-8)[0]
int1_infinite = quad( integral1, r_max, np.inf, args=(W_0,R_0,t), epsabs=0, epsrel=1.49e-8)[0]
int1 = int1_finite + int1_infinite
int2_finite = quad( integral2, 0, r_max, args=(W_0,R_0,t), points=points, epsabs=0, epsrel=1.49e-8)[0]
int2_infinite = quad( integral2, r_max, np.inf, args=(W_0,R_0,t), epsabs=0, epsrel=1.49e-8)[0]
int2 = int2_finite + int2_infinite
lum_int = A*( (4 * np.pi * n * int1) * np.exp(4 * np.pi * n * int2) ) + B
return lum_int
This change permits it to quite accurately integrate the t=0 value over the range [0, inf].
Theoretical value for t=0: 1.628601631620949e-39
Calculated value for t=0: 1.628601631620949e-39
As someone using
scipy.integrate.quadfor the first time, I am asking myself [...] whether or not the vectorizationnp.vectorize()of the function is the (or a) problem.
The use of np.vectorize() here looks fine to me.
Lastly, how can we identify "interesting" points in the function? As a rule of thumb, maximums, minimums, and discontinuities are all interesting. I tried to come up with a program that would automatically find the nontrivial roots of the derivatives of your two integrands.
The program that I wrote basically does the following:
It all ended up being more complicated than I thought it would. For that reason, I decided to start this answer with the simple solution. However, it would feel like a waste if I didn't show it to you. :) So, here's a program that can automatically identify interesting integration points.
#### Script for numerically integrating and plotting the Thomas-Hopfield-Model
#-------------- IMPORTING LIBRARIES --------------
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import quad
import sympy
import mpmath
import scipy
import math
from collections import defaultdict
mpmath.mp.dps = 50
#-------------- DEFINING MODEL FUNCTION ----------------
# The physical model has three parameters (W_0, R_0, n),
# in order to fit it onto experimental data later, two constants (A, B) have been added.
# it contains two integrals over the variable r from 0 to infinity, but due to overflow issues have here been replaced by integrals from 0 to a set value r_max.
# The overall model is a function of time t.
def W(r,W_0,R_0): # exponential decay function for the recombination rate of electron-hole pair
return W_0 * np.exp( (-2 * r) / R_0 )
def integral1(r,W_0,R_0,t): # First integral (over r) for the Thomas-Hopfield model
return W(r,W_0,R_0) * np.exp( -W(r,W_0,R_0) * t ) * r**2
def integral2(r,W_0,R_0,t): # second integral (over r) for the Thomas-Hopfield model (in the exponent)
return ( np.exp( -W(r,W_0,R_0) *t ) - 1 ) *r**2
W_0_sym = sympy.Symbol('W_0')
R_0_sym = sympy.Symbol('R_0')
t_sym = sympy.Symbol('t')
r_sym = sympy.Symbol('r')
W_0 = 1e6
R_0 = 1.2e-9
def W_analytic():
return W_0_sym * sympy.exp( (-2 * r_sym) / R_0_sym )
def integral1_analytic():
return W_analytic() * sympy.exp( -W_analytic() * t_sym ) * r_sym ** 2
def integral2_analytic():
return ( sympy.exp( -W_analytic() * t_sym ) - 1 ) * r_sym ** 2
def integral_analytic_derivative(integral, W_0, R_0, n=1):
integral = integral.subs(W_0_sym, W_0)
integral = integral.subs(R_0_sym, R_0)
# integral = integral.subs(t_sym, )
derivative = sympy.diff(integral, r_sym, n)
derivative = sympy.simplify(derivative)
return sympy.lambdify((r_sym, t_sym), derivative, modules=['mpmath'])
def zoom(function, cond, args, start, mult, maxiter):
"""Find the first pair of points where
func(x) does not satisfy the condition but
func(x * mult) does."""
x = start
if cond(function(x, *args)):
raise Exception("invalid start point")
for i in range(maxiter):
last_x = x
x *= mult
if cond(function(x, *args)):
return last_x, x
raise Exception(f"Unable to find starting area for {function} with {args}")
def prevent_underflow(func):
def inner(x, *args):
ret_mpf = func(x, *args)
ret_float = float(ret_mpf)
if ret_float == 0 and ret_mpf != 0:
# Prevent underflow
ret_float = float(mpmath.sign(ret_mpf)) * math.ulp(0)
return ret_float
return inner
def find_function_root_after_point(function, derivative, t, lower_bound):
# t = 0 causes problems
t = np.clip(t, 1e-30, np.inf)
# Look for point with opposite sign
if derivative(lower_bound, t) > 0:
cond = lambda x: x <= 0
else:
cond = lambda x: x > 0
lower_bound, upper_bound = zoom(derivative, cond, (t,), lower_bound, 5, 100)
bound_range = upper_bound - lower_bound
xtol = bound_range * 1e-3
result = scipy.optimize.bisect(
prevent_underflow(derivative),
lower_bound,
upper_bound,
args=(t,),
xtol=xtol
)
return result
def find_tails(function, mass, args, function_max_x):
function_max = function(function_max_x, *args)
target_log = float(mpmath.log(mpmath.fabs(function_max))) + math.log(mass)
def objective(x):
fval = function(x[0] * function_max_x, *args)
function_in_log = float(mpmath.log(mpmath.fabs(fval)))
error = function_in_log - target_log
return error ** 2
options = dict(initial_tr_radius=0.1, f_target=1e-3)
left_bound_res = scipy.optimize.minimize(objective, 1, bounds=[(1e-3, 1)], options=options, method='COBYQA')
assert left_bound_res.success
right_bound_res = scipy.optimize.minimize(objective, 1, bounds=[(1, np.inf)], options=options, method='COBYQA')
assert right_bound_res.success
return left_bound_res.x[0], right_bound_res.x[0]
integral1_0th_derivative = integral_analytic_derivative(integral1_analytic(), W_0, R_0, n=0)
integral1_1st_derivative = integral_analytic_derivative(integral1_analytic(), W_0, R_0, n=1)
integral2_0th_derivative = integral_analytic_derivative(integral2_analytic(), W_0, R_0, n=0)
integral2_1st_derivative = integral_analytic_derivative(integral2_analytic(), W_0, R_0, n=1)
last_integral1_root = None
last_integral2_root = None
def find_interesting_integral1_points(t):
result = find_function_root_after_point(
function=integral1_0th_derivative,
derivative=integral1_1st_derivative,
t=t,
lower_bound=1e-15,
)
# Multipliers derived from find_tail()
return [result * 0.01, result * 0.9, result, result * 1.3, result * 20]
def find_interesting_integral2_points(t):
result = find_function_root_after_point(
function=integral2_0th_derivative,
derivative=integral2_1st_derivative,
t=t,
lower_bound=1e-15,
)
# Multipliers derived from find_tail()
return [result * 0.01, result * 0.03, result, result * 1.4, result * 20]
def thomas_hopfield(t,W_0,R_0,n,A,B): # comined function for the Thomas-Hopfield model
r_max = 1 #np.inf #UPPER INTEGRATION BOUND
points1 = find_interesting_integral1_points(t)
points2 = find_interesting_integral2_points(t)
int1, abserr1 = quad( integral1, 0, r_max, args=(W_0,R_0,t), points=points1, epsabs=0, epsrel=1.49e-8)
int2, abserr2 = quad( integral2, 0, r_max, args=(W_0,R_0,t), points=points2, epsabs=0, epsrel=1.49e-8)
lum_int = A*( (4 * np.pi * n * int1) * np.exp(4 * np.pi * n * int2) ) + B
return lum_int
thomas_hopfield_v = np.vectorize(thomas_hopfield) # this has been added, as trying to plot it without vectorization resulted in the error: "TypeError: only size-1 arrays can be converted to Python scalars"
#--------------- EXAMPLE VALUES FOR PARAMETERS ---------------:
n = 3e-19
A = 1
B = 0
#---------------------------------------------------------------
#--------- SIDE QUEST: theoretical value for t=0 --------------
theoretical_value = n*np.pi*W_0*R_0**3 # literature theoretical value for t=0
thomas_hopfield_t0 = thomas_hopfield(0,W_0,R_0,n,A,B) # calculate the theoretical value for t=0
print('Theoretical value for t=0:', theoretical_value)
print('Calculated value for t=0:', thomas_hopfield_t0) # print the calculated value for t=0
#-----------------------------------------------------------------
#----------------------------------------------------------------
#--------------- GRAPHICAL REPRESENTATION-------------
t = np.linspace(0,100000000, 1000) # create time value array
y_model = thomas_hopfield_v(t,W_0,R_0,n,A,B) # apply the model on the time t
plt.plot(t, y_model, color='r',linestyle='-',linewidth=0.7, alpha=1, label='Thomas Hopfield Model',zorder=1)
plt.yscale('log')
plt.xscale('log')
plt.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With