Integration limits when integrating to infinity using quad and vectorize

Question

I have a rather complicated physics-model (Thomas Hopfield model from https://www.researchgate.net/publication/243703523_Kinetics_of_Radiative_Recombination_at_Randomly_Distributed_Donors_and_Acceptors).

For the sake of later fitting it to experimental data, I first simply want to graphically plot it and am already running into difficulties. The model reacts very sensible to changes in the integration limit and I am not sure, whether the vectorization-function creates problems with the integration.

The model consists of two analytically not-solvable integrals from 0 to infinity, and I want to plot them using the scipy.integrate.quad function.

(example code below)

As expected, integrating to infinity does not work, so I integrated up to a set value (r_max). However, the resulting function is very sensible to changes in the r_max. When increasing it (e.g. from 250e-9 to 250e-8), the resulting function gets stretched. That poses a problem, as it seems unjustified to then take the highest computable r_max and assume it similar to infinity. (What is even weirder, is that there is a theoretical value for t=0, which is most closely produced using r_max = 250e-9 and not using higher values "closer to infinity".)

#### Script for numerically integrating and plotting the Thomas-Hopfield-Model


#-------------- IMPORTING LIBRARIES --------------
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import quad


#-------------- DEFINING MODEL FUNCTION ----------------
# The physical model has three parameters (W_0, R_0, n), 
# in order to fit it onto experimental data later, two constants (A, B) have been added. 
# it contains two integrals over the variable r from 0 to infinity, but due to overflow issues have here been replaced by integrals from 0 to a set value r_max. 
# The overall model is a function of time t.

def W(r,W_0,R_0):               # exponential decay function for the recombination rate of electron-hole pair
    return W_0 * np.exp( (-2 * r) / R_0 )

def integral1(r,W_0,R_0,t):     # First integral (over r) for the Thomas-Hopfield model
    return W(r,W_0,R_0) * np.exp( -W(r,W_0,R_0) * t ) * r**2

def integral2(r,W_0,R_0,t):     # second integral (over r) for the Thomas-Hopfield model (in the exponent)
    return ( np.exp( -W(r,W_0,R_0) *t ) - 1 ) *r**2

def thomas_hopfield(t,W_0,R_0,n,A,B): # comined function for the Thomas-Hopfield model
    r_max = 250e-7              #np.inf  #UPPER INTEGRATION BOUND
    int1 = quad( integral1, 0, r_max, args=(W_0,R_0,t) )[0]
    int2 = quad( integral2, 0, r_max, args=(W_0,R_0,t) )[0]
    lum_int = A*( (4 * np.pi * n * int1) * np.exp(4 * np.pi * n * int2) ) + B
    return lum_int

thomas_hopfield_v = np.vectorize(thomas_hopfield) # this has been added, as trying to plot it without vectorization resulted in the error: "TypeError: only size-1 arrays can be converted to Python scalars"


#--------------- EXAMPLE VALUES FOR PARAMETERS ---------------:
W_0 = 1e6
R_0 = 1.2e-9
n = 3e-19
A = 1
B = 0


#---------------------------------------------------------------
#--------- SIDE QUEST: theoretical value for t=0 --------------

theoretical_value = n*np.pi*W_0*R_0**3                  # literature theoretical value for t=0
thomas_hopfield_t0 = thomas_hopfield(0,W_0,R_0,n,A,B)   # calculate the theoretical value for t=0
print('Theoretical value for t=0:', theoretical_value)
print('Calculated value for t=0:', thomas_hopfield_t0)  # print the calculated value for t=0

#-----------------------------------------------------------------
#----------------------------------------------------------------

#--------------- GRAPHICAL REPRESENTATION-------------

t = np.linspace(0,100000000, 100000)            # create time value array 
y_model = thomas_hopfield_v(t,W_0,R_0,n,A,B)     # apply the model on the time t
plt.plot(t, y_model, color='r',linestyle='-',linewidth=0.7, alpha=1, label='Thomas Hopfield Model',zorder=1)
plt.yscale('log')
plt.xscale('log')
plt.show()

As someone using scipy.integrate.quad for the first time, I am asking myself how quad is computing the result (in which order it takes in the array, computes the integrals and returns the values) and whether or not the vectorization np.vectorize() of the function is the (or a) problem. I had to include np.vectorize(), as I would otherwise receive "TypeError: only size-1 arrays can be converted to Python scalars", but it makes it almost impossible for me to follow and comprehend the individual steps of the programme.

Does someone have experience with that? It would already help me to know, whether the vectorisation creates the problems and / or how I could plot the function without using it. Thank you very much in advance!

Nick ODell · Accepted Answer

Dividing integrands up for better accuracy

As someone using scipy.integrate.quad for the first time, I am asking myself how quad is computing the result (in which order it takes in the array, computes the integrals and returns the values)

Imagine that you wanted to integrate the normal distribution's probability density function from -1000 to 1000. This works fine.

import scipy
import numpy as np

centered_at_zero = scipy.stats.norm(loc=0, scale=1)
result, eps = scipy.integrate.quad(centered_at_zero.pdf, -1000, 1000)
print("result", result, "error", eps)

Output:

result 1.0 error 1.820767988928941e-12

Now, let's try something similar. Let's try integrating the same distribution, but have the distribution centered at 100. Theoretically, this shouldn't change anything.

import scipy
import numpy as np

centered_at_100 = scipy.stats.norm(loc=100, scale=1)
result, eps = scipy.integrate.quad(centered_at_100.pdf, -1000, 1000)
print("result", result, "error", eps)

Output:

result 0.0 error 0.0

Why does this happen? quad() is evaluating the integrand at various points along the region of interest. Using these points, it compares a high-degree approximation of the integral of this area to a low degree approximation of the integral of this area.

If those two are the same, then it concludes that nothing interesting is happening in this region of the function. If the two approximations are different, then it subdivides the region further, and approximates each region separately. This allows it to spend lots of time exploring complex regions of a function, and little time where the function is not doing anything interesting. If none of the points it tries are near x=100, then it will incorrectly conclude that the function is constant in this region.

In order to fix this for this example, you need to tell quad() that something interesting is happening in the function at x=100, using the points argument.

import scipy
import numpy as np

centered_at_100 = scipy.stats.norm(loc=100, scale=1)
result, eps = scipy.integrate.quad(centered_at_100.pdf, -1000, 1000, points=[100])
print("result", result, "error", eps)

Output:

result 1.0 error 2.6169364753623105e-11

Returning to your example, we need to tell quad that something interesting is happening around 1e-8. Also, as mentioned in the comments, the default value of epsabs is much too large for this problem.

def thomas_hopfield(t,W_0,R_0,n,A,B): # comined function for the Thomas-Hopfield model
    r_max = 1              #np.inf  #UPPER INTEGRATION BOUND
    points = np.array([1e-9, 1e-8, 1e-7])
    int1 = quad( integral1, 0, r_max, args=(W_0,R_0,t), points=points, epsabs=0, epsrel=1.49e-8)[0]
    int2 = quad( integral2, 0, r_max, args=(W_0,R_0,t), points=points, epsabs=0, epsrel=1.49e-8)[0]
    lum_int = A*( (4 * np.pi * n * int1) * np.exp(4 * np.pi * n * int2) ) + B
    return lum_int

The previous example is integrating from 0 to 1, but what if we want to integrate from 0 to infinity? Here we encounter a problem. The points argument is incompatible with integrating across infinite bounds, because the library that SciPy uses does not support it.

We could solve this in two ways.

The first option is that we could ignore the problem. The function is essentially 0 at r = 1.
The second approach we could take is to split the integration up into two pieces - one that integrates over [0, 1], and the other which integrates over [1, inf).

def thomas_hopfield(t,W_0,R_0,n,A,B): # comined function for the Thomas-Hopfield model
    r_max = 1              #np.inf  #UPPER INTEGRATION BOUND
    points = np.array([1e-9, 1e-8, 1e-7])
    int1_finite   = quad( integral1, 0, r_max, args=(W_0,R_0,t), points=points, epsabs=0, epsrel=1.49e-8)[0]
    int1_infinite = quad( integral1, r_max, np.inf, args=(W_0,R_0,t), epsabs=0, epsrel=1.49e-8)[0]
    int1 = int1_finite + int1_infinite
    int2_finite = quad( integral2, 0, r_max, args=(W_0,R_0,t), points=points, epsabs=0, epsrel=1.49e-8)[0]
    int2_infinite = quad( integral2, r_max, np.inf, args=(W_0,R_0,t), epsabs=0, epsrel=1.49e-8)[0]
    int2 = int2_finite + int2_infinite
    lum_int = A*( (4 * np.pi * n * int1) * np.exp(4 * np.pi * n * int2) ) + B
    return lum_int

This change permits it to quite accurately integrate the t=0 value over the range [0, inf].

Theoretical value for t=0: 1.628601631620949e-39
Calculated value for t=0: 1.628601631620949e-39

As someone using scipy.integrate.quad for the first time, I am asking myself [...] whether or not the vectorization np.vectorize() of the function is the (or a) problem.

The use of np.vectorize() here looks fine to me.

Automatically identifying "interesting" points

Lastly, how can we identify "interesting" points in the function? As a rule of thumb, maximums, minimums, and discontinuities are all interesting. I tried to come up with a program that would automatically find the nontrivial roots of the derivatives of your two integrands.

The program that I wrote basically does the following:

Analytically differentiate the function.
In this derivative, search for a root within this derivative by first exponentially expanding the search area, then using bisection to find the zero crossing. The bounds are chosen to avoid the trivial root at 0.
Use the maximum to provide points to segment the integration area.

It all ended up being more complicated than I thought it would. For that reason, I decided to start this answer with the simple solution. However, it would feel like a waste if I didn't show it to you. :) So, here's a program that can automatically identify interesting integration points.

#### Script for numerically integrating and plotting the Thomas-Hopfield-Model


#-------------- IMPORTING LIBRARIES --------------
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import quad
import sympy
import mpmath
import scipy
import math
from collections import defaultdict
mpmath.mp.dps = 50


#-------------- DEFINING MODEL FUNCTION ----------------
# The physical model has three parameters (W_0, R_0, n), 
# in order to fit it onto experimental data later, two constants (A, B) have been added. 
# it contains two integrals over the variable r from 0 to infinity, but due to overflow issues have here been replaced by integrals from 0 to a set value r_max. 
# The overall model is a function of time t.

def W(r,W_0,R_0):               # exponential decay function for the recombination rate of electron-hole pair
    return W_0 * np.exp( (-2 * r) / R_0 )

def integral1(r,W_0,R_0,t):     # First integral (over r) for the Thomas-Hopfield model
    return W(r,W_0,R_0) * np.exp( -W(r,W_0,R_0) * t ) * r**2

def integral2(r,W_0,R_0,t):     # second integral (over r) for the Thomas-Hopfield model (in the exponent)
    return ( np.exp( -W(r,W_0,R_0) *t ) - 1 ) *r**2



W_0_sym = sympy.Symbol('W_0')
R_0_sym = sympy.Symbol('R_0')
t_sym = sympy.Symbol('t')
r_sym = sympy.Symbol('r')
W_0 = 1e6
R_0 = 1.2e-9


def W_analytic():
    return W_0_sym * sympy.exp( (-2 * r_sym) / R_0_sym )

def integral1_analytic():
    return W_analytic() * sympy.exp( -W_analytic() * t_sym ) * r_sym ** 2

def integral2_analytic():
    return ( sympy.exp( -W_analytic() * t_sym ) - 1 ) * r_sym ** 2


def integral_analytic_derivative(integral, W_0, R_0, n=1):
    integral = integral.subs(W_0_sym, W_0)
    integral = integral.subs(R_0_sym, R_0)
#     integral = integral.subs(t_sym, )
    derivative = sympy.diff(integral, r_sym, n)
    derivative = sympy.simplify(derivative)
    return sympy.lambdify((r_sym, t_sym), derivative, modules=['mpmath'])


def zoom(function, cond, args, start, mult, maxiter):
    """Find the first pair of points where
    func(x) does not satisfy the condition but
    func(x * mult) does."""
    x = start
    if cond(function(x, *args)):
        raise Exception("invalid start point")

    for i in range(maxiter):
        last_x = x
        x *= mult
        if cond(function(x, *args)):
            return last_x, x
    raise Exception(f"Unable to find starting area for {function} with {args}")


def prevent_underflow(func):
    def inner(x, *args):
        ret_mpf = func(x, *args)
        ret_float = float(ret_mpf)
        if ret_float == 0 and ret_mpf != 0:
            # Prevent underflow
            ret_float = float(mpmath.sign(ret_mpf)) * math.ulp(0)
        return ret_float
    return inner


def find_function_root_after_point(function, derivative, t, lower_bound):
    # t = 0 causes problems
    t = np.clip(t, 1e-30, np.inf)
    # Look for point with opposite sign
    if derivative(lower_bound, t) > 0:
        cond = lambda x: x <= 0
    else:
        cond = lambda x: x > 0
    lower_bound, upper_bound = zoom(derivative, cond, (t,), lower_bound, 5, 100)
    bound_range = upper_bound - lower_bound
    xtol = bound_range * 1e-3
    result = scipy.optimize.bisect(
        prevent_underflow(derivative),
        lower_bound,
        upper_bound,
        args=(t,),
        xtol=xtol
    )
    return result


def find_tails(function, mass, args, function_max_x):
    function_max = function(function_max_x, *args)
    target_log = float(mpmath.log(mpmath.fabs(function_max))) + math.log(mass)
    def objective(x):
        fval = function(x[0] * function_max_x, *args)
        function_in_log = float(mpmath.log(mpmath.fabs(fval)))
        error = function_in_log - target_log
        return error ** 2
    options = dict(initial_tr_radius=0.1, f_target=1e-3)
    left_bound_res = scipy.optimize.minimize(objective, 1, bounds=[(1e-3, 1)], options=options, method='COBYQA')
    assert left_bound_res.success
    right_bound_res = scipy.optimize.minimize(objective, 1, bounds=[(1, np.inf)], options=options, method='COBYQA')
    assert right_bound_res.success
    return left_bound_res.x[0], right_bound_res.x[0]


integral1_0th_derivative = integral_analytic_derivative(integral1_analytic(), W_0, R_0, n=0)
integral1_1st_derivative = integral_analytic_derivative(integral1_analytic(), W_0, R_0, n=1)

integral2_0th_derivative = integral_analytic_derivative(integral2_analytic(), W_0, R_0, n=0)
integral2_1st_derivative = integral_analytic_derivative(integral2_analytic(), W_0, R_0, n=1)


last_integral1_root = None
last_integral2_root = None


def find_interesting_integral1_points(t):
    result = find_function_root_after_point(
        function=integral1_0th_derivative,
        derivative=integral1_1st_derivative,
        t=t,
        lower_bound=1e-15,
    )
    # Multipliers derived from find_tail()
    return [result * 0.01, result * 0.9, result, result * 1.3, result * 20]


def find_interesting_integral2_points(t):
    result = find_function_root_after_point(
        function=integral2_0th_derivative,
        derivative=integral2_1st_derivative,
        t=t,
        lower_bound=1e-15,
    )
    # Multipliers derived from find_tail()
    return [result * 0.01, result * 0.03, result, result * 1.4, result * 20]


def thomas_hopfield(t,W_0,R_0,n,A,B): # comined function for the Thomas-Hopfield model
    r_max = 1              #np.inf  #UPPER INTEGRATION BOUND
    
    points1 = find_interesting_integral1_points(t)
    points2 = find_interesting_integral2_points(t)

    int1, abserr1 = quad( integral1, 0, r_max, args=(W_0,R_0,t), points=points1, epsabs=0, epsrel=1.49e-8)
    int2, abserr2 = quad( integral2, 0, r_max, args=(W_0,R_0,t), points=points2, epsabs=0, epsrel=1.49e-8)
    lum_int = A*( (4 * np.pi * n * int1) * np.exp(4 * np.pi * n * int2) ) + B
    return lum_int


thomas_hopfield_v = np.vectorize(thomas_hopfield) # this has been added, as trying to plot it without vectorization resulted in the error: "TypeError: only size-1 arrays can be converted to Python scalars"


#--------------- EXAMPLE VALUES FOR PARAMETERS ---------------:
n = 3e-19
A = 1
B = 0


#---------------------------------------------------------------
#--------- SIDE QUEST: theoretical value for t=0 --------------

theoretical_value = n*np.pi*W_0*R_0**3                  # literature theoretical value for t=0
thomas_hopfield_t0 = thomas_hopfield(0,W_0,R_0,n,A,B)   # calculate the theoretical value for t=0
print('Theoretical value for t=0:', theoretical_value)
print('Calculated value for t=0:', thomas_hopfield_t0)  # print the calculated value for t=0

#-----------------------------------------------------------------
#----------------------------------------------------------------

#--------------- GRAPHICAL REPRESENTATION-------------

t = np.linspace(0,100000000, 1000)               # create time value array 
y_model = thomas_hopfield_v(t,W_0,R_0,n,A,B)     # apply the model on the time t
plt.plot(t, y_model, color='r',linestyle='-',linewidth=0.7, alpha=1, label='Thomas Hopfield Model',zorder=1)
plt.yscale('log')
plt.xscale('log')
plt.show()

Integration limits when integrating to infinity using quad and vectorize

Tags:

python

matplotlib

physics

scipy

numerical-integration

Thomas

1 Answers

Dividing integrands up for better accuracy

Automatically identifying "interesting" points

Nick ODell

Recent Activity

Donate For Us

Integration limits when integrating to infinity using quad and vectorize

Tags:

python

matplotlib

physics

scipy

numerical-integration

Thomas

1 Answers

Dividing integrands up for better accuracy

Automatically identifying "interesting" points

Nick ODell

Related questions

Recent Activity

Donate For Us