I'm working with Python/numpy/scipy to write a small ray tracer. Surfaces are modelled as two-dimensional functions giving a height above a normal plane. I reduced the problem of finding the point of intersection between ray and surface to finding the root of a function with one variable. The functions are continuous and continuously differentiable.
Is there a way to do this more efficiently than simply looping over all the functions, using scipy root finders (and maybe using multiple processes)?
Edit: The functions are the difference between a linear function representing the ray and the surface function, constrained to a plane of intersection.
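To make the reduction concrete, here is a minimal sketch of one such function for a single ray. The surface `surface_height`, the ray origin and direction, and the bracket `[0, 10]` are all made-up values for illustration; the real surfaces in the question are not shown.

```python
import numpy as np
from scipy import optimize

def surface_height(x, y):
    # Hypothetical surface: a shallow paraboloid above the z = 0 plane.
    return 0.05 * (x**2 + y**2)

def ray_residual(t, origin, direction):
    # Height of the ray point above the surface at ray parameter t.
    # The intersection is where this difference is zero.
    p = origin + t * direction
    return p[2] - surface_height(p[0], p[1])

origin = np.array([0.0, 0.0, 5.0])
direction = np.array([0.3, 0.1, -1.0])

# The residual is positive at t = 0 and negative at t = 10, so the root is bracketed.
t_hit = optimize.brentq(ray_residual, 0.0, 10.0, args=(origin, direction))
hit_point = origin + t_hit * direction
```

Doing this once per ray in a Python loop is the baseline the question asks how to improve on.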
The secant method is a simplification of Newton's method. Newton's method uses the derivative of the function to predict the root, while the secant method approximates that derivative from the two most recent function values.
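The difference between the two update rules can be sketched on a scalar example. The test function `f(x) = x**2 - 2`, the starting points, and the tolerance are arbitrary choices for illustration only.

```python
def f(x):
    return x**2 - 2.0

def fprime(x):
    return 2.0 * x

def newton_step(x):
    # Newton: follow the tangent line, using the exact derivative.
    return x - f(x) / fprime(x)

def secant_step(x_prev, x):
    # Secant: replace the derivative with the slope through the last two points.
    slope = (f(x) - f(x_prev)) / (x - x_prev)
    return x - f(x) / slope

x_newton = 1.0
for _ in range(50):
    if abs(f(x_newton)) <= 1e-12:
        break
    x_newton = newton_step(x_newton)

x_prev, x_secant = 1.0, 2.0
for _ in range(50):
    if abs(f(x_secant)) <= 1e-12:
        break
    x_prev, x_secant = x_secant, secant_step(x_prev, x_secant)
```

Both iterations approach the square root of 2; the secant version needs no derivative but usually takes a few more steps.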
A root (or zero) of a function f is a value x for which f(x) = 0. Roots can be found in three ways: analytically, graphically, or numerically.
Sometime in the past few years, scipy.optimize.newton gained vectorization support. The example from the other answer would now look like:
import numpy as np
from scipy import optimize

def F(x, a, b):
    return np.power(x, a+1.0) - b

N = 1000000
a = np.random.rand(N)
b = np.random.rand(N)

# An array x0 makes newton solve all N problems at once
optimize.newton(F, np.zeros(N), args=(a, b))
This runs just as fast as the vectorized bisection method in the other answer.
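When the derivative is known, it can be passed through `fprime` so that the vectorized call uses Newton's method instead of the secant method. The following is a sketch under some assumptions: a SciPy version that accepts an array `x0`, a smaller `N`, a starting guess of 1 (the derivative is zero at 0 for most of these functions), and `b` kept away from 0. `Fprime` is a helper added here and is not part of the answer above.

```python
import numpy as np
from scipy import optimize

def F(x, a, b):
    return np.power(x, a + 1.0) - b

def Fprime(x, a, b):
    # Derivative of x**(a+1) - b with respect to x.
    return (a + 1.0) * np.power(x, a)

rng = np.random.default_rng(0)
N = 1000
a = rng.random(N)
b = rng.uniform(0.1, 1.0, N)

# Start to the right of every root; each function is increasing and convex for x > 0.
x_start = np.ones(N)
roots = optimize.newton(F, x_start, fprime=Fprime, args=(a, b))

# This family has a closed-form root, which is handy for checking the result.
exact = b ** (1.0 / (a + 1.0))
```

For the ray tracer, the same idea would apply if the surface gradient is available analytically.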
The following example calculates the roots of 1 million copies of the function x**(a+1) - b (all with different a and b) in parallel using the bisection method. It takes about 12 seconds here.
import numpy

def F(x, a, b):
    return numpy.power(x, a+1.0) - b

N = 1000000
a = numpy.random.rand(N)
b = numpy.random.rand(N)

# Initial brackets: F(x0) <= 0 and F(x1) > 0 for every function
x0 = numpy.zeros(N)
x1 = numpy.ones(N) * 1000.0

max_step = 100
for step in range(max_step):
    x_mid = (x0 + x1)/2.0
    F0 = F(x0, a, b)
    F1 = F(x1, a, b)
    F_mid = F(x_mid, a, b)
    # Move whichever endpoint has the same sign as the midpoint
    x0 = numpy.where( numpy.sign(F_mid) == numpy.sign(F0), x_mid, x0 )
    x1 = numpy.where( numpy.sign(F_mid) == numpy.sign(F1), x_mid, x1 )
    error_max = numpy.amax(numpy.abs(x1 - x0))
    print("step=%d error max=%f" % (step, error_max))
    if error_max < 1e-6: break
The basic idea is to run all the usual steps of a root finder in parallel on a vector of variables, using a function that can be evaluated on a vector of variables and equivalent vector(s) of parameters that define the individual component functions. Conditionals are replaced with a combination of masks and numpy.where(). This can continue until all roots have been found to the required precision, or alternatively until enough roots have been found that it is worth removing them from the problem and continuing with a smaller problem that excludes those roots.
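The last point, dropping converged roots and continuing with a smaller problem, could be sketched as follows. This is only one possible arrangement: the index array `active`, the randomized upper brackets (chosen so that problems finish at different steps), and the shortcut of testing `f_mid < 0` (valid only because every function here is increasing) are assumptions made for this illustration.

```python
import numpy as np

def F(x, a, b):
    return np.power(x, a + 1.0) - b

rng = np.random.default_rng(0)
N = 10000
a = rng.random(N)
b = rng.random(N)

lo = np.zeros(N)                    # F(lo) <= 0 for every problem
hi = rng.uniform(1.0, 1000.0, N)    # F(hi) > 0 for every problem
active = np.arange(N)               # indices of problems still being refined
tol = 1e-6

for step in range(100):
    mid = 0.5 * (lo[active] + hi[active])
    f_mid = F(mid, a[active], b[active])
    # F is increasing, so a negative midpoint value means the root lies to the right.
    below = f_mid < 0.0
    lo[active[below]] = mid[below]
    hi[active[~below]] = mid[~below]
    # Drop the problems whose bracket is already narrow enough.
    active = active[(hi[active] - lo[active]) >= tol]
    if active.size == 0:
        break

roots = 0.5 * (lo + hi)
```

Each iteration then evaluates F only on the unconverged entries, at the cost of some fancy-indexing overhead.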
The functions I chose to solve are arbitrary, but it helps if the functions are well-behaved; in this case all functions in the family are monotonic and have exactly one positive root. Additionally, for the bisection method we need guesses for the variable that give different signs of the function, and those happen to be quite easy to come up with here as well (the initial values of x0 and x1).
The above code uses perhaps the simplest root finder (bisection), but the same technique could easily be applied to Newton-Raphson, Ridder's, etc. The fewer conditionals there are in a root-finding method, the better suited it is to this. However, you will have to reimplement any algorithm you want; there is no way to use an existing library root finder function directly.
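As an illustration of applying the same technique to Newton-Raphson, here is a hand-written vectorized version for the same family of functions. The derivative helper `dF`, the starting guess of 1, the tolerance, and the restriction of `b` to [0.1, 1) are choices made for this sketch, not part of the code above.

```python
import numpy as np

def F(x, a, b):
    return np.power(x, a + 1.0) - b

def dF(x, a, b):
    # Derivative of x**(a+1) - b with respect to x.
    return (a + 1.0) * np.power(x, a)

rng = np.random.default_rng(0)
N = 10000
a = rng.random(N)
b = rng.uniform(0.1, 1.0, N)

x = np.ones(N)      # start to the right of every root
tol = 1e-10

for step in range(50):
    dx = F(x, a, b) / dF(x, a, b)
    # The mask stands in for the per-root "converged yet?" conditional:
    # entries whose step is already tiny are left where they are.
    moving = np.abs(dx) > tol
    x = np.where(moving, x - dx, x)
    if not moving.any():
        break
```

The only conditional in the method is the convergence check, which is why it maps onto a single mask.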
The above code snippet is written with clarity in mind, not speed. Avoiding the repetition of some calculations, in particular evaluating the function only once per iteration instead of 3 times, brings the run time down to about 9 seconds, as follows:
...
F0 = F(x0, a, b)
F1 = F(x1, a, b)
max_step = 100
for step in range(max_step):
    x_mid = (x0 + x1)/2.0
    F_mid = F(x_mid, a, b)
    mask0 = numpy.sign(F_mid) == numpy.sign(F0)
    mask1 = numpy.sign(F_mid) == numpy.sign(F1)
    x0 = numpy.where( mask0, x_mid, x0 )
    x1 = numpy.where( mask1, x_mid, x1 )
    F0 = numpy.where( mask0, F_mid, F0 )
    F1 = numpy.where( mask1, F_mid, F1 )
...
For comparison, using scipy.optimize.bisect() to find one root at a time takes ~94 seconds:
import scipy.optimize

# x0, x1: the initial brackets (zeros and 1000s) defined earlier
for i in range(N):
    x_root = scipy.optimize.bisect(lambda x: F(x, a[i], b[i]), x0[i], x1[i], xtol=1e-6)