I'm currently attempting to use SymPy to generate and numerically evaluate a function and its gradient. For simplicity, I'll use the following function as an example (keeping in mind that the real function is much lengthier):
import sympy as sp

def g(x):
    return sp.cos(x) + sp.cos(x)**2 + sp.cos(x)**3
It's easy enough to numerically evaluate this function and its derivative:
import numpy as np
x = sp.symbols('x')

g_expr = sp.lambdify(x, g(x), modules='numpy')
dg_expr = sp.lambdify(x, sp.diff(g(x)), modules='numpy')
print g_expr(np.linspace(0,1,50))
print dg_expr(np.linspace(0,1,50))
For my real function, however, lambdify is slow, both in terms of generating the numerical function and in its evaluation. As many elements in my function are similar, I'd like to use common subexpression elimination (cse) within lambdify to speed this process up. I know that SymPy has a built-in function to perform cse,
>>> print sp.cse(g(x))
([(x0, cos(x))], [x0**3 + x0**2 + x0])
but don't know what syntax to use in order to utilize this result in my lambdify function (where I'd still like to use x as my input argument):
>>> g_expr_fast = sp.lambdify(x,sp.cse(g(x)),modules='numpy')
>>> print g_expr_fast(np.linspace(0,1,50))
Traceback (most recent call last):
File "test3.py", line 34, in <module>
print g_expr1(nx1)
File "<string>", line 1, in <lambda>
NameError: global name 'x0' is not defined
Any help on how to use cse in lambdify would be appreciated. Or, if there are better ways to speed up my gradient calculations, I'd appreciate hearing those as well.
In case it's relevant, I'm using Python 2.7.3 and SymPy 0.7.6.
The speed of the calculations can be increased:
I assume this is a "calculate the function in sympy once, use it many times in different projects later" type of case, so some manual copy-pasting and creation of files is involved. It is possible to automate the creation of the new files with the functions, and also the compilation step, but I leave that out here.

I had a similar problem and did some benchmarking on different methods. The function I used is quite long (len(str(expr)) = 45857) and cse(expr) decomposes it into 72 subexpressions. It is too long to copy-paste here, but here are the steps that gave up to a 100x-1000x speed increase for functions created with sympy.
A) Evaluating a single float

Time to evaluate the function with one float value for each parameter, measured with timeit myfunc(*params):

- modules="numpy": 277 µs
- str(expr) to function definition: 275 µs (no difference)
- cse: 8.2 µs (33x improvement)
- cse(optimizations="basic"): 7.6 µs (36x improvement)
- func_numba_f(): 0.25 µs (1090x improvement)
- autowrap: 0.47 µs (589x improvement)

B) Evaluating an np.array of 1000 floats
- str(expr) to function definition: 15100 µs | 15.1 µs per value
- cse: 493 µs | 0.49 µs per value (31x improvement)
- cse(optimizations="basic"): 413 µs | 0.41 µs per value (37x improvement)
- func_numba_arr(): 114 µs | 0.11 µs per value (132x improvement)

str(expr) to function definition: paste the output of str(expr) into the body of a regular Python function definition.

cse: decompose the expression with repl, redu = cse(K), print the subexpressions, and paste them into a function definition:

repl, redu = cse(K)
for variable, expr in repl:
    print(f"{variable} = {expr}")
print(redu[0])
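For the question's small example, the pasted result could look something like this (a sketch; g_fast is a name I made up, and cos becomes np.cos in the numerical version):

import numpy as np

def g_fast(x):
    # x0 = cos(x) is the single common subexpression cse finds
    x0 = np.cos(x)
    return x0**3 + x0**2 + x0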
cse(optimizations="basic"): the same as cse, but passing optimizations="basic" as an additional argument.

func_numba_f() and func_numba_arr(): compile the function ahead of time with Numba. Create src_mymodule.py with the code:
from numba.pycc import CC

cc = CC("my_numba_module")

@cc.export("func_numba_f", "f8(f8, f8, f8, f8, f8)")
@cc.export("func_numba_arr", "f8[:](f8[:], f8[:], f8[:], f8[:], f8[:])")
def myfunc(x1, x2, x3, x4, x5):
    # your function definition here
    return value

if __name__ == "__main__":
    cc.compile()
In func_numba_f() there are five float-valued input variables and one float-valued output variable; f8 means a 64-bit float. func_numba_arr() is the version that handles np.arrays with dtype="float64" or dtype="float32", depending on what you use when compiling it. Compile the module with python src_mymodule.py. This will create my_numba_module.cp38-win_amd64.pyd or similar; it can only be used with the same Python version and bitness as in the filename. Then import and use it:

from my_numba_module import func_numba_f, func_numba_arr
out = func_numba_f(4,3,2,1,100)
# or, for the array version (N is the desired array length):
args = [np.array([x]*N, dtype='float64') for x in (4, 3, 2, 1, 100)]
out_arr = func_numba_arr(*args)
autowrap: create the compiled function with

from sympy.utilities.autowrap import autowrap
func = autowrap(expr, backend='cython')

If you give autowrap a tempdir argument, it saves all the source files (.c, .h, .pyx) and a .pyd (win) / .so (unix) file that can be used to import the function later on with (assuming the tempdir is in sys.path):

from wrapper_module_1 import autofunc_c

The wrapped function only takes scalar inputs, so to apply it to arrays you can use

func = np.vectorize(func)
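Putting the autowrap pieces together, here is a minimal sketch for the question's small g(x) (the ./autowrap_out directory name is just an example, and Cython plus a C compiler are required):

import numpy as np
import sympy as sp
from sympy.abc import x
from sympy.utilities.autowrap import autowrap

# wrap the expression and keep the generated sources in ./autowrap_out
expr = sp.cos(x) + sp.cos(x)**2 + sp.cos(x)**3
func = autowrap(expr, backend='cython', tempdir='./autowrap_out')

print(func(0.5))                      # scalar call
vfunc = np.vectorize(func)            # elementwise wrapper for arrays
print(vfunc(np.linspace(0, 1, 50)))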
So this might not be the most optimal way to do it, but for my small example it works. The idea of the following code is to lambdify each common subexpression and generate a new function from it, taking possibly all previous arguments. I added a few additional sin and cos terms to introduce possible dependencies between subexpressions.
import sympy as sp
import sympy.abc
import numpy as np

def g(x):
    return sp.cos(x) + sp.cos(x)**2 + sp.cos(x)**3 + sp.sin(sp.cos(x)+sp.sin(x))**4 + sp.sin(x) - sp.cos(3*x) + sp.sin(x)**2

# split g into replacement pairs (repl) and the reduced expression (redu)
repl, redu = sp.cse(g(sp.abc.x))

# lambdify each subexpression; each one may depend on x and on all
# previously defined subexpression symbols
funs = []
syms = [sp.abc.x]
for i, v in enumerate(repl):
    funs.append(sp.lambdify(syms, v[1], modules='numpy'))
    syms.append(v[0])

# lambdify the reduced expression in terms of x and all subexpressions
glam = sp.lambdify(syms, redu[0], modules='numpy')

x = np.linspace(-1, 5, 10)

# evaluate the subexpressions in order, feeding earlier results forward
xs = [x]
for f in funs:
    xs.append(f(*xs))

print glam(*xs)

# sanity check against a plain lambdify of the full expression
glamlam = sp.lambdify(sp.abc.x, g(sp.abc.x), modules='numpy')
print glamlam(x)
print np.allclose(glamlam(x), glam(*xs))
repl contains:
[(x0, cos(x)), (x1, sin(x)), (x2, x0 + x1)]
and redu contains:
[x0**3 + x0**2 + x1**2 + x2 + sin(x2)**4 - cos(3*x)]
So funs contains all the subexpressions lambdified, and the list xs contains each subexpression evaluated, so that glam can be fed properly at the end. Note that xs grows with each subexpression and might turn out to be a bottleneck in the end.
You can apply the same approach to the expression from sp.cse(sp.diff(g(sp.abc.x))).
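As a final note, if you can use a newer SymPy (1.9 or later), lambdify accepts a cse keyword that applies this decomposition internally, which answers the original question directly:

import numpy as np
import sympy as sp

x = sp.symbols('x')
# cse=True makes lambdify run common subexpression elimination
# on the expression before generating the numerical function
g_expr_fast = sp.lambdify(x, sp.cos(x) + sp.cos(x)**2 + sp.cos(x)**3,
                          modules='numpy', cse=True)
print(g_expr_fast(np.linspace(0, 1, 50)))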