What is the most appropriate way to express the following in SymPy:
A sum over samples 'x[i]' with 'i' going from concrete 0 to symbolic 'N'. 'x[i]' itself shall be symbolic, i.e. always appear as variable.
The goal is to use these expressions in a system of linear equations.
Suppose a set of samples (x[i], y[i]) is supposed to lie on a line given by 'y = m*x + a'. That is, the estimated line is determined by 'm' and 'a'. The error between the samples and the estimated line may be given by
error(m, a) = sum((m * x[i] + a - y[i]) ** 2, start_i=0, end_i=N)
Now, searching for the zeros of the derivatives 'd/dm error(m,a)' and 'd/da error(m,a)' yields the 'm' and 'a' that minimize the error. How can I find the solution with SymPy?
Given your later question, I assume you have already figured most of this out, but for clarity's sake: samples are treated as functions (which makes sense, since an indexed set of samples is really a function defined on part of the integers), so the notation is x(i). The sum itself can be built with the summation function or the Sum constructor. The first is preferable, since it automatically evaluates constant addends, as in summation(x, (i, 0, n)).
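The difference between the two can be seen in a small sketch. The constant symbol c is a hypothetical name introduced here for illustration and is not part of the original example:

```python
from sympy import Sum, simplify, summation, symbols

# c is a hypothetical constant addend; i is the index, n the symbolic upper bound.
c, i, n = symbols('c i n')

evaluated = summation(c, (i, 0, n))   # summation() tries to evaluate right away
unevaluated = Sum(c, (i, 0, n))       # Sum() only builds an unevaluated Sum object

print(evaluated)
print(unevaluated)
print(unevaluated.doit())             # doit() asks the Sum object to evaluate itself
```

When the addend depends on an undefined function such as x(i), summation has no closed form to offer and should hand back an unevaluated Sum, so either entry point ends up with the same kind of object.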
>>> from sympy import *
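>>> m, a, i, n = symbols('m a i n')
>>> x, y = symbols('x y', cls=Function)  # x(i), y(i) need callable Function objects; a plain Symbol cannot be called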
>>> err = summation((m * x(i) + a - y(i)) ** 2, (i, 0, n))
>>> pprint(err)
  n
 ___
 ╲
  ╲                       2
  ╱   (a + m⋅x(i) - y(i))
 ╱
 ‾‾‾
i = 0
Once you have given the summation function the addend expression and the (index, lower bound, upper bound) tuple, you can start working with the sum:
>>> diff(err, m)
Sum(2*(a + m*x(i) - y(i))*x(i), (i, 0, n))
>>> diff(err, a)
Sum(2*a + 2*m*x(i) - 2*y(i), (i, 0, n))
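Both derivatives are linear in m and a, so setting them to zero gives a 2x2 linear system. One possible way to solve it for a symbolic n is sketched below, under an assumption that is not part of the original answer: each sum is split by linearity, and the sums of x(i), y(i), x(i)**2 and x(i)*y(i) over i = 0..n are replaced by the plain symbols Sx, Sy, Sxx and Sxy (hypothetical names chosen for illustration).

```python
from sympy import Eq, simplify, solve, symbols

m, a, n = symbols('m a n')
# Hypothetical shorthand for the sums over i = 0..n:
#   Sx = sum x(i), Sy = sum y(i), Sxx = sum x(i)**2, Sxy = sum x(i)*y(i)
Sx, Sy, Sxx, Sxy = symbols('Sx Sy Sxx Sxy')

# d/dm err = 0 and d/da err = 0 after splitting the sums term by term.
# The constant a is summed over n + 1 terms because i runs from 0 to n inclusive.
eq_m = Eq(2*(m*Sxx + a*Sx - Sxy), 0)
eq_a = Eq(2*(m*Sx + a*(n + 1) - Sy), 0)

sol = solve([eq_m, eq_a], [m, a], dict=True)[0]
print(sol[m])
print(sol[a])

# Sanity check with samples (0, 1), (1, 3), (2, 5), (3, 7), which lie exactly
# on y = 2*x + 1, so the fit should recover m = 2 and a = 1.
data = {n: 3, Sx: 6, Sy: 16, Sxx: 14, Sxy: 34}
m_fit = sol[m].subs(data)
a_fit = sol[a].subs(data)
print(m_fit, a_fit)
```

Replacing the sums by symbols keeps the system purely polynomial in m and a, which avoids relying on solve to see through unevaluated Sum objects.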