I have a dataset, and I'd like to find a mixed gaussian model by least square error method.
The code is like this:
from sklearn.neighbors import KernelDensity
kde = KernelDensity().fit(sample)
def gaussian_2d(x,y,meanx,meany,sigx,sigy,rho):
# rho <= 1
part1 = 1/(2*np.pi*sigx*sigy*sqrt(1-0.5**2))
part2 = -1/2*(1-rho**2)
part3 = (((x-meanx)/sigx)**2-2*rho*(x-meanx)*(y-meany)/(sigx*sigy)+((y-meany)/sigy)**2)
return part1*exp(part2*part3)
def square_error(f1,f2, u1,v1,sigu1,sigv1,rho1, u2,v2,sigu2,sigv2,rho2, u3,v3,sigu3,sigv3,rho3):
# 1. Generate Mixed Gaussian Model
def gaussian1(x,y):
return gaussian_2d(x,y,u1,v1,sigu1,sigv1,rho1)
def gaussian2(x,y):
return gaussian_2d(x,y,u2,v2,sigu2,sigv2,rho2)
def gaussian3(x,y):
return gaussian_2d(x,y,u3,v3,sigu3,sigv3,rho3)
mixed_model = f1*gaussian1(x,y)+f2*gaussian2(x,y)+(1-f1-f2)*gaussian3(x,y)
# 2. Calculate the sum of square error
sum_error = 0
for point in sample:
error = (exp(mixed_model(point)) - exp(kde.score(point)))**2
sum_error += error
return sum_error
# How can I add constraints:
# f1+f2 <= 1
# rho1,2,3 <= 1
result = sp.optimize.minimize(square_error)
But I don't know how to add constrains in minimize
method. The example in http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize is very hard to understand.
UPDATE: This is what I end up with,
result = sp.optimize.minimize(
square_error,
x0 = [0.2,0.5,
1,1,1,1,0.3,
1,1,1,1,0.3,
1,1,1,1,0.3,],
bounds = [(0., 1.),(0., 1.),
(None, None),(None, None),(0., None),(0., None),(0., 1.),
(None, None),(None, None),(0., None),(0., None),(0., 1.),
(None, None),(None, None),(0., None),(0., None),(0., 1.),])
But it gives me TypeError: square_error() takes exactly 17 arguments (1 given)
, what's the problem?
You can add only add bounds if the solver supports them, so only for method='L-BFGS-B'
, TNC
and SLSQP
.
The bounds are passed via a sequence of (min, max)
tuples whose length corresponds to the number of your parameters. An example for fitting with 3 parameters would be:
result = sp.optimize.minimize(
square_error,
method='L-BFGS-B',
bounds=[(0., 5.), (None, 1.e4), (None, None)])
Here, None
corresponds to no bound.
I'm afraid that constraints on a combination of parameters such as f1+f2 <= 1
in your example is not possible within the framework of bounds
in scipy.minimize
.
You can, however, simply return np.inf
in your cost function if your bounds are violated. I'm not sure about the stability of this, though:
def square_error(f1,f2, other_args):
if f1+f2 <= 1:
return np.inf
# rest of the cost function
Moreover, I'd suggest to use the python implementation of multivariate Gaussians instead of creating them from scratch. That will speed up your fitting, help to avoid bugs and be more readable.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With