
Apply non-linear regression for multi dimension data samples in Python

I have installed NumPy and SciPy, but I don't quite understand their documentation about polyfit.

For example, here are my data samples:

[-0.042780748663101636, -0.0040771571786609945, -0.00506567946276074]
[0.042780748663101636, -0.0044771571786609945, -0.10506567946276074]
[0.542780748663101636, -0.005771571786609945, 0.30506567946276074]
[-0.342780748663101636, -0.0304077157178660995, 0.90506567946276074]

The first two columns are the sample features, and the third column is the output. My goal is to get a function that takes two parameters (the first two columns) and returns a prediction (the output).

Any simple example?

====================== EDIT ======================

Note that I need to fit something like a curve, not just straight lines. The polynomial should be something like this (n = 3):

a*x1^3 + b*x2^2 + c*x3 + d = y

Not:

a*x1 + b*x2 + c*x3 + d = y

x1, x2, x3 are the features of one sample, and y is the output.

asked Feb 16 '23 by WoooHaaaa


1 Answer

Try something like

edit: added an example function that uses the results of the linear regression to estimate the output.

import numpy as np

data = np.array(
    [[-0.042780748663101636, -0.0040771571786609945, -0.00506567946276074],
     [0.042780748663101636, -0.0044771571786609945, -0.10506567946276074],
     [0.542780748663101636, -0.005771571786609945, 0.30506567946276074],
     [-0.342780748663101636, -0.0304077157178660995, 0.90506567946276074]])

coefficient = data[:, 0:2]  # the two feature columns
dependent = data[:, -1]     # the output column

# least-squares solution of coefficient . x = dependent
x, residuals, rank, s = np.linalg.lstsq(coefficient, dependent, rcond=None)

def f(x, u, v):
    # linear model: prediction for features (u, v) with coefficients x
    return u * x[0] + v * x[1]

for datum in data:
    print(f(x, *datum[0:2]))

Which gives

>>> x
array([  0.16991146, -30.18923739])
>>> residuals
array([ 0.07941146])
>>> rank
2
>>> s
array([ 0.64490113,  0.02944663])

and the function built from those coefficients gives

0.115817326583
0.142430900298
0.266464019171
0.859743371665

More info can be found in the documentation I posted as a comment.
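
Note that the model above has no constant term. If you also want an intercept (the d in the question), one common trick, shown here as a minimal sketch reusing the data array from above (A is just a name I picked), is to append a column of ones to the feature matrix so that lstsq estimates the constant as one more coefficient:

import numpy as np

# append a column of ones so lstsq also estimates a constant term
A = np.hstack([data[:, 0:2], np.ones((data.shape[0], 1))])
x, residuals, rank, s = np.linalg.lstsq(A, data[:, -1], rcond=None)

# the model is now u*x[0] + v*x[1] + x[2]
print(A @ x)  # fitted values for all four samples at once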

edit 2: fitting your data to an arbitrary model.

edit 3: made my model a function for ease of understanding.

edit 4: made the code easier to read and changed the model to a quadratic fit; you should be able to read this code and see how to make it minimize any residual you want.

contrived example:

import numpy as np
from scipy.optimize import leastsq

data = np.array(
    [[-0.042780748663101636, -0.0040771571786609945, -0.00506567946276074],
     [0.042780748663101636, -0.0044771571786609945, -0.10506567946276074],
     [0.542780748663101636, -0.005771571786609945, 0.30506567946276074],
     [-0.342780748663101636, -0.0304077157178660995, 0.90506567946276074]])

coefficient = data[:, 0:2]  # feature columns
dependent = data[:, -1]     # output column

def model(p, x):
    # quadratic in the first feature, linear in the second
    a, b, c = p
    u = x[:, 0]
    v = x[:, 1]
    return a * u**2 + b * v + c

def residuals(p, y, x):
    # error between the observed outputs and the model predictions
    return y - model(p, x)

p0 = np.array([2.0, 3.0, 4.0])  # some initial guess

p = leastsq(residuals, p0, args=(dependent, coefficient))[0]

def f(p, x):
    # evaluate the fitted quadratic model for a single sample
    a, b, c = p
    return a * x[0]**2 + b * x[1] + c

for x in coefficient:
    print(f(p, x))

gives, approximately,

-0.0612
-0.0484
0.3054
0.9042
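
If you want a shape closer to the cubic in the question edit, only the model function (and the length of the initial guess) needs to change. Here is a minimal sketch continuing from the variables above; cubic_model and cubic_residuals are names of my own, and I am assuming a*u**3 + b*v**2 + c*v + d as the target form since the data has two features:

import numpy as np
from scipy.optimize import leastsq

def cubic_model(p, x):
    # cubic in the first feature, quadratic plus linear in the second,
    # with an intercept term d
    a, b, c, d = p
    u = x[:, 0]
    v = x[:, 1]
    return a * u**3 + b * v**2 + c * v + d

def cubic_residuals(p, y, x):
    return y - cubic_model(p, x)

p0 = np.ones(4)  # initial guess, one entry per parameter
p = leastsq(cubic_residuals, p0, args=(dependent, coefficient))[0]

print(cubic_model(p, coefficient))  # predictions for all four samples

Keep in mind that with only four samples and four parameters the fit will typically reproduce the data exactly, so you need more samples before the extra terms tell you anything.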
answered Feb 18 '23 by seth