
extrapolating data with numpy/python

Let's say I have a simple data set. Perhaps in dictionary form, it would look like this:

{1:5, 2:10, 3:15, 4:20, 5:25}

(the order is always ascending). What I want to do is logically figure out what the next data point is most likely to be. In this case, for example, it would be {6: 30}.

What would be the best way to do this?

corvid asked Oct 16 '13 14:10



1 Answer

You can also use numpy's polyfit:

import numpy as np

data = np.array([[1, 5], [2, 10], [3, 15], [4, 20], [5, 25]])
fit = np.polyfit(data[:, 0], data[:, 1], 1)  # degree 1 means a linear fit

fit
[  5.00000000e+00   1.58882186e-15]  # y = 5x + 0 (the tiny intercept is floating-point noise)

line = np.poly1d(fit)
new_points = np.arange(6, 11)  # x values 6 through 10

new_points
[ 6  7  8  9 10]

line(new_points)
[ 30.  35.  40.  45.  50.]

This allows you to alter the degree of the polynomial fit quite easily, as the function polyfit takes the following arguments: np.polyfit(x data, y data, degree). Shown is a linear fit; for any degree n the returned array holds n+1 coefficients, highest power first, so the polynomial is fit[0]*x^n + fit[1]*x^(n-1) + ... + fit[n]*x^0. The poly1d function allows you to turn this array into a function that returns the value of the polynomial at any given value x.
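As a minimal sketch of how the degree argument can be varied, starting from the dictionary in the question: the helper name extrapolate is hypothetical (it is not part of numpy), and the choice of degrees 1 and 2 is only for illustration.

```python
import numpy as np

# The data set from the question, as a dict of x -> y.
data = {1: 5, 2: 10, 3: 15, 4: 20, 5: 25}

# Split the dict into x and y arrays, sorted by x.
keys = sorted(data)
x = np.array(keys, dtype=float)
y = np.array([data[k] for k in keys], dtype=float)

def extrapolate(x, y, new_x, degree=1):
    """Hypothetical helper: fit a polynomial of the given degree, evaluate it at new_x."""
    coeffs = np.polyfit(x, y, degree)  # coefficients, highest power first
    return np.poly1d(coeffs)(new_x)

linear_prediction = extrapolate(x, y, 6, degree=1)
quadratic_prediction = extrapolate(x, y, 6, degree=2)
print(linear_prediction, quadratic_prediction)
```

Because the sample data is exactly linear, both fits should land very close to 30 at x = 6. On real data the two degrees will generally disagree, and the disagreement grows the further you move from the known points.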

In general, extrapolation without a well-understood model will give unreliable results at best.
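To illustrate that warning, here is a small sketch under stated assumptions: the "true" process is assumed to be a decaying exponential of the same form as the curve_fit example in this answer, and true_model is a hypothetical name. A degree-4 polynomial passes through all five sample points, yet it can be badly wrong outside the sampled range.

```python
import numpy as np

def true_model(x):
    # Assumed underlying process (hypothetical): a decaying exponential.
    return 2.5 * np.exp(-1.3 * x) + 0.5

# Five noise-free samples on [0, 4].
x = np.linspace(0, 4, 5)
y = true_model(x)

# A degree-4 polynomial interpolates five points exactly.
poly = np.poly1d(np.polyfit(x, y, 4))

# Compare the polynomial with the true model inside and outside the data range.
inside_error = abs(poly(2.5) - true_model(2.5))
outside_error = abs(poly(8.0) - true_model(8.0))
print("error at x=2.5 (inside the data): ", inside_error)
print("error at x=8.0 (outside the data):", outside_error)
```

The error inside the sampled interval stays small, while the error at x = 8 is far larger, because the polynomial knows nothing about the exponential's flat tail.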


Exponential curve fitting.

import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

x = np.linspace(0,4,5)
y = func(x, 2.5, 1.3, 0.5)
yn = y + 0.2*np.random.normal(size=len(x))

fit, cov = curve_fit(func, x, yn)
fit
[ 2.67217435  1.21470107  0.52942728]         # fitted parameters a, b, c

y
[ 3.          1.18132948  0.68568395  0.55060478  0.51379141]  # original data, no noise

func(x, *fit)
[ 3.20160163  1.32252521  0.76481773  0.59929086  0.5501627 ]  # model fitted to the noisy data, evaluated at x
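To tie this back to the extrapolation question, the fitted parameters can be passed to the model at x values beyond the data. The following is only a sketch: the fixed seed, the 50 sample points, the smaller noise level, and the initial guess p0 are all assumptions made here so the fit is reproducible and well constrained. They are not part of the example above.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    # Same model as above: exponential decay towards an offset c.
    return a * np.exp(-b * x) + c

# Assumed setup: seeded noise and more samples than in the example above.
rng = np.random.default_rng(0)
x = np.linspace(0, 4, 50)
yn = func(x, 2.5, 1.3, 0.5) + 0.05 * rng.normal(size=x.size)

# p0 is an assumed starting guess for (a, b, c).
params, cov = curve_fit(func, x, yn, p0=(1.0, 1.0, 1.0))

# Evaluate the fitted model beyond the last data point at x = 4.
new_x = np.array([5.0, 6.0])
predicted = func(new_x, *params)

# One-standard-deviation uncertainty of each fitted parameter.
param_std = np.sqrt(np.diag(cov))
print(predicted, param_std)
```

The covariance matrix returned by curve_fit gives a rough idea of how well each parameter is constrained, which is worth checking before trusting any extrapolated value.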
Daniel answered Oct 13 '22 04:10