Python: Finding a trend in a set of numbers

Tags: python, math

I have a list of numbers in Python, like this:

x = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81] 

What's the best way to find the trend in these numbers? I'm not interested in predicting what the next number will be, I just want to output the trend for many sets of numbers so that I can compare the trends.

Edit: By trend, I mean that I'd like a numerical representation of whether the numbers are increasing or decreasing and at what rate. I'm not massively mathematical, so there's probably a proper name for this!

Edit 2: It looks like what I really want is the coefficient (the slope) of the linear best fit. What's the best way to get this in Python?

Sam Starling asked Apr 06 '12 19:04



2 Answers

Possibly you mean that you want to plot these numbers on a graph and find the straight line through them that minimizes the overall distance between the line and the points? This is called linear regression.

    def linreg(X, Y):
        """
        Return (a, b) for the line y = a*x + b such that the root mean
        square distance between the trend line and the original points
        is minimized.
        """
        N = len(X)
        Sx = Sy = Sxx = Sxy = 0.0
        for x, y in zip(X, Y):
            Sx += x
            Sy += y
            Sxx += x * x
            Sxy += x * y
        det = Sxx * N - Sx * Sx
        return (Sxy * N - Sy * Sx) / det, (Sxx * Sy - Sx * Sxy) / det

    x = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]
    # Your x, y are switched from standard notation: the list holds the
    # y values, and its indices act as the x values.
    a, b = linreg(range(len(x)), x)

The trend line is unlikely to pass through your original points, but it will be as close as possible to the original points that a straight line can get. Using the gradient and intercept values of this trend line (a,b) you will be able to extrapolate the line past the end of the array:

    extrapolatedtrendline = [a * index + b for index in range(20)]  # replace 20 with desired trend length
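Since the question is about comparing trends across many sets of numbers, the slope on its own can serve as the comparison value. Here is a minimal sketch, assuming each list is evenly spaced so that its indices can act as the x values. The helper name `trend_slope` and the "falling" and "flat" lists are made up for illustration.

```python
def trend_slope(values):
    """Least-squares slope of values plotted against their indices 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    sx = sum(range(n))
    sy = sum(values)
    sxx = sum(i * i for i in range(n))
    sxy = sum(i * v for i, v in enumerate(values))
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)


datasets = {
    "original": [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81],
    "falling": [90, 85, 70, 72, 60, 41, 30],  # hypothetical data
    "flat": [5, 5, 5, 5],                     # hypothetical data
}

# One slope per data set; positive means rising, negative means falling.
slopes = {name: trend_slope(vals) for name, vals in datasets.items()}

# List the data sets from steepest rise to steepest fall.
for name, s in sorted(slopes.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {s:+.3f} per step")
```

The slope is expressed in the units of the data per index step, so it is only directly comparable between sets that are measured on the same scale.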
Riaz Rizvi answered Sep 21 '22 19:09


The link provided by Keith, or the answer from Riaz, might help you get the polynomial fit, but it is generally better to use a library when one is available. For this problem, numpy provides a polynomial fitting function called polyfit. You can use polyfit to fit the data with a polynomial of any degree.

Here is an example using numpy to fit the data to a linear equation of the form y = ax + b:

    >>> import numpy as np
    >>> data = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]
    >>> x = np.arange(0, len(data))
    >>> y = np.array(data)
    >>> z = np.polyfit(x, y, 1)
    >>> print("{0:.4f}x + {1:.4f}".format(*z))
    4.3253x + 17.6000

Similarly, a quadratic fit would be:

    >>> z = np.polyfit(x, y, 2)
    >>> print("{0:.4f}x^2 + {1:.4f}x + {2:.4f}".format(*z))
    0.3111x^2 + 0.2806x + 25.6893
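polyfit returns its coefficients with the highest power first, so a fitted curve can be evaluated at any x with Horner's rule. Here is a small sketch in plain Python, using the coefficients from the two fits in this answer rounded to four decimals. The helper name `eval_poly` is made up for illustration; numpy's own `np.polyval` does the same job.

```python
def eval_poly(coeffs, x):
    """Evaluate a polynomial whose coefficients are listed highest power first."""
    result = 0.0
    for c in coeffs:
        result = result * x + c  # Horner's rule: fold in one coefficient per step
    return result


# Coefficients from the fits in this answer, rounded to four decimals.
linear = [4.3253, 17.6]
quadratic = [0.3111, 0.2806, 25.6893]

# x = 0 and x = 13 are the first and last indices of the data;
# x = 20 extrapolates past the end of the list.
for x in (0, 13, 20):
    print(x, round(eval_poly(linear, x), 2), round(eval_poly(quadratic, x), 2))
```

The two fits diverge quickly outside the data range, so extrapolated values should be treated with caution.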
Abhijit answered Sep 19 '22 19:09