I have a list of numbers in Python, like this:
x = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]
What's the best way to find the trend in these numbers? I'm not interested in predicting what the next number will be; I just want to output the trend for many sets of numbers so that I can compare the trends.
Edit: By trend, I mean that I'd like a numerical representation of whether the numbers are increasing or decreasing and at what rate. I'm not massively mathematical, so there's probably a proper name for this!
Edit 2: It looks like what I really want is the coefficient of the linear best fit. What's the best way to get this in Python?
Possibly you mean you want to plot these numbers on a graph and find a straight line through them where the overall distance between the line and the numbers is minimized? This is called linear regression.
def linreg(X, Y):
    """
    Return a, b in the solution to y = ax + b such that the root mean square
    distance between the trend line and the original points is minimized.
    """
    N = len(X)
    Sx = Sy = Sxx = Sxy = 0.0
    for x, y in zip(X, Y):
        Sx += x
        Sy += y
        Sxx += x * x
        Sxy += x * y
    det = Sxx * N - Sx * Sx
    return (Sxy * N - Sy * Sx) / det, (Sxx * Sy - Sx * Sxy) / det

x = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]
# The list indices act as the x values and the data as the y values
a, b = linreg(range(len(x)), x)
The trend line is unlikely to pass through your original points, but it will be as close as possible to the original points that a straight line can get. Using the gradient and intercept values of this trend line (a,b) you will be able to extrapolate the line past the end of the array:
extrapolatedtrendline = [a * index + b for index in range(20)]  # replace 20 with desired trend length
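Since the goal in the question is to compare trends across many lists, the gradient on its own is probably enough. Here is a minimal sketch, assuming every list is sampled at evenly spaced positions 0, 1, 2, and so on. The helper name `slope` and the "falling" and "flat" series are made up for illustration; only the first list comes from the question.

```python
def slope(values):
    """Least-squares slope of values against their indices 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / float(n)
    # Covariance of (index, value) divided by the variance of the index
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx

# Hypothetical collection of series; only "question" is from the original post
series = {
    "question": [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81],
    "falling": [10, 8, 6, 4, 2],
    "flat": [5, 5, 5, 5],
}

trends = {name: slope(values) for name, values in series.items()}

# Print the series from most steeply falling to most steeply rising
for name, value in sorted(trends.items(), key=lambda item: item[1]):
    print(name, round(value, 4))
```

A positive slope means the numbers are rising, a negative one means they are falling, and the size is the average change per step. If your lists are on very different scales, you may want to divide each slope by the mean of its list before comparing, but that is a judgment call rather than part of the regression itself.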
The link provided by Keith, or the answer from Riaz, might help you get the polynomial fit, but it is generally better to use a library when one is available. For the problem at hand, numpy provides a polynomial fit function called polyfit, which can fit the data with a polynomial of any degree.
Here is an example using numpy to fit the data to a linear equation of the form y = ax + b:
>>> import numpy as np
>>> data = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]
>>> x = np.arange(0, len(data))
>>> y = np.array(data)
>>> z = np.polyfit(x, y, 1)
>>> print("{0:.12g}x + {1:.12g}".format(*z))
4.32527472527x + 17.6
Similarly, a quadratic fit would be:
>>> z = np.polyfit(x, y, 2)
>>> print("{0:.12g}x^2 + {1:.12g}x + {2:.12g}".format(*z))
0.311126373626x^2 + 0.280631868132x + 25.6892857143
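If you are unsure which degree to use, one option is to evaluate each fit with np.polyval and look at how much error is left over. This is only a rough sketch, assuming numpy is installed; the names `errors`, `coeffs` and `fitted` are illustrative.

```python
import numpy as np

data = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]
x = np.arange(len(data))
y = np.array(data, dtype=float)

errors = {}
for degree in (1, 2):
    coeffs = np.polyfit(x, y, degree)   # coefficients, highest power first
    fitted = np.polyval(coeffs, x)      # value of the fitted curve at each index
    # Sum of squared differences between the data and the fitted curve
    errors[degree] = float(np.sum((y - fitted) ** 2))
    print(degree, coeffs, errors[degree])
```

On the same data the quadratic can never leave a larger squared error than the straight line, so a smaller number here does not by itself mean it describes the trend better. For comparing many lists, the single slope from the degree 1 fit is usually the easier figure to work with.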