Python: Finding a trend in a set of numbers

Tags: python, math

I have a list of numbers in Python, like this:

x = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]

What's the best way to find the trend in these numbers? I'm not interested in predicting what the next number will be, I just want to output the trend for many sets of numbers so that I can compare the trends.

Edit: By trend, I mean that I'd like a numerical representation of whether the numbers are increasing or decreasing and at what rate. I'm not massively mathematical, so there's probably a proper name for this!

Edit 2: It looks like what I really want is the co-efficient of the linear best fit. What's the best way to get this in Python?


asked Apr 06 '12 19:04

Sam Starling

2 Answers

Possibly you mean you want to plot these numbers on a graph and find a straight line through them where the overall distance between the line and the numbers is minimized? This is called linear regression:

def linreg(X, Y):
    """
    Return (a, b) for the line y = a*x + b such that the root mean
    square distance between the trend line and the original points
    is minimized.
    """
    N = len(X)
    Sx = Sy = Sxx = Sxy = 0.0
    for x, y in zip(X, Y):
        Sx += x
        Sy += y
        Sxx += x * x
        Sxy += x * y
    det = Sxx * N - Sx * Sx
    return (Sxy * N - Sy * Sx) / det, (Sxx * Sy - Sx * Sxy) / det

# Your x holds the y values in standard notation; the list indices are the x values.
x = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]
a, b = linreg(range(len(x)), x)

The trend line is unlikely to pass through your original points, but it will be as close as possible to the original points that a straight line can get. Using the gradient and intercept values of this trend line (a,b) you will be able to extrapolate the line past the end of the array:

extrapolatedtrendline = [a * index + b for index in range(20)]  # replace 20 with the desired trend length
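Since the question is about comparing trends across many sets of numbers, here is a minimal sketch of how the slope alone could be used for that. The helper name trend_slope and the series names are made up for illustration, and it assumes each list is evenly spaced so that the indices 0, 1, 2, ... can stand in for the x values:

```python
def trend_slope(values):
    """Least-squares slope of values plotted against their indices 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = (n - 1) / 2.0              # mean of the indices 0..n-1
    mean_y = sum(values) / float(n)
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


# Hypothetical data sets; only the first comes from the question.
series = {
    "question_data": [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81],
    "falling": [10, 8, 6, 4, 2],
    "flat": [5, 5, 5, 5],
}

slopes = {name: trend_slope(vals) for name, vals in series.items()}

# List the series from steepest rise to steepest fall.
for name in sorted(slopes, key=slopes.get, reverse=True):
    print(name, round(slopes[name], 3))
```

A positive slope means the numbers are rising on average, a negative slope means they are falling, and the magnitude is the average change per step. Slopes are only directly comparable between series that share the same scale and spacing.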

answered Sep 21 '22 19:09

Riaz Rizvi

The link provided by Keith or the answer from Riaz may help you get the polynomial fit, but it is generally better to use a library when one is available. For this problem, numpy provides a polynomial fit function called polyfit, which can fit data to a polynomial of any degree.

Here is an example using numpy to fit the data to a linear equation of the form y = ax + b:

>>> import numpy as np
>>> data = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]
>>> x = np.arange(0, len(data))
>>> y = np.array(data)
>>> z = np.polyfit(x, y, 1)
>>> print("{0:.12g}x + {1:.12g}".format(*z))
4.32527472527x + 17.6

Similarly, a quadratic fit would be:

>>> z = np.polyfit(x, y, 2)
>>> print("{0:.12g}x^2 + {1:.12g}x + {2:.12g}".format(*z))
0.311126373626x^2 + 0.280631868132x + 25.6892857143
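For comparing many sets of numbers, polyfit can also fit several series in one call, provided they all have the same length and share the same x values. The sketch below assumes that; the second series is simply the question's data reversed, made up for illustration:

```python
import numpy as np

data = [12, 34, 29, 38, 34, 51, 29, 34, 47, 34, 55, 94, 68, 81]

# One series per row; the second row is illustrative (the first one reversed).
series = np.array([data, data[::-1]], dtype=float)

x = np.arange(series.shape[1])

# polyfit expects one data set per column when y is 2-D, hence the transpose.
# For degree 1 the result has two rows: slopes first, then intercepts.
slopes, intercepts = np.polyfit(x, series.T, 1)

for i, (a, b) in enumerate(zip(slopes, intercepts)):
    print("series {0}: slope {1:.4f}, intercept {2:.4f}".format(i, a, b))
```

The slopes array can then be sorted or thresholded directly to rank the series by trend. If the series have different lengths, they would need to be fitted one at a time instead.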

answered Sep 19 '22 19:09

Abhijit
