Python Seasonal decompose Freq parameter determination

Although this question seems to have been asked many times, I cannot figure out why seasonal_decompose doesn't work in my case, even though I pass it a DataFrame with a DatetimeIndex. Here is an example of my dataset:

    Customer order actual date  Sales Volumes
0   01/01/1900                           300
1   10/03/2008                          3000
2   15/11/2013                            10
3   23/12/2013                           200
4   04/03/2014                             5
5   17/03/2014                            30
6   22/04/2014                             1
7   26/06/2014                           290
8   30/06/2014                            40

The code snippet is shown below:

from statsmodels.tsa.seasonal import seasonal_decompose
df_agg['Customer order actual date'] = pd.to_datetime(df_agg['Customer order actual date'])
df_agg = df_agg.set_index('Customer order actual date')
df_agg.reset_index().sort_values('Customer order actual date', ascending=True)
decomposition = seasonal_decompose(np.asarray(df_agg['Sales Volumes'] ), model = 'multiplicative')

But I systematically get the following error:

ValueError: You must specify a freq or x must be a pandas object with a timeseries index witha freq not set to None

Could you please explain why I should give a freq input even though I am using a DataFrame with a DatetimeIndex? Does it make sense to give a frequency as an input parameter when the seasonality is what I am looking for as an output of seasonal_decompose?

asked May 31 '18 05:05

Galileo

3 Answers

The seasonal_decompose function gets the frequency from the index's inferred_freq attribute. Here is the link - https://pandas-docs.github.io/pandas-docs-travis/generated/pandas.DatetimeIndex.html

inferred_freq, in turn, is produced by infer_freq, and when infer_freq is passed a Series it uses the values of the Series and not its index. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.infer_freq.html

This might be a reason why freq needs to be set to a value even with a timeseries index.
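To see what inferred_freq reports in practice, here is a minimal sketch. The regular dates are made up for illustration and the irregular ones are taken from the question's sample; it only shows pandas behaviour, not what statsmodels does internally.

```python
import pandas as pd

# Regularly spaced dates: no freq is set, but pandas can infer a daily one.
regular = pd.DatetimeIndex(["2014-03-01", "2014-03-02", "2014-03-03", "2014-03-04"])
print(regular.freq)           # None: nothing was set explicitly
print(regular.inferred_freq)  # 'D'

# Irregularly spaced dates, as in the question's sample: nothing can be inferred.
irregular = pd.to_datetime(
    ["2013-11-15", "2013-12-23", "2014-03-04", "2014-03-17", "2014-04-22"]
)
print(irregular.inferred_freq)  # None
```

With dates spaced like those in the question, there is no frequency for seasonal_decompose to pick up from the index.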

And in case you want to know what frequency means in seasonal_decompose(): it is a property of your data. So if you collected your data month by month, it has a monthly frequency.

The function used in seasonal_decompose() to look up the frequency is _maybe_get_pandas_wrapper_freq().

I did some research on seasonal_decompose(), and here are some links that might help you understand the function's source code:

source code of seasonal decomposition - https://github.com/statsmodels/statsmodels/blob/master/statsmodels/tsa/seasonal.py

Check out - _maybe_get_pandas_wrapper_freq https://searchcode.com/codesearch/view/86129760/

Hope this helps! Let me know if you find something interesting in addition to it.

answered Oct 27 '22 00:10

Analyst17

Two points on your code snippet.

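On line 4 of your code you are resetting the index and sorting, but you are not assigning the result to anything, so it is discarded. Assign the result back to df_agg (inplace=True would not help on a chained call like this one, because reset_index(inplace=True) returns None). Note also that reset_index() turns the date index back into an ordinary column.
seasonal_decompose works on time series, so your data needs to have a datetime index (you can set it either while loading the CSV or with the pd.to_datetime() function).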
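As a rough illustration of both points, the sketch below parses the dates, keeps them as the index, and assigns the sorted result back. The three rows are copied from the question's sample; the explicit format string assumes the dates are DD/MM/YYYY, which the sample suggests but does not state.

```python
import pandas as pd

df_agg = pd.DataFrame(
    {
        "Customer order actual date": ["23/12/2013", "15/11/2013", "04/03/2014"],
        "Sales Volumes": [200, 10, 5],
    }
)

# Assumption: the dates are day-first (DD/MM/YYYY).
df_agg["Customer order actual date"] = pd.to_datetime(
    df_agg["Customer order actual date"], format="%d/%m/%Y"
)

# Keep the dates as the index and assign the sorted frame back to df_agg.
df_agg = df_agg.set_index("Customer order actual date").sort_index()
print(df_agg)
```

Using sort_index() instead of reset_index().sort_values(...) keeps the DatetimeIndex in place, which is what the later steps rely on.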

answered Oct 27 '22 00:10

yosemite_k

First of all, if you hand an np.asarray(...) to seasonal_decompose, it will see only an array; your index is gone. So get rid of the np.asarray.

Secondly, if you look at df_agg['Sales Volumes'].index you will see freq=None, and that is what causes the function to complain. You need an actual frequency such as D (daily) or M (monthly). You can set one via df_agg = df_agg.asfreq('D').

Last, but not least: your sample data do not follow any regular frequency. asfreq will fill in the missing dates, but you will get lots of NaN.
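A small sketch of the last two points, using three rows from the question's sample (written in ISO format here for brevity): the index starts with freq=None, asfreq('D') gives it a daily frequency, and almost every inserted day is NaN.

```python
import pandas as pd

dates = pd.to_datetime(["2014-03-04", "2014-03-17", "2014-04-22"])
sales = pd.Series([5, 30, 1], index=dates, name="Sales Volumes")
print(sales.index.freq)  # None

# asfreq returns a new Series on a regular daily grid; the original is unchanged.
daily = sales.asfreq("D")
print(daily.index.freq)                # <Day>
print(len(daily), daily.isna().sum())  # total rows, and how many of them are NaN
```

Only the three original dates carry a value afterwards, so the gaps would still have to be filled or aggregated away before a decomposition makes sense.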

If you want to look up the abbreviations for freqs, they are listed under offset aliases in the pandas time series documentation.

answered Oct 26 '22 23:10

Rriskit


Tags:

python

statsmodels