How to identify the best frequency in a time series?

I have database metrics grouped by day, and I need to forecast the data for the next 3 months. The data are seasonal (I believe the seasonality follows the days of the week).

I want to use the Holt-Winters method in R. For that I need to create a time series object, which asks for a frequency (which I think is 7). But how can I be sure? Is there a function to identify the best frequency?

I'm using:

FID_TS <- ts(FID_DataSet$Value, frequency=7)

FID_TS_Observed <- HoltWinters(FID_TS)

If I decompose this data with decompose(FID_TS), I have:

[image: plot of decompose(FID_TS)]

And this is my first forecast FID_TS_Observed:

[image: plot of the FID_TS_Observed forecast]

When I look at the history of the last year, the values start low in the first 3 months, increase from month 3 to month 11, and then decrease again.

Maybe my daily data have both a weekly seasonality (frequency=7) and a monthly seasonality (frequency=7x30=210)? Do I need the last 365 days?

Is there any way to set the frequency by day of the week and by month? Also, does it make any difference whether I use the whole last year or just part of it in the Holt-Winters method?

Thanks in advance :)

Evan Bessa asked Nov 08 '22 09:11

1 Answer

Usually, the frequency (or seasonality; you seem to be using the words interchangeably in your post) is determined by domain knowledge. For example, if I am working in the restaurant business and analyzing an hourly data set of customers, I know that I will have a 24-hour seasonal period, with spikes at lunch time and dinner time, and another 168-hour period (24 * 7), because there will be a weekly pattern to my customers.

If for some reason you don't have domain knowledge, you can use the ACF and the PACF, as well as Fourier analysis, to find the dominant frequencies in your data.
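
As a rough illustration of that idea, here is a minimal base-R sketch. The series is made up (a synthetic daily signal with a 7-day cycle), so treat the numbers as an assumption rather than anything from the asker's data. It reads a candidate period off the ACF and off the periodogram.

```r
# Synthetic daily series with a weekly cycle (made-up data, for illustration only)
set.seed(42)
n <- 364
day <- seq_len(n)
x <- 10 + 3 * sin(2 * pi * day / 7) + rnorm(n, sd = 0.5)

# ACF: a seasonal series shows peaks at multiples of its period
acf_vals <- acf(x, lag.max = 30, plot = FALSE)$acf[-1]  # drop lag 0
best_lag <- which.max(acf_vals)                         # lag with the highest autocorrelation

# Periodogram (Fourier analysis): the strongest frequency gives period = 1 / frequency
pg <- spec.pgram(x, fast = FALSE, plot = FALSE)
dominant_period <- 1 / pg$freq[which.max(pg$spec)]

print(best_lag)
print(dominant_period)
```

On data like this both values should land on or near 7. On real data it is worth plotting `acf(x)` and `pacf(x)` and looking at the peaks yourself rather than trusting a single maximum.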

Is there any way to set the frequency by day of the week and by month?

With Holt-Winters, no. HW takes only one seasonal component. For multiple seasonal components, you should try TBATS. As Xiaoxi Wu pointed out, FB Prophet can model multiple seasonalities, and Google's BSTS package can as well.
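
A minimal sketch of the TBATS route, assuming the `forecast` package is installed (the code is skipped if it is not). The data are synthetic, and the 30.4-day "monthly" period is only an assumed approximation of a calendar month, not something taken from the question.

```r
# Assumes the 'forecast' package is available; the sketch is skipped otherwise
if (requireNamespace("forecast", quietly = TRUE)) {
  set.seed(7)
  n <- 365
  day <- seq_len(n)
  # Made-up daily metric with a weekly and a roughly monthly cycle
  value <- 100 + 10 * sin(2 * pi * day / 7) + 5 * sin(2 * pi * day / 30.4) + rnorm(n)

  # msts() attaches several seasonal periods to one series
  y <- forecast::msts(value, seasonal.periods = c(7, 30.4))

  # TBATS fits trigonometric terms for each seasonal period
  fit <- forecast::tbats(y, use.parallel = FALSE)

  # Forecast roughly 3 months (90 days) ahead
  fc <- forecast::forecast(fit, h = 90)
  print(head(fc$mean))
}
```

TBATS accepts non-integer periods, which is why something like 30.4 or 365.25 can be passed directly.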

Also, does it make any difference whether I use the whole last year or just part of it in the Holt-Winters method?

Yes, it does. If you want to model a seasonality, you need at least two full seasonal periods of data (preferably more); otherwise your model has no way of knowing whether a spike is a seasonal variation or just a one-time impulse. For example, to model a weekly seasonality you need at least 14 days of training data (plus whatever you hold out for testing), and for a yearly seasonality you need at least 730 days of data.
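
To make the two-period rule concrete, here is a small base-R sketch on synthetic weekly data. The helper `make_weekly` is hypothetical, written only for this illustration. With fewer than two full weeks, `HoltWinters()` should refuse to fit a seasonal model; with ten weeks it fits and can forecast 90 days ahead.

```r
set.seed(1)

# Hypothetical helper: n days of made-up data with a 7-day cycle
make_weekly <- function(n) {
  ts(20 + 5 * sin(2 * pi * seq_len(n) / 7) + rnorm(n), frequency = 7)
}

# 10 days is less than two full periods: catch the error message instead of stopping
too_short <- tryCatch(
  HoltWinters(make_weekly(10)),
  error = function(e) conditionMessage(e)
)
print(too_short)

# 70 days (ten full weeks) is enough to estimate the weekly pattern
fit <- HoltWinters(make_weekly(70))
fc <- predict(fit, n.ahead = 90)  # forecast about 3 months of daily values
print(length(fc))
```

Two periods is only the minimum the method needs to start. More history generally gives steadier seasonal estimates.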

Alex Kinman answered Nov 14 '22 22:11
