I have a DataFrame with around 80,000 observations taken every 15 minutes. The seasonal parameter m is assumed to be 96, because the pattern repeats every 24 hours. When I feed this information into my auto_arima call, it runs for a long time (several hours) before the following error message is produced:
MemoryError: Unable to allocate 5.50 GiB for an array with shape (99, 99, 75361) and data type float64
The code that I am using:
from pmdarima import auto_arima

stepwise_fit = auto_arima(df['Hges'], seasonal=True, m=96,
                          stepwise=True, stationary=True, trace=True)
print(stepwise_fit.summary())
I tried resampling to hourly values to reduce the amount of data and lower the seasonal factor to m=24, but my computer still cannot compute the result.
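Roughly what that resampling attempt looks like (a minimal sketch; it assumes df has a DatetimeIndex, and the mean aggregation is my choice):

# Sketch of the hourly-resampling attempt; assumes df is the DataFrame above
# with a DatetimeIndex and the 'Hges' column (mean aggregation is an assumption).
from pmdarima import auto_arima

hourly = df['Hges'].resample('H').mean()                  # ~80,000 rows -> ~20,000 rows
stepwise_fit = auto_arima(hourly, seasonal=True, m=24,    # 24 hourly steps per day
                          stepwise=True, stationary=True, trace=True)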
How do you find the weighting factors with auto_arima when dealing with large data?
I don't recall the exact source where I read this, but neither auto.arima nor pmdarima is really optimized to scale, which might explain the issues you are facing.
But there is a more important point to note about your question: with 80K data points at 15-minute intervals, ARIMA probably isn't the best type of model for your use case anyway.
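One workaround that is often suggested for such long seasonal periods (my sketch, not something spelled out above) is to drop the seasonal ARMA terms and model the daily cycle with Fourier terms instead, for example with pmdarima's FourierFeaturizer. The 99x99xn array in your traceback is consistent with state-space matrices whose dimension grows with m, so keeping the ARIMA part non-seasonal is what makes this tractable. The harmonic count k=4 below is an arbitrary choice:

# Sketch: capture the period-96 daily cycle with Fourier regressors so auto_arima
# only searches non-seasonal orders, avoiding the huge seasonal state arrays.
# Assumes y is your series (df['Hges'] or a resampled version of it).
import pmdarima as pm
from pmdarima.pipeline import Pipeline
from pmdarima.preprocessing import FourierFeaturizer

pipe = Pipeline([
    ("fourier", FourierFeaturizer(m=96, k=4)),   # k sine/cosine pairs for the daily pattern
    ("arima", pm.arima.AutoARIMA(seasonal=False, stepwise=True,
                                 trace=True, suppress_warnings=True)),
])
pipe.fit(y)
forecast = pipe.predict(n_periods=96)            # next 24 hours at 15-minute resolution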