Best practices for efficient multiple time series analysis

I have a large number of time series (>100) which differ in sampling frequency and in the time period for which they are available. Each series has to be tested for unit roots, seasonally adjusted, and put through other preliminary transformations and checks.

Since a large number of series have to be routinely checked, what is an efficient way to do this? The goal is to save time on the routine steps and to keep track of the series and the analysis results. Unit root testing, for example, involves subjective judgment. How much of this kind of analysis can be automated, and how?

I have already read the questions regarding statistical workflow, which suggest having a common script to run on each series.

I am asking something more specific, based on experience of handling datasets with many time series. The focus is on minimizing errors while dealing with so many series, and on automating repetitive tasks.

Anusha asked Oct 19 '11


1 Answer

I assume the series will be examined independently, as you've not mentioned any inter-relationships in the models. I'm not sure what kind of object you're looking to use or which tests, but the basic goal of "best practices" is independent of the actual package to be used.

The simplest approach is to load the series into a list and analyze each one via simple iterators such as lapply, or via multicore methods such as mclapply or foreach, in R. In MATLAB, you can operate over cell arrays, and the Parallel Computing Toolbox has a construct called parfor, for "parallel for", which is similar to the foreach function in R. For my money, I'd recommend R, as it's cheaper (free) and has much richer functionality for statistical analyses. MATLAB has better documentation and help tools, but these tend to matter less over time as you become more familiar with the tools and methods of your research (and as your bookshelf of references grows).
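For example, here is a minimal sketch of that list-plus-lapply pattern, assuming the series are stored as ts objects in a named list and using adf.test() from the tseries package as the unit-root check (the list names, toy data, and choice of test are illustrative assumptions, not a prescription):

```r
## Sketch: one named list of ts objects, one checking function, one lapply.
## adf.test() comes from the tseries package; swap in whatever tests and
## transformations your workflow actually needs.
library(tseries)

series_list <- list(
  monthly_a   = ts(cumsum(rnorm(120)), frequency = 12),  # toy random walk
  quarterly_b = ts(rnorm(80), frequency = 4)             # toy stationary series
)

check_series <- function(x) {
  adf <- adf.test(x)                      # augmented Dickey-Fuller test
  data.frame(
    n         = length(x),
    frequency = frequency(x),
    adf_stat  = unname(adf$statistic),
    adf_p     = adf$p.value               # small p-value: evidence against a unit root
  )
}

results <- lapply(series_list, check_series)
summary_table <- do.call(rbind, results)  # one row per series, keyed by list name
print(summary_table)
```

Keeping every series in one named list and every result in one table is what keeps the bookkeeping honest as the number of series grows: the series name travels with its results automatically.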

It's good to become accustomed to using multicore tools in general, as this can substantially decrease the time it takes to do analyses on a bunch of independent small objects.
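Switching the same loop to multiple cores is then a one-line change. A sketch using base R's parallel package (mclapply() relies on forking and so is not available on Windows, where mc.cores must be 1; parLapply() with a cluster is the usual substitute there):

```r
## Parallel variant of the same loop: mclapply() forks one worker per core
## on Unix-alikes and returns results in the same named-list form as lapply().
library(parallel)

n_cores <- max(1, detectCores() - 1)  # leave one core free for the rest of the system
results <- mclapply(series_list, check_series, mc.cores = n_cores)
summary_table <- do.call(rbind, results)
```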

Iterator answered Sep 29 '22