
stl() decomposition won't accept univariate ts object?

Tags:

r

I'm having issues with the stl() time series decomposition function in R: it tells me my ts object is not univariate when it actually is.

tsData <- ts(data = dummyData, start = c(2012,1), end = c(2014,12), frequency = 12)

> tsData
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2012  22  26  34  33  40  39  39  45  50  58  64  78
2013  51  60  80  80  93 100  96 108 111 119 140 164
2014 103 112 154 135 156 170 146 156 166 176 193 204

> class(tsData)
[1] "ts"

> stl(tsData, s.window = "periodic")
Error in stl(tsData, s.window = "periodic") : 
  only univariate series are allowed

> dput(dummyData)
structure(list(index = c(22L, 26L, 34L, 33L, 40L, 39L, 39L, 45L, 
50L, 58L, 64L, 78L, 51L, 60L, 80L, 80L, 93L, 100L, 96L, 108L, 
111L, 119L, 140L, 164L, 103L, 112L, 154L, 135L, 156L, 170L, 146L, 
156L, 166L, 176L, 193L, 204L)), .Names = "index", class = "data.frame", row.names = c(NA, 
-36L))

Does anyone know how to fix this issue?

moku asked Dec 01 '22 18:12


2 Answers

To avoid this kind of error, build the univariate time series from the raw values only when calling ts().

In other words, always pass just the values of your variable, not the whole structure that holds them. Let me explain with a very simple example:

Imagine you have a variable X (most probably imported or built from another data source) with dimension 100x1, i.e. a one-column matrix or data frame containing 100 values. If you want to make a univariate time series out of it, the wrong way is the one you used in your case, because the result keeps the two-dimensional shape of X:

ts(X, frequency=24)

Be careful: the correct way is to pass the values as a plain vector. If X is a matrix, this works:

ts(X[1:100], frequency=24)

or, more generally (this form also works when X is a data frame, where X[1:100] would try to select columns instead):

ts(X[1:100,1], frequency=24)

I hope this helps you avoid the problem the next time you need to make a univariate time series.
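
As a minimal sketch of the difference described above, assuming X is a one-column data frame (the column name value and the random numbers are made up for illustration), the two constructions can be compared with is.matrix():

```r
# Hypothetical data: a one-column data frame holding 100 values
set.seed(1)
X <- data.frame(value = rnorm(100))

# Passing the whole data frame keeps a 100 x 1 matrix shape inside the ts object
bad <- ts(X, frequency = 24)

# Passing only the column values gives a plain, truly univariate series
good <- ts(X[1:100, 1], frequency = 24)

is.matrix(bad)   # TRUE
is.matrix(good)  # FALSE
```

Both objects have class "ts", so class() alone does not reveal the problem; checking dim() or is.matrix() does.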

Elias answered Dec 05 '22 13:12


I'm not 100% sure about what the exact cause of the problem is, but you can fix this by passing dummyData$index to ts instead of the entire object:

tsData2 <- ts(
  data=dummyData$index, 
  start = c(2012,1), 
  end = c(2014,12), 
  frequency = 12)
##
R>  stl(tsData2, s.window="periodic")
 Call:
 stl(x = tsData2, s.window = "periodic")

Components
            seasonal     trend   remainder
Jan 2012 -24.0219753  36.19189   9.8300831
Feb 2012 -20.2516062  37.82808   8.4235219
Mar 2012  -0.4812396  39.46428  -4.9830367
Apr 2012 -10.1034302  41.32047   1.7829612
May 2012   0.6077088  43.17666  -3.7843705
Jun 2012   4.4723800  45.22411 -10.6964877
Jul 2012  -7.6629462  47.27155  -0.6086074
Aug 2012  -1.0551286  49.50673  -3.4516016
Sep 2012   2.2193527  51.74191  -3.9612597
Oct 2012   7.3239448  55.27391  -4.5978509
Nov 2012  18.4285405  58.80591 -13.2344456
Dec 2012  30.5244146  63.70105 -16.2254684

...


My guess is that when you pass a data.frame as the data argument of ts, some extra attributes carry over. Most functions that take a ts object (univariate or otherwise) don't seem to mind, but stl apparently does.

R>  all.equal(tsData2,tsData)
[1] "Attributes: < Names: 1 string mismatch >"                         
[2] "Attributes: < Length mismatch: comparison on first 2 components >"
[3] "Attributes: < Component 2: Numeric: lengths (3, 2) differ >"      
##
R>  str(tsData2)
 Time-Series [1:36] from 2012 to 2015: 22 26 34 33 40 39 39 45 50 58 ...
##
R>  str(tsData)
 'ts' int [1:36, 1] 22 26 34 33 40 39 39 45 50 58 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "index"
 - attr(*, "tsp")= num [1:3] 2012 2015 12

Edit:

Looking into this a little further, I think the problem is the dim attribute (visible in the str output above as [1:36, 1], alongside dimnames) that is carried over from dummyData when it is passed as a whole. Note this excerpt from the body of stl:

if (is.matrix(x)) 
        stop("only univariate series are allowed")

and from the documentation of is.matrix:

is.matrix returns TRUE if x is a vector and has a "dim" attribute of length 2 and FALSE otherwise

So although you are passing stl a series with a single column (the original tsData), as far as the function is concerned, a vector with a length-2 dim attribute (i.e. a matrix, here 36 x 1) is not a univariate series. It seems a little strange to do error handling this way, but I'm sure the author of the function had a good reason for it.
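
The check can be reproduced with the data from the question. The sketch below is one possible way to confirm the diagnosis and to repair a ts object that has already been built from a data frame; the name fixed is illustrative and not from the original posts.

```r
# Data from the question
dummyData <- data.frame(index = c(
  22L, 26L, 34L, 33L, 40L, 39L, 39L, 45L, 50L, 58L, 64L, 78L,
  51L, 60L, 80L, 80L, 93L, 100L, 96L, 108L, 111L, 119L, 140L, 164L,
  103L, 112L, 154L, 135L, 156L, 170L, 146L, 156L, 166L, 176L, 193L, 204L))

tsData <- ts(dummyData, start = c(2012, 1), end = c(2014, 12), frequency = 12)

dim(tsData)        # 36 1
is.matrix(tsData)  # TRUE, which is what triggers the error in stl()

# Strip the matrix shape but keep the time index
fixed <- ts(as.numeric(tsData),
            start = start(tsData),
            frequency = frequency(tsData))

is.matrix(fixed)   # FALSE
fit <- stl(fixed, s.window = "periodic")
```

Rebuilding the series from as.numeric(tsData) drops the dim and dimnames attributes, which is the same effect as passing dummyData$index to ts in the first place.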

nrussell answered Dec 05 '22 11:12