
R convert between zoo object and data frame, results inconsistent for different numbers of columns?

Tags: dataframe, r, zoo

I have difficulty switching between data frames and zoo objects, particularly keeping meaningful column names, and inconsistencies between univariate and multivariate cases:

library(zoo)

#sample data, two species counts over time
t = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04"))
n1 = c(4, 5, 9, 7)  #counts of Lepisma saccharina
n2 = c(2, 6, 0, 11) #counts of Thermobia domestica
df = data.frame(t, n1, n2)
colnames(df) <- c("Date", "Lepisma saccharina", "Thermobia domestica")

#converting to zoo loses column names in univariate case...
> z1 <- read.zoo(df[,1:2]) #time series for L. saccharina
> colnames(z1)
NULL
> colnames(z1) <- c("Lepisma saccharina") #can't even set column name manually
Error in `colnames<-`(`*tmp*`, value = "Lepisma saccharina") : 
  attempt to set colnames on object with less than two dimensions
#... but not in multivariate case
> z2 <- read.zoo(df) #time series for both species
> colnames(z2)
[1] "Lepisma saccharina"  "Thermobia domestica"

To go back from a zoo object to a data frame in the original format, it's not enough to use as.data.frame since it won't include a Date column (the dates end up in the rownames): more work is needed.

zooToDf <- function(z) {
    df <- as.data.frame(z) 
    df$Date <- time(z) #create a Date column
    rownames(df) <- NULL #so row names not filled with dates
    df <- df[,c(ncol(df), 1:(ncol(df)-1))] #reorder columns so Date first
    return(df)
}

This works great on the multivariate case, but clearly can't recover a meaningful column name in the univariate case:

> df2b <- zooToDf(z2)
> df2b
        Date Lepisma saccharina Thermobia domestica
1 2012-01-01                  4                   2
2 2012-01-02                  5                   6
3 2012-01-03                  9                   0
4 2012-01-04                  7                  11

> df1b <- zooToDf(z1)
> df1b
        Date z
1 2012-01-01 4
2 2012-01-02 5
3 2012-01-03 9
4 2012-01-04 7

Is there a simple way to handle both univariate and multivariate cases? It seems z1 needs to remember the column name somehow.

Silverfish asked Dec 28 '12 03:12



3 Answers

To convert from data frame to zoo use read.zoo:

library(zoo)
z <- read.zoo(df)

Also note the drop argument, among others, documented in ?read.zoo.

To convert from zoo to data frame, including the index, use fortify.zoo:

fortify.zoo(z, name = "Date")

(If ggplot2 is loaded then you can just use fortify.)
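As a minimal sketch of the round trip for the univariate case, the following rebuilds the question's sample data, reads the first series with drop = FALSE so that it keeps its column name, and converts it back with fortify.zoo. It spells the argument as names, on the assumption that the name = "Date" call above reaches the same argument through R's partial argument matching.

```r
library(zoo)

# Sample data from the question; check.names = FALSE keeps the spaces in the names
df <- data.frame(
  Date = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04")),
  "Lepisma saccharina" = c(4, 5, 9, 7),
  "Thermobia domestica" = c(2, 6, 0, 11),
  check.names = FALSE
)

# drop = FALSE keeps the single series as a one-column zoo matrix
z1 <- read.zoo(df[, 1:2], drop = FALSE)

# Convert back to a data frame with the index as a "Date" column
df1 <- fortify.zoo(z1, names = "Date")
print(names(df1))
print(df1)
```

Without drop = FALSE, z1 has no dimensions and therefore no column name to carry through the conversion.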

As mentioned in the comments below the question, the question and some of the other answers are either outdated or rest on significant misunderstandings. I suggest reviewing https://cran.r-project.org/web/packages/zoo/vignettes/zoo-design.pdf, which discusses the design philosophy of zoo, including consistency with R itself. zoo would be a lot harder to use if you had to remember one set of defaults for R and another for zoo.

G. Grothendieck answered Nov 03 '22 23:11



If you don't want to drop dimensions, use drop=FALSE:

R> (z1 <- read.zoo(df[,1:2], drop=FALSE))
           Lepisma saccharina
2012-01-01                  4
2012-01-02                  5
2012-01-03                  9
2012-01-04                  7

You can do something like write.zoo if you want to include the zoo index as a column in your data.frame:

zoo.to.data.frame <- function(x, index.name="Date") {
  stopifnot(is.zoo(x))
  xn <- if(is.null(dim(x))) deparse(substitute(x)) else colnames(x)
  setNames(data.frame(index(x), x, row.names=NULL), c(index.name,xn))
}
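A variant of the same idea, offered only as a sketch: instead of taking the name of a dimensionless series from the calling expression, the caller passes it explicitly. The function name zoo_to_df and the parameters index_name and value_name are hypothetical and not part of zoo.

```r
library(zoo)

# Hypothetical helper: handles both dimensionless and matrix-like zoo objects
zoo_to_df <- function(z, index_name = "Date", value_name = "value") {
  stopifnot(is.zoo(z))
  values <- coredata(z)
  if (is.null(dim(values))) {
    # A dimensionless series carries no column name, so attach the supplied one
    values <- matrix(values, ncol = 1, dimnames = list(NULL, value_name))
  }
  out <- data.frame(index(z), values, check.names = FALSE)
  names(out)[1] <- index_name
  out
}

dates <- as.Date("2012-01-01") + 0:3
z_uni <- zoo(c(4, 5, 9, 7), dates)
z_multi <- zoo(cbind(a = c(4, 5, 9, 7), b = c(2, 6, 0, 11)), dates)

print(zoo_to_df(z_uni, value_name = "Lepisma saccharina"))
print(zoo_to_df(z_multi))
```

Passing the name explicitly avoids depending on what the object happens to be called at the call site, at the cost of one extra argument.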

UPDATE:

After trying to edit your question for brevity, I thought of an easy way to create df2b to your specifications (this will also work for z1 if you don't drop dimensions):

R> (df2b <- data.frame(Date=time(z2), z2, check.names=FALSE, row.names=NULL))
        Date Lepisma saccharina Thermobia domestica
1 2012-01-01                  4                   2
2 2012-01-02                  5                   6
3 2012-01-03                  9                   0
4 2012-01-04                  7                  11
Joshua Ulrich answered Nov 03 '22 21:11



There's a newer simple solution to this using the timetk package. It will convert several time series formats, including xts and zoo, to tibbles. Simply wrap in as.data.frame to get a data frame.

timetk::tk_tbl(zoo::read.zoo(df))
# A tibble: 4 x 3
  index      `Lepisma saccharina` `Thermobia domestica`
  <date>                    <dbl>                 <dbl>
1 2012-01-01                    4                     2
2 2012-01-02                    5                     6
3 2012-01-03                    9                     0
4 2012-01-04                    7                    11
hmhensen answered Nov 03 '22 22:11
