Adding a column to an xts object is straightforward if you know the name of the column ahead of time. For example, to add a column named "b":
library(xts)
n <- 5
x <- merge(xts(order.by = as.Date('2015-1-1') + 1:n), a = rnorm(n))
x$b <- rnorm(n)
Adding a dynamically-named column (i.e., a column whose name is known only at runtime) is harder:
new.col.name <- 'c' # known only at runtime
x[, new.col.name] <- rnorm(n) # this generates an error
One approach is to add a column with a temporary name and then rename it:
stopifnot(!('tmp' %in% names(x)))
x$tmp <- rnorm(n)
names(x)[names(x) == 'tmp'] <- new.col.name
Is there a better way to do this? (Also, does assigning to names of an xts object result in a copy of the object being made? So, for example, would the above approach work well if n were very large?)
Because xts objects are arrays, getting the apply functions to work is a little tricky if you want to preserve the dates. For example, take an xts object xx and say we wish to apply a function to each column (to keep it simple, say we wish to add 100 to each element of each column). Doing this with sapply loses the row names.
One option is vapply. For an xts object xz whose columns are all numeric, the function returns a numeric vector of length nrow(xz) for each column of xz. Unfortunately, vapply does not preserve the dates of the xts object.
To get the dates back, you can generate a new object based on xz and replace all its values with the matrix returned by vapply: the function "[<-" copies the xz object and replaces all its values. Alternatively, you can create a new xts object based on the matrix returned by vapply and the time information in your original xts object. Of the approaches compared, the one with vapply and "[<-" is the fastest.
One important note: if the function you want to apply to each column is a mathematical operation, you can apply it to the whole xts object at once, e.g., xz + 100.
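A minimal sketch of the rebuild-from-matrix variant described above, assuming a small made-up object xz with two numeric columns (the names core, res, and out and the data are illustrative, not from the original post). It iterates over the column positions of coredata(xz) rather than over the xts object itself, then reattaches the original index:

```r
library(xts)

# Illustrative data: 3 dates, 2 numeric columns (values are made up).
xz <- xts(matrix(c(1, 2, 3, 4, 5, 6), ncol = 2,
                 dimnames = list(NULL, c("a", "b"))),
          order.by = as.Date("2015-01-01") + 0:2)

core <- coredata(xz)  # plain numeric matrix, no dates

# Apply the function to each column; vapply checks that every result
# is a numeric vector of length nrow(core).
res <- vapply(seq_len(ncol(core)),
              function(j) core[, j] + 100,
              FUN.VALUE = numeric(nrow(core)))
colnames(res) <- colnames(xz)

# Rebuild an xts object from the matrix and the original time index.
out <- xts(res, order.by = index(xz))
```

For a plain arithmetic operation like this one, xz + 100 gives the same values directly; the vapply route is only worth it when the per-column function is not vectorised over the whole object.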
The easiest/clearest thing to do is merge the original object with the new column(s), after you convert the new column(s) to a matrix (so you can set the column name).
set.seed(21)
newData <- rnorm(n)
x1 <- merge(x, matrix(newData, ncol=1, dimnames=list(NULL, new.col.name)))
# another way to do the same thing
dim(newData) <- c(nrow(x), 1)
colnames(newData) <- new.col.name
x2 <- merge(x, newData)
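If this comes up often, the merge approach can be wrapped in a small helper. The sketch below is only an illustration: add_named_column is a hypothetical name, not part of xts, and the input checks are assumptions about what a caller would want.

```r
library(xts)

# Hypothetical helper (not part of xts): append one column whose name
# is only known at runtime, using the merge-with-a-matrix approach.
add_named_column <- function(x, values, name) {
  stopifnot(is.character(name), length(name) == 1L,
            length(values) == nrow(x),
            !(name %in% colnames(x)))  # refuse to clash with an existing column
  merge(x, matrix(values, ncol = 1, dimnames = list(NULL, name)))
}

n <- 5
x <- xts(matrix(rnorm(n), ncol = 1, dimnames = list(NULL, "a")),
         order.by = as.Date("2015-01-01") + 1:n)

new.col.name <- "c"                  # known only at runtime
new.vals <- as.numeric(seq_len(n)) * 10
y <- add_named_column(x, new.vals, new.col.name)
colnames(y)
```

The helper returns a new object rather than modifying x in place, so the caller decides whether to overwrite the original.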
To answer your second question: yes, assigning names (and colnames) on an xts object creates a copy. You can see it does by using tracemem and the output from gc.
> R -q # new R session
R> x <- xts::.xts(1:1e6, 1:1e6)
R> tracemem(x)
[1] "<0x2892400>"
R> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  259260 13.9     592000 31.7   350000 18.7
Vcells 1445207 11.1    4403055 33.6  3445276 26.3
R> colnames(x) <- "hi"
tracemem[0x2892400 -> 0x24c1ad0]:
tracemem[0x24c1ad0 -> 0x2c62d30]: colnames<-
tracemem[0x2c62d30 -> 0x3033660]: dimnames<-.xts dimnames<- colnames<-
tracemem[0x3033660 -> 0x3403f90]: dimnames<-.xts dimnames<- colnames<-
tracemem[0x3403f90 -> 0x37d48c0]: colnames<- dimnames<-.xts dimnames<- colnames<-
tracemem[0x37d48c0 -> 0x3033660]: dimnames<-.xts dimnames<- colnames<-
R> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  259696 13.9     592000 31.7   350000 18.7
Vcells 1445750 11.1    4403055 33.6  3949359 30.2
R> print(object.size(x), units="Mb")
7.6 Mb
You can see the colnames<- call causes ~4MB of extra memory to be used (the "max used (Mb)" increased by that amount). The entire xts object is ~8MB, half of which is the coredata and the other half is the index. So the 4MB of extra memory used is to copy the coredata.
If you want to avoid the copy, you can set the dimnames attribute manually. But be careful, because you could do something that would otherwise be caught by the "checks" in colnames<-.xts.
> R -q # new R session
R> x <- xts::.xts(1:1e6, 1:1e6)
R> tracemem(x)
[1] "<0x2cc5330>"
R> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  256397 13.7     592000 31.7   350000 18.7
Vcells 1440915 11.0    4397699 33.6  3441761 26.3
R> attr(x, 'dimnames') <- list(NULL, "hi")
tracemem[0x2cc5330 -> 0x28f4a00]:
R> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  256403 13.7     592000 31.7   350000 18.7
Vcells 1440916 11.0    4397699 33.6  3441761 26.3
R> print(object.size(x), units="Mb")
7.6 Mb
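Since the manual assignment skips the usual validation, it may be worth doing a few checks by hand first. The sketch below assumes the relevant checks are "one non-missing character name per column"; that is a guess at a sensible minimum, not a description of what colnames<-.xts actually verifies. The object and the variable new.names are illustrative.

```r
library(xts)

# Illustrative one-column object (values are made up).
x <- xts(matrix(1:10, ncol = 1), order.by = as.Date("2015-01-01") + 0:9)
new.names <- "hi"

# Assumed minimal checks before bypassing the replacement function:
# one non-missing character name per column.
stopifnot(is.character(new.names),
          length(new.names) == NCOL(x),
          !anyNA(new.names))

# Set the attribute directly; the first list element (row names) stays NULL.
attr(x, "dimnames") <- list(NULL, new.names)
colnames(x)
```

This is done at top level on purpose: wrapping the assignment in a function that takes x as an argument would typically force R to copy the object anyway, which defeats the point.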