I have a data frame myDF created like this:
a <- 1:4
b <- seq(3, 16, length=4)
myDF <- data.frame(a=a, b=b)
which looks like this:
  a         b
1 1  3.000000
2 2  7.333333
3 3 11.666667
4 4 16.000000
Now I want to divide each element by its predecessor in each column, add the results to the existing data frame, fill the one missing value in each column with NA, and add new column names. For the example above, my desired outcome looks like this:
  a         b     amod     bmod
1 1  3.000000       NA       NA
2 2  7.333333 2.000000 2.444444
3 3 11.666667 1.500000 1.590909
4 4 16.000000 1.333333 1.371429
So, in column a, 2 is divided by 1, 3 is divided by 2, and 4 is divided by 3, and the results are stored in amod.
The way I do it now is like this:
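divStuff <- function(aCol) {
  # Divide each element by its predecessor: drop the first element for the
  # numerator and the last element for the denominator.
  newCol <- aCol[-1] / aCol[-length(aCol)]
  # The first element has no predecessor, so pad with NA.
  newCol <- c(NA, newCol)
  return(newCol)
}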
newDF <- data.frame(lapply(myDF, divStuff))
names(newDF) <- paste(names(myDF), "mod", sep="")
endDF <- cbind(myDF, newDF)
I wrote a function divStuff which does the division and then call lapply, which applies this function to each column of the data frame.
Now I am wondering whether this is the way to do it or whether there is a smarter way of doing this kind of operation, one which would e.g. avoid the cbind call, or do the cbind in a way that avoids the line newCol <- c(NA, newCol) by adding an NA automatically. I did not find a nice way; all the solutions I found look similar to this one.
Here's a quick data.table version (using the devel version on GitHub):
library(data.table) ## V 1.9.5
setDT(myDF)[, paste0(names(myDF), "mod") := lapply(.SD, function(x) x/shift(x))]
#    a         b     amod     bmod
# 1: 1  3.000000       NA       NA
# 2: 2  7.333333 2.000000 2.444444
# 3: 3 11.666667 1.500000 1.590909
# 4: 4 16.000000 1.333333 1.371429
Or similarly with dplyr, though you may want to play around with the column names (this is due to a bug(?) in mutate_each: when given a single function, it drops the original columns and doesn't rename the resulting ones):
library(dplyr)
myDF %>%
  mutate_each(funs(./lag(.))) %>%
  cbind(myDF, .)
#   a         b        a        b
# 1 1  3.000000       NA       NA
# 2 2  7.333333 2.000000 2.444444
# 3 3 11.666667 1.500000 1.590909
# 4 4 16.000000 1.333333 1.371429
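For readers on a newer dplyr, where mutate_each and funs are deprecated, the same idea can be sketched with across. This is only a sketch and assumes dplyr >= 1.0.0. The .names argument appends the "mod" suffix while keeping the original columns, which should sidestep both the renaming issue and the cbind call. The result name res is just an illustrative choice.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, which provides across()

# Same example data as in the question
myDF <- data.frame(a = 1:4, b = seq(3, 16, length.out = 4))

# Divide every column by its lagged self; lag() supplies the leading NA,
# and .names keeps the originals while adding suffixed copies.
res <- myDF %>%
  mutate(across(everything(), ~ .x / lag(.x), .names = "{.col}mod"))

print(res)
```

The resulting columns should be a, b, amod and bmod, in that order.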
With base R:
myDF[,paste0(names(myDF), "mod")] <- sapply(myDF, function(x) c(NA, x[-1]/head(x,-1)))
#   a         b     amod     bmod
# 1 1  3.000000       NA       NA
# 2 2  7.333333 2.000000 2.444444
# 3 3 11.666667 1.500000 1.590909
# 4 4 16.000000 1.333333 1.371429
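A possible base R variant, shown here only as a sketch, gets the NA "automatically" in the sense the question asks about: instead of computing the ratios and then prepending NA, each column is divided by a copy of itself shifted down by one position. The shifted copy starts with NA, so the first ratio is NA by ordinary NA propagation. Using lapply rather than sapply keeps the result as a list of columns instead of a matrix; the helper name ratioToPrev is hypothetical.

```r
# Same example data as in the question
myDF <- data.frame(a = 1:4, b = seq(3, 16, length.out = 4))

# Hypothetical helper: divide x by its predecessor.
# c(NA, head(x, -1)) is x shifted down one position, so the first
# quotient is x[1] / NA, which is NA.
ratioToPrev <- function(x) x / c(NA, head(x, -1))

# Assign the new columns in place; no cbind needed.
myDF[paste0(names(myDF), "mod")] <- lapply(myDF, ratioToPrev)

print(myDF)
```

The padding still happens, just on the denominator rather than on the result, so this is a matter of taste rather than a fundamentally different method.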