R Apply() function on specific dataframe columns

Tags: dataframe, r, apply

I want to use the apply function on a dataframe, but only apply the function to the last 5 columns.

B <- by(wifi, wifi$Room, FUN = function(y) apply(y, 2, A))

This applies A to all the columns of y

B <- by(wifi, wifi$Room, FUN = function(y) apply(y[4:9], 2, A))

This applies A only to columns 4-9 of y, but the total return of B strips off the first 3 columns... I still want those, I just don't want A applied to them.

wifi[,1:3]+B  

also does not do what I expected/wanted.

skmathur asked Aug 29 '13 05:08



1 Answer

lapply is probably a better choice than apply here, because apply first coerces your data.frame to a matrix (via as.matrix), which forces all the columns to a common type. Depending on your context, this could have unintended consequences, such as numeric columns being turned into character.
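To make the coercion concrete, here is a minimal sketch on made-up data (the df below and its column names are hypothetical, not the asker's wifi data). It compares the class each function sees for every column:

```r
# Hypothetical data: one character column and one numeric column.
df <- data.frame(room = c("a", "b"),
                 signal = c(-40, -55),
                 stringsAsFactors = FALSE)

# apply() converts df with as.matrix() first, so the function
# receives every column as a character vector.
apply_classes <- apply(df, 2, class)

# lapply()/sapply() visit each column as it is stored, so the
# original types are preserved.
lapply_classes <- sapply(df, class)

print(apply_classes)
print(lapply_classes)
```

With mixed column types like these, any numeric work inside the function passed to apply would be operating on strings.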

The pattern is:

df[cols] <- lapply(df[cols], FUN) 

The 'cols' vector can be variable names or indices. I prefer to use names whenever possible (it's robust to column reordering). So in your case this might be:

wifi[4:9] <- lapply(wifi[4:9], A) 

An example of using column names:

wifi <- data.frame(A=1:4, B=runif(4), C=5:8)
wifi[c("B", "C")] <- lapply(wifi[c("B", "C")], function(x) -1 * x)
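If the function also has to be applied separately within each Room, as the by() call in the question suggests, the same assignment pattern can be combined with ave(). This is only a sketch under assumptions: the data and the column names X, S1 and S2 are invented, and A is a stand-in that is assumed to return a vector the same length as its input (the real A is not shown in the question).

```r
# Hypothetical stand-ins for the asker's data and function.
wifi <- data.frame(
  Room = c("r1", "r1", "r2", "r2"),
  X    = 1:4,
  S1   = c(10, 20, 30, 50),
  S2   = c(1, 3, 5, 9)
)
A <- function(x) x - mean(x)  # assumed: output length equals input length

cols <- c("S1", "S2")

# ave() runs A within each Room and returns the results in the
# original row order, so they can be assigned straight back.
wifi[cols] <- lapply(wifi[cols], function(col) ave(col, wifi$Room, FUN = A))

print(wifi)
```

The Room and X columns are left untouched because only wifi[cols] is on the left-hand side of the assignment. If A instead returns a single summary value per group, ave() would recycle it across the group's rows, so a different approach such as aggregate() may fit better.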
leif answered Sep 29 '22 09:09