I have a 58-column data frame, and I need to apply the transformation $\log(x_{i,j}+1)$ to all values in the first 56 columns. What is the most efficient way to do this? I'm assuming there is something that would let me do this without using for loops to run through the entire data frame.
In pandas (Python), the apply() function is used to apply a function along an axis of the DataFrame. Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
Apply any function to every value of an R data frame: you can set the MARGIN argument of apply() to c(1, 2) or, equivalently, to 1:2 to apply the function to each value of the data frame. If you set MARGIN = c(2, 1) instead of c(1, 2), the output will be the same matrix but transposed. The output is of class “matrix” instead of “data.
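As a minimal sketch of the element-wise form described above (the small data frame and the names `res` and `res_df` are made up for illustration), MARGIN = c(1, 2) passes each cell to the function one at a time, and the result comes back as a matrix:

```r
# Hypothetical 3x2 data frame, used only for illustration
df <- data.frame(a = c(0, 1, 2), b = c(3, 4, 5))

# MARGIN = c(1, 2) applies the function to every individual cell
res <- apply(df, c(1, 2), function(x) log(x + 1))

# apply() returns a matrix, so convert back if a data frame is needed
res_df <- as.data.frame(res)
```

For a vectorized function such as log this is slower than operating on the columns directly, so it is mainly useful when the function only accepts a single value.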
The apply() function lets us apply a function to the rows or columns of a matrix or data frame. It takes a matrix or data frame as an argument, along with the function and whether it should be applied by row or by column, and returns the result as a vector, array, or list.
The apply() function takes a data frame or matrix as input and returns a vector, list, or array. In R it is primarily used to avoid explicit loop constructs. It is the most basic member of the apply family and can be used over a matrix. The simplest example is summing a matrix over all of its columns.
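The column-sum example mentioned above could look like the following sketch (the matrix `m` is an invented toy input). MARGIN = 2 means "by column" and MARGIN = 1 means "by row":

```r
# Toy 2x3 matrix, filled column by column: columns are (1,2), (3,4), (5,6)
m <- matrix(1:6, nrow = 2)

# Sum each column
col_sums <- apply(m, 2, sum)
col_sums
# 3 7 11

# Sum each row
row_sums <- apply(m, 1, sum)
row_sums
# 9 12
```

For plain sums, base R's colSums() and rowSums() give the same numbers; apply() is the general form that works with any function.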
alexwhan's answer is right for log (and should probably be selected as the correct answer). However, it works so cleanly because log is vectorized. I have experienced the special pain of non-vectorized functions too frequently. When I started with R, and didn't understand the apply family well, I resorted to ugly loops very often. So, for the purposes of those who might stumble onto this question who do not have vectorized functions I provide the following proof of concept.
    # Create sample data
    df <- as.data.frame(matrix(runif(56 * 56), 56, 56))

    # Write an ugly non-vectorized function (it only uses the first element of x)
    logplusone <- function(x) {log(x[1] + 1)}

    # Example code that achieves the desired result, despite the lack of a vectorized function
    df[, 1:56] <- as.data.frame(lapply(df[, 1:56], FUN = function(x) {sapply(x, FUN = logplusone)}))

    # Proof that the results are the same using both methods...
    # Note: I used all.equal rather than all so that the values are tested using
    # machine tolerance for mathematical equivalence. This is probably a non-issue
    # for the current example, but might be relevant with some other testing functions.
    # Should evaluate to TRUE
    all.equal(log(df[, 1:56] + 1), as.data.frame(lapply(df[, 1:56], FUN = function(x) {sapply(x, FUN = logplusone)})))
You should be able to just refer to the columns you want and do the operation, i.e.:
df[,1:56] <- log(df[,1:56]+1)
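To see the one-liner in action, here is a small self-contained sketch on a made-up 58-column data frame (the 10 rows, the seed, and the names `original` and `cols` are assumptions for illustration). It also compares the result against base R's log1p(), which computes log(1 + x) directly:

```r
# Hypothetical data: 10 rows, 58 numeric columns
set.seed(1)
df <- as.data.frame(matrix(runif(10 * 58), nrow = 10, ncol = 58))
original <- df  # keep a copy to compare against

# Transform only the first 56 columns; log() is vectorized over the whole block
cols <- 1:56
df[, cols] <- log(df[, cols] + 1)

# The same values via log1p(); should evaluate to TRUE
all.equal(as.matrix(df[, cols]), log1p(as.matrix(original[, cols])))

# The last two columns are left untouched; should evaluate to TRUE
identical(df[, 57:58], original[, 57:58])
```

Because the assignment targets `df[, cols]`, the shape and column names of the data frame are preserved.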