
Row wise matrix operations in R

Tags: r, data.table

I recently came across the data.table package, and I'm still not sure how to do row-wise matrix operations with it. Was it originally intended to handle such operations? For example, what would be the data.table equivalent of apply(M, 1, fun)?

fun should take a vector as argument, for example mean, median, or mad.

asked Mar 05 '12 09:03 by danas.zuokas


1 Answer

I think you are looking for the := operator (see ?':='). Below is a short example that compares it with mapply (I hope I'm using mapply correctly; I only use data.table nowadays, so no promises there. Either way, the data.table approach is fast and, in my opinion, easy to remember):

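> library(data.table)
> # foo is not defined in this post; it is assumed to be a vectorised
> # function of two numeric arguments.
> df <- data.frame(ID = 1:1e6,
+                  B  = rnorm(1e6),
+                  C  = rnorm(1e6))
> # Call foo once per row with mapply
> system.time(x <- mapply(foo, df$B, df$C))
   user  system elapsed 
   4.32    0.04    4.38 
> DT <- as.data.table(df)
> # Call foo once on whole columns; := adds column D by reference
> system.time(DT[, D := foo(B, C)])
   user  system elapsed 
   0.02    0.00    0.02 
> all.equal(x, DT[, D])
[1] TRUE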
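Since foo is not defined in the transcript above, here is a small sketch of the same comparison that can be run as-is. The definition of foo is purely a hypothetical stand-in (any vectorised function of two numeric vectors would do), and the table is kept tiny so that the result is easy to inspect.

```r
library(data.table)

# Hypothetical stand-in for the undefined `foo`; assumed to be vectorised.
foo <- function(b, c) b + 2 * c

set.seed(1)
DT <- data.table(ID = 1:5, B = rnorm(5), C = rnorm(5))

# One call of foo per row, as mapply does in the timing comparison.
x <- mapply(foo, DT$B, DT$C)

# A single vectorised call; := adds column D to DT by reference,
# so no reassignment (DT <- ...) is needed.
DT[, D := foo(B, C)]

# Both approaches should give the same values.
all.equal(x, DT$D)
```

The speed difference in the timings probably comes mostly from calling foo once on whole columns rather than once per row, so it only carries over when the function is vectorised.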

After posting my answer, I'm no longer sure this is what you are looking for. I hope it is; if not, please give more details (for instance, do you have many columns you want to apply a function to, not just the two in my example?). In any case, this SO post could be of interest to you.
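If the goal really is apply(M, 1, fun) over many columns, one possible approach (a sketch, not necessarily the intended data.table idiom) is to pass .SD, restricted with .SDcols, to apply inside j. The column names V1 to V4 and the new column names row_median and row_mean are made up for illustration.

```r
library(data.table)

set.seed(42)
# Illustrative 5 x 4 matrix with hypothetical column names V1..V4.
M <- matrix(rnorm(20), nrow = 5, dimnames = list(NULL, paste0("V", 1:4)))
DT <- as.data.table(M)

cols <- paste0("V", 1:4)

# .SD holds the columns named in .SDcols; apply() converts it to a matrix
# and calls median on each row, just like apply(M, 1, median).
DT[, row_median := apply(.SD, 1, median), .SDcols = cols]

# For means or sums, the vectorised rowMeans()/rowSums() avoid the per-row call.
DT[, row_mean := rowMeans(.SD), .SDcols = cols]

# Check against the plain matrix versions.
all.equal(DT$row_median, apply(M, 1, median))
all.equal(DT$row_mean, rowMeans(M))
```

Note that apply still calls the function once per row, so this is a convenience rather than a speed-up; if the data is purely numeric and every operation is row-wise, keeping it as a matrix may be simpler.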

answered Nov 01 '22 22:11 by Christoph_J