
How to print the name of current row when using apply in R?

Tags:

r

statistics

For example, I have a matrix k

> k
  d e
a 1 3
b 2 4

I want to apply a function on k

> apply(k,MARGIN=1,function(p) {p+1})
a b
d 2 3
e 4 5

However, I also want to print the name of the row currently being processed, so that I know which row the function is being applied to at that moment.

It might look like this:

apply(k,MARGIN=1,function(p) {print(rowname(p)); p+1})

But I really don't know how to do that in R. Does anyone have any ideas?

asked Jun 08 '12 22:06 by Hanfei Sun



2 Answers

Here's a neat solution to what I think you're asking. (I've called the input matrix mat rather than k for clarity - in this example, mat has 2 columns and 10 rows, and the rows are named abc1 through to abc10.)

In the code below, the result out1 is the thing you wanted to calculate (the outcome of the apply command). The result out2 is identical to out1, except that the call which produces it prints the name of each row as it works on it. I put in a delay of 0.3 seconds per row so you can see it really does do this; take the delay out when you want the code to run at full speed.

The trick I came up with was to cbind the row numbers (1 to n) onto the left of mat (to create a matrix with one additional column), and then use this to refer back to the rownames of mat. Note the line x = y[-1] which means that the actual calculation within the function (here, adding 1) ignores the first column of row numbers, which means it's the same as the calculation done for out1. Whatever sort of calculation you want to perform on the rows can be done this way - just pretend that y never existed, and formulate your desired calculation using x. Hope this helps.

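set.seed(1234)
# Example data: 10 rows, 2 columns, rows named abc1 ... abc10
mat = as.matrix(data.frame(x = rpois(10, 4), y = rpois(10, 4)))
rownames(mat) = paste("abc", 1:nrow(mat), sep = "")

# Plain version: no progress output
out1 = apply(mat, 1, function(x) {x + 1})

# Verbose version: the row numbers are prepended as an extra first column
out2 = apply(cbind(seq_len(nrow(mat)), mat), 1,
             function(y) {
               x = y[-1]                                     # the original row
               cat("Doing row:", rownames(mat)[y[1]], "\n")  # look up its name
               Sys.sleep(0.3)                                # demo delay only
               x + 1
             }
            )

identical(out1, out2)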
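
As a quick check, the same trick can be tried on the small matrix k from the question. This is only a sketch: k is rebuilt here by hand from the output printed in the question, and the helper name add_one_verbose is made up for illustration.

```r
# Rebuild the question's matrix k (values taken from its printed output).
k <- matrix(c(1, 2, 3, 4), nrow = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

# Hypothetical helper: y[1] is the prepended row number, y[-1] is the row itself.
add_one_verbose <- function(y) {
  cat("Doing row:", rownames(k)[y[1]], "\n")
  y[-1] + 1
}

plain   <- apply(k, 1, function(p) p + 1)
verbose <- apply(cbind(seq_len(nrow(k)), k), 1, add_one_verbose)

# Compare the verbose result with the plain apply() result.
all.equal(plain, verbose)
```

Note that cbind only works cleanly here because the matrix is numeric; with a character matrix the row numbers would be converted to strings and would need as.integer() before being used as an index.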
answered Sep 28 '22 05:09 by Tim P


You can use a variable outside of the apply call to keep track of the row index and pass the row names as an extra argument to your function:

idx <- 1
apply(k, 1, function(p, rn) {print(rn[idx]); idx <<- idx + 1; p + 1}, rownames(k))
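
If you would rather not modify a variable outside the function with <<-, one alternative is to loop over the row names themselves and index into the matrix. This is a sketch rather than part of the answer above; it assumes k has unique, non-NULL row names, and k is rebuilt here from the question's printed output.

```r
# Rebuild the question's matrix k (values taken from its printed output).
k <- matrix(c(1, 2, 3, 4), nrow = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

# Iterate over the row names, so the name is available without a counter.
res <- sapply(rownames(k), function(rn) {
  print(rn)      # name of the row being processed
  k[rn, ] + 1    # the actual per-row calculation
})

res
```

Like apply with MARGIN = 1, sapply returns one column per input row, so res has the same layout as the output shown in the question.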
answered Sep 28 '22 05:09 by ALiX