
Proper way to loop over the length of a dataframe in R

Tags:

loops

r

After quite a bit of debugging today, to my dismay I found that:

for (i in 1:0) {
     print(i)
}

actually prints 1 and then 0 in R. The problem came up when writing:

for (i in 1:nrow(myframe)) {
     fn(i)
}

which I had intended not to execute at all if nrow(myframe) == 0. Is the proper correction just:

if (nrow(myframe) != 0) {
    for (i in 1:nrow(myframe)) {
        fn(i)
    }
}

Or is there a more proper way to do what I wanted in R?

mt88 asked Jul 23 '14 17:07


4 Answers

You can use seq_along instead:

vec <- numeric() 
length(vec)
#[1] 0

for(i in seq_along(vec)) print(i)   # doesn't print anything

vec <- 1:5

for(i in seq_along(vec)) print(i)
#[1] 1
#[1] 2
#[1] 3
#[1] 4
#[1] 5

Edit after OP update

df <- data.frame(a = numeric(), b = numeric())
df
#[1] a b
#<0 rows> (or 0-length row.names)

for(i in seq_len(nrow(df))) print(i)    # doesn't print anything

df <- data.frame(a = 1:3, b = 5:7)

for(i in seq_len(nrow(df))) print(i)
#[1] 1
#[1] 2
#[1] 3

talat answered Nov 10 '22 00:11


Clearly all previous answers do the job.

I like to have something like this:

rows_along <- function(df) seq_len(nrow(df))

and then

for(i in rows_along(df)) # do stuff

This is a totally idiosyncratic answer, since it is just a wrapper, but I think it is more readable and intuitive.
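
With any wrapper like this, it is worth checking how it behaves on an empty data frame, since that is the case the question is about. Here is a minimal sketch, assuming a hypothetical helper name (rows_along_checked) and made-up data frames. It also shows that seq() called with a single 0 counts down just like 1:0 does:

```r
# Hypothetical helper name, chosen only for this sketch.
rows_along_checked <- function(df) seq_len(nrow(df))

# Made-up example data: one empty data frame, one with three rows.
empty <- data.frame(a = numeric(), b = numeric())
full  <- data.frame(a = 1:3, b = 5:7)

length(rows_along_checked(empty))  # 0, so a for loop body never runs
rows_along_checked(full)           # 1 2 3

# For comparison: seq() with a single argument of 0 behaves like 1:0.
seq(0)                             # 1 0
```

So a wrapper built on seq_len skips the loop for zero rows, whereas one built on seq(n) or 1:n would not.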

Abel Borges answered Nov 09 '22 23:11


Regarding the edit, see the counterpart function seq_len, as in seq_len(NROW(myframe)). This is exactly why you shouldn't use 1:N in a for() loop: if whatever value ends up replacing N is 0 or negative, the sequence counts down instead of being empty.
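
To make the difference concrete, here is a small sketch that counts how many times a loop body runs for each way of building the index. The helper n_iterations is a hypothetical name introduced only for this illustration:

```r
# Hypothetical helper: counts how many times a for loop over `index` runs.
n_iterations <- function(index) {
  count <- 0L
  for (i in index) count <- count + 1L
  count
}

n_iterations(1:3)         # 3
n_iterations(seq_len(3))  # 3

n_iterations(1:0)         # 2, because 1:0 is c(1, 0)
n_iterations(seq_len(0))  # 0, because seq_len(0) is integer(0)

n_iterations(1:-1)        # 3, because 1:-1 is c(1, 0, -1)

# seq_len() signals an error for a negative length instead of counting down.
negative_result <- tryCatch(seq_len(-1), error = function(e) "error")
negative_result           # "error"
```

With seq_len, a zero count gives zero iterations and a negative count fails loudly, rather than silently looping over unexpected indices.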

An alternative (which just hides the loop) is apply(myframe, 1, FUN = foo), where foo is a function containing whatever you want to do to each row of myframe; its body can probably just be copied from the body of the loop.
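
As a rough sketch of the apply approach, assuming a made-up data frame and a hypothetical row function foo that adds two columns:

```r
# Made-up example data and a hypothetical row function, for illustration only.
myframe <- data.frame(a = 1:3, b = 5:7)

# apply() converts the data frame to a matrix first, so each `row` arrives
# as a named vector (here with names "a" and "b"), not as a data frame.
foo <- function(row) row[["a"]] + row[["b"]]

result <- apply(myframe, 1, FUN = foo)
result  # 6 8 10
```

Because of the conversion to a matrix, columns of mixed types are coerced to a common type before foo sees them, so this fits best when all columns share a type. How a given foo behaves on a zero-row data frame is worth checking separately.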

Gavin Simpson answered Nov 10 '22 00:11


For vectors there is seq_along; for data frames you can use seq_len on the row count:

for(i in seq_len(nrow(the.table))){
    do.stuff()
}

Boris Gorelik answered Nov 10 '22 00:11