Elegant way to loop over chunks with remainder in R?

Question

I'm looking for some way to iterate over chunks in R, but right now I'm having to add an additional statement at the end to capture the remainder if the number of items does not divide evenly into chunksize. For example:

for (i in 1:(nrow(dataframe)/chunksize)){
  (do something with chunk)
}

remainder <- nrow(dataframe) %% chunksize
(do something with dataframe[(nrow(dataframe)-remainder+1):nrow(dataframe),])

Is there a more elegant way to do this? I'm assuming this type of operation is done very often in other code.

hrbrmstr · Accepted Answer

If you really want to keep the for construct:

chunk_size <- 7
for (i in seq(1, nrow(mtcars), chunk_size)) {

  seq_size <- chunk_size
  if ((i + seq_size) > nrow(mtcars)) seq_size <- nrow(mtcars) - i + 1

  cat(i, seq_size, "\n")

}

1 7 
8 7 
15 7 
22 7 
29 4

You can use i and seq_size to work out the row indices of each chunk.
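As a minimal sketch of turning those values into actual row subsets, the loop below clamps the end index with min(), so the shorter final chunk needs no special case. The names `from`, `to`, and `chunk_rows` are illustrative and not part of the original answer, and the sketch assumes the data frame has at least one row.

```r
chunk_size <- 7
n <- nrow(mtcars)
chunk_rows <- integer(0)  # collects each chunk's row count, for illustration only

for (from in seq(1, n, by = chunk_size)) {
  to <- min(from + chunk_size - 1, n)  # clamp the last chunk to the final row
  chunk <- mtcars[from:to, ]           # the rows belonging to this chunk
  chunk_rows <- c(chunk_rows, nrow(chunk))
}

chunk_rows
#> [1] 7 7 7 7 4
```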

Here's one without the if:

chunk_size <- 7
chunks <- ggplot2::cut_interval(1:nrow(mtcars), length=chunk_size, labels=FALSE)
for (i in unique(chunks)) {
  print(nrow(mtcars[which(chunks==i),]))
}
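If loading ggplot2 only for the grouping step feels heavy, a base R sketch of the same idea is to compute each row's chunk number with plain arithmetic. This is a swapped-in technique, not what the answer itself uses, and `chunk_id` is an illustrative name.

```r
chunk_size <- 7
# Rows 1..7 get id 1, rows 8..14 get id 2, and so on;
# the last id covers whatever remainder is left.
chunk_id <- ceiling(seq_len(nrow(mtcars)) / chunk_size)

for (i in unique(chunk_id)) {
  chunk <- mtcars[chunk_id == i, ]  # all rows of chunk i
  print(nrow(chunk))
}
```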

Colonel Beauvel · Answer

You can use split, forming groups of at most chunksize rows with cumsum and modulo:

n = chunksize
lst = split(df, cumsum((1:nrow(df)-1)%%n==0))

lapply(lst, function(df_)
{
    #some code on df_
})

Example:

df = data.frame(col1=letters[1:10])
n = 3  #you want small dataframes of 3 rows

#> split(df, cumsum((1:nrow(df)-1)%%n==0))
#$`1`
#  col1
#1    a
#2    b
#3    c

#$`2`
#  col1
#4    d
#5    e
#6    f

#$`3`
#  col1
#7    g
#8    h
#9    i

#$`4`
#   col1
#10    j
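
To see why this grouping works, it can help to look at the intermediate vectors for the 10-row example. The sketch below only inspects values the answer already computes; `is_start` and `group` are illustrative names.

```r
df <- data.frame(col1 = letters[1:10])
n <- 3

# TRUE marks the first row of each chunk: rows 1, 4, 7 and 10.
is_start <- (seq_len(nrow(df)) - 1) %% n == 0

# The running count of chunk starts gives every row its chunk number.
group <- cumsum(is_start)
group
#> [1] 1 1 1 2 2 2 3 3 3 4

# Row count per chunk; only the last chunk is shorter than n.
chunk_sizes <- sapply(split(df, group), nrow)
```

Because the grouping vector is built from row positions alone, the remainder lands in the last group without any extra code.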

Tags:

r

Allen Wang
