I have a data frame with 1000 rows, and I want to perform an operation on it 100 rows at a time. How can I use a counter over the row numbers to select 100 rows at a time (1 to 100, then 101 to 200, and so on up to 1000) and perform the operation on each subset in a for loop? Can anyone suggest how this can be done? I could not find a good method.
An easy way would be to create a grouping variable, then use split() and lapply() to do whatever operations you need to. The grouping variable can easily be created using rep().
Here is an example:
set.seed(1)
demo = data.frame(A = sample(300, 50, replace=TRUE),
                  B = rnorm(50))
demo$groups = rep(1:5, each=10)
demo.split = split(demo, demo$groups)
lapply(demo.split, colMeans)
# $`1`
# A B groups
# 165.9000000 -0.1530186 1.0000000
#
# $`2`
# A B groups
# 168.2000000 0.1141589 2.0000000
#
# $`3`
# A B groups
# 126.0000000 0.1625241 3.0000000
#
# $`4`
# A B groups
# 159.4000000 0.3340555 4.0000000
#
# $`5`
# A B groups
# 181.8000000 0.0363812 5.0000000
If you prefer not to add the groups to your source data.frame, you can achieve the same effect by doing the following:
groups = rep(1:5, each=10)
lapply(split(demo, groups), colMeans)
Of course, replace colMeans with whatever function you want.
Using your example of a data.frame with 1000 rows, your rep() statement should be something like:
rep(1:10, each=100)
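Since the question asks specifically about a for loop, here is a minimal sketch of the same idea applied to 1000 rows. The data frame `df`, the variable `chunk_size`, and the use of colMeans as the per-chunk operation are placeholders for illustration, not part of the original question. The grouping is built with ceiling(seq_len(nrow(df)) / chunk_size), which gives the same result as the rep() call above when the row count is an exact multiple of 100, and should also cope with a shorter final chunk.

```r
# Hypothetical 1000-row data frame standing in for the real data
set.seed(1)
df <- data.frame(A = sample(300, 1000, replace = TRUE),
                 B = rnorm(1000))

chunk_size <- 100

# Group index: rows 1-100 get 1, rows 101-200 get 2, and so on.
# Unlike rep(1:10, each = 100), this does not require nrow(df)
# to be an exact multiple of chunk_size.
groups <- ceiling(seq_len(nrow(df)) / chunk_size)

# One data frame per block of 100 rows
chunks <- split(df, groups)

# Explicit for loop over the chunks; colMeans is a placeholder operation
results <- vector("list", length(chunks))
for (i in seq_along(chunks)) {
  results[[i]] <- colMeans(chunks[[i]])
}
```

The loop is equivalent to lapply(chunks, colMeans), except that `results` is an unnamed list here; which form to use is mostly a matter of taste.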