Summarize groups into intervals using dplyr

Tags: r, dplyr

Hi, I have a data frame like this:

d <- data.frame(v1=seq(0,9.9,0.1),
            v2=rnorm(100),
            v3=rnorm(100))

> head(d)
   v1          v2         v3
1 0.0 -0.01431916 -0.5005415
2 0.1 -1.01575590  1.5307473
3 0.2  1.00081065 -0.1730830
4 0.3 -1.20697918  0.5105118
5 0.4 -2.16698578 -1.0120544
6 0.5  0.33886508  0.4797016

I now want a new data frame that summarizes the values of v2 and v3 within the v1 intervals 0-0.99, 1-1.99, 2-2.99, 3-3.99, ..., for example by their mean, like this:

start end mean.v2 mean.v3
    0   1     0.2     0.1
    1   2     0.5     0.4

and so on.

Thanks.

Update: I should add that in my real data set the intervals do not all contain the same number of observations, and the values don't always start at zero or end at 10.

spore234 asked Mar 22 '16 15:03


1 Answer

Here is one way, using cut() as suggested by @akrun:

library(dplyr)

# A single number for breaks splits the range of v1 into that many equal-width bins
d %>% mutate(ints = cut(v1, breaks = 11)) %>%
  group_by(ints) %>%
  summarise(mean.v2 = mean(v2), mean.v3 = mean(v3))
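
If the bins need to fall exactly on 0-1, 1-2, and so on, one option is to pass explicit break points to cut() instead of a bin count. The following is only a minimal sketch, assuming unit-width bins and the column names from the question. The names brks, bin and res are made up for this illustration, and set.seed() is there only to make the example reproducible.

```r
library(dplyr)

set.seed(1)  # assumption: only for a reproducible illustration
d <- data.frame(v1 = seq(0, 9.9, 0.1),
                v2 = rnorm(100),
                v3 = rnorm(100))

# Unit-width break points covering the observed range of v1,
# so the data need not start at 0 or end at 10
brks <- seq(floor(min(d$v1)), ceiling(max(d$v1)), by = 1)

res <- d %>%
  mutate(
    # right = FALSE gives [0,1), [1,2), ...; include.lowest = TRUE closes the last bin
    bin   = cut(v1, breaks = brks, right = FALSE,
                include.lowest = TRUE, labels = FALSE),
    start = brks[bin],      # lower edge of the bin each row falls into
    end   = brks[bin + 1]   # upper edge of that bin
  ) %>%
  group_by(start, end) %>%
  summarise(mean.v2 = mean(v2), mean.v3 = mean(v3), .groups = "drop")

res
```

Bins that contain no observations simply do not appear in the result, and bins with different numbers of observations need no special handling because mean() is computed per group.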
David Heckmann answered Sep 24 '22 13:09