R (data.table) group data by custom range (for example, -18, 18-25, ..., 65+)

I can't find a way in R (using data.table) to group data by custom ranges (for example, under 18, 18-25, ..., 65+) rather than by a single value.

What I'm using right now:

DT[, list(M_Savings = mean(Savings), M_Term = mean(Term)), by = Age][order(Age)]

This gives me the following result:

    Age     M_Savings   M_Term
1:  18      6500        5.5 
2:  19      7000        6.2 
3:  20      7200        5.8
...
50: 68      4000        4.2 

Desired result:

    Age     M_Savings   M_Term
1:  18-25   7450        5.5 
2:  25-30   8320        6.2 
...
50: 65+     3862        4.3 

I hope my explanation is clear enough. I would appreciate any kind of help.

Itanium asked Nov 24 '14 14:11
1 Answer

@jdharrison is right: cut(...) is the way to go.

library(data.table)
# create sample - you have this already
set.seed(1)   # for reproducibility
DT <- data.table(age=sample(15:70,1000,replace=TRUE),
                 value=rpois(1000,10))

# you start here...
# cut() bins are right-closed by default, e.g. (18,25] covers ages 19 to 25
breaks <- c(0,18,25,35,45,65,Inf)
DT[,list(mean=mean(value)),by=list(age=cut(age,breaks=breaks))][order(age)]
#         age      mean
# 1:   (0,18] 10.000000
# 2:  (18,25]  9.579365
# 3:  (25,35] 10.158192
# 4:  (35,45]  9.775510
# 5:  (45,65]  9.969697
# 6: (65,Inf] 10.141414
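
If you want the group names to look like the ones in the question (18-25, ..., 65+) instead of interval notation, cut() also takes labels= and right=. Here is a minimal sketch, assuming the columns are called Age, Savings and Term as in the question; the sample data and the label strings are made up for illustration. With right=FALSE the bins are left-closed, so age 18 lands in the second group rather than the first.

```r
library(data.table)

# Hypothetical sample data using the column names from the question
set.seed(42)
DT <- data.table(Age     = sample(15:70, 500, replace = TRUE),
                 Savings = round(runif(500, 1000, 10000)),
                 Term    = round(runif(500, 1, 10), 1))

breaks <- c(0, 18, 25, 35, 45, 65, Inf)
# Assumed label strings, one per bin (length(breaks) - 1 of them)
labels <- c("<18", "18-24", "25-34", "35-44", "45-64", "65+")

# right = FALSE makes the bins left-closed: [18,25) holds ages 18 to 24
res <- DT[, .(M_Savings = mean(Savings), M_Term = mean(Term)),
          by = .(Age = cut(Age, breaks = breaks, labels = labels,
                           right = FALSE))][order(Age)]
print(res)
```

Because cut() returns a factor whose levels follow the order of breaks, order(Age) sorts the groups by range rather than alphabetically.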
jlhoward answered Nov 15 '22 00:11