
R: Getting the sum of columns in a data.frame grouped by a certain column

I have a sample data.frame, shown below. I want to create another data.frame that contains summary statistics of that table, grouped by a certain column. How can I do that?

For example, in the data.frame below, I would like to get the sum of each column by Chart.

Sample data.frame:

Chart    Sum     Sum_Squares    Count     Average
Chart1   2           4            4         1
Chart1   3           9            3         1.5
Chart2   4           16           5         2
Chart2   5           25           2         2.5

Desired output:

Chart    Sum_sum      Sum_square_sum      Count_sum      Average_sum
Chart1      5              13                 7              2.5
Chart2      9              41                 7              4.5

I tried the code below, but the returned table contains only Chart and V1. sum_stat is the data.frame.

  sum_stat = data.table(spc_point[,c("CHART", "SUM", "SUM_SQUARES", "COUNT", "AVERAGE")])[,c(SUM_SUM=sum(SUM), SUM_SQUARE_SUM=sum(SUM_SQUARES), COUNT_SUM=sum(COUNT), AVERAGE_SUM=sum(AVERAGE)),by=list(CHART)]

Thanks in advance.

Ianthe asked Dec 20 '22 17:12


1 Answer

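I'm going to advocate using data.table. Try this:

library(data.table)  # provides data.table()

# Rebuild the sample data from the question, keyed by the grouping column
data <- data.table("Chart" = c("Chart1", "Chart1", "Chart2", "Chart2"), "Sum" = c(2, 3, 4, 5), "Sum_Squares" = c(4, 9, 16, 25), "Count" = c(4, 3, 5, 2), "Average" = c(1, 1.5, 2, 2.5), key = "Chart")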

and then simply:

summed.data<-data[,lapply(.SD,sum),by=Chart]
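The grouped call keeps the original column names. To get names closer to the desired output in the question, the summed columns can be renamed afterwards. This is a minimal sketch: it rebuilds the sample table so it runs on its own, `value_cols` is a name introduced here for illustration, and the `_sum` suffix yields `Sum_Squares_sum` rather than the question's `Sum_square_sum`.

```r
library(data.table)

# Sample data from the question, keyed by the grouping column
data <- data.table(
  Chart = c("Chart1", "Chart1", "Chart2", "Chart2"),
  Sum = c(2, 3, 4, 5),
  Sum_Squares = c(4, 9, 16, 25),
  Count = c(4, 3, 5, 2),
  Average = c(1, 1.5, 2, 2.5),
  key = "Chart"
)

# value_cols (illustrative name): every column except the grouping column
value_cols <- setdiff(names(data), "Chart")

# Sum each value column within each Chart group
summed.data <- data[, lapply(.SD, sum), by = Chart, .SDcols = value_cols]

# Append "_sum" to the summed columns (setnames modifies summed.data by reference)
setnames(summed.data, value_cols, paste0(value_cols, "_sum"))

print(summed.data)
```

Passing `.SDcols` explicitly is optional here, since `.SD` already excludes the `by` column, but it makes the set of summed columns visible at the call site.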

Find the data.table package, read its vignettes and FAQ, and use it :)
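As for why the attempt in the question returned only Chart and V1: inside `[...]`, data.table builds one output column per element of a list. `c(...)` returns a single atomic vector, so the four sums are stacked into one unnamed column, V1. Wrapping the expressions in `list(...)` instead gives one named column each. The sketch below assumes `spc_point` looks like the question's sample data with the upper-case column names used in the question's code; it is a stand-in, not the asker's real table.

```r
library(data.table)

# Stand-in for the asker's spc_point, built from the question's sample rows
spc_point <- data.frame(
  CHART = c("Chart1", "Chart1", "Chart2", "Chart2"),
  SUM = c(2, 3, 4, 5),
  SUM_SQUARES = c(4, 9, 16, 25),
  COUNT = c(4, 3, 5, 2),
  AVERAGE = c(1, 1.5, 2, 2.5)
)

# list(...) instead of c(...): each named element becomes its own column
sum_stat <- as.data.table(spc_point)[
  , list(
      SUM_SUM = sum(SUM),
      SUM_SQUARE_SUM = sum(SUM_SQUARES),
      COUNT_SUM = sum(COUNT),
      AVERAGE_SUM = sum(AVERAGE)
    ),
  by = CHART
]

print(sum_stat)
```

This form is more verbose than `lapply(.SD, sum)`, but it lets each output column be named directly and allows a different summary function per column.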

Sarunas answered May 20 '23 11:05