Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R table function: how to sum instead of counting? [duplicate]

Tags:

r

aggregate

Suppose I have data in an R table which looks like this:

Id  Name Price sales Profit Month Category Mode
1   A     2     5     8       1     X       K
1   A     2     6     9       2     X       K
1   A     2     5     8       3     X       K
1   B     2     4     6       1     Y       L
1   B     2     3     4       2     Y       L
1   B     2     5     7       3     Y       L
2   C     2     5    11       1     X       M
2   C     2     5    11       2     X       L
2   C     2     5    11       3     X       K
2   D     2     8    10       1     Y       M
2   D     2     8    10       2     Y       K
2   D     2     5    7        3     Y       K
3   E     2     5    9        1     Y       M
3   E     2     5    9        2     Y       L
3   E     2     5    9        3     Y       M
3   F     2     4    7        1     Z       M
3   F     2     5    8        2     Z       L
3   F     2     5    8        3     Z       M

If I use the table function on this data like:

table(df$Category, df$Mode)

It will show me under each mode which category has how many observations. It's like counting the number of items in each category under each mode.

But what if I want the table to show under each Category which Mode earned how much Profit (sum or mean) and not the total count?

Is there any way to do this with the table function or another function in R?

like image 449
Jay khan Avatar asked Sep 01 '15 07:09

Jay khan


People also ask

How do I sum a count in a table in R?

You can sum and count this way: tmp = aggregate(df$Category, by=list(Category=df$Mode), FUN=sum) or tmp = aggregate(df$Category, by=list(Category=df$Mode), FUN=NROW) (notice "sum" is lowercase and "NROW" is all caps).

What does the function sum () do in R?

sum() function in R Programming Language return the addition of the values passed as arguments to the function.

How do I sum a column in R?

We can calculate the sum of multiple columns by using rowSums() and c() Function. we simply have to pass the name of the columns.


2 Answers

We can use xtabs from base R. By default, the xtabs gets the sum

xtabs(Profit~Category+Mode, df)
#           Mode
#Category  K  L  M
#       X 36 11 11
#       Y 17 26 28
#       Z  0  8 15

Or another base R option that is more flexible to apply different FUN is tapply.

with(df, tapply(Profit, list(Category, Mode), FUN=sum))
#  K  L  M
#X 36 11 11
#Y 17 26 28
#Z NA  8 15

Or we can use dcast to convert from 'long' to 'wide' format. It is more flexible as we can specify the fun.aggregate to sum, mean, median etc.

library(reshape2)
dcast(df, Category~Mode, value.var='Profit', sum)
# Category  K  L  M
#1        X 36 11 11
#2        Y 17 26 28
#3        Z  0  8 15

If you need it in the 'long' format, here is one option with data.table. We convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'Category' and 'Mode', we get the sum of 'Profit'.

library(data.table)
setDT(df)[, list(Profit= sum(Profit)) , by = .(Category, Mode)]
like image 141
akrun Avatar answered Sep 22 '22 06:09

akrun


Another possibility consists in using the aggregate() function:

profit_dat <- aggregate(Profit ~ Category + Mode, data=df, sum)
#> profit_dat
#  Category Mode Profit
#1        X    K     36
#2        Y    K     17
#3        X    L     11
#4        Y    L     26
#5        Z    L      8
#6        X    M     11
#7        Y    M     28
#8        Z    M     15
like image 38
RHertel Avatar answered Sep 22 '22 06:09

RHertel