Suppose I have data in an R table which looks like this:
Id Name Price sales Profit Month Category Mode
1 A 2 5 8 1 X K
1 A 2 6 9 2 X K
1 A 2 5 8 3 X K
1 B 2 4 6 1 Y L
1 B 2 3 4 2 Y L
1 B 2 5 7 3 Y L
2 C 2 5 11 1 X M
2 C 2 5 11 2 X L
2 C 2 5 11 3 X K
2 D 2 8 10 1 Y M
2 D 2 8 10 2 Y K
2 D 2 5 7 3 Y K
3 E 2 5 9 1 Y M
3 E 2 5 9 2 Y L
3 E 2 5 9 3 Y M
3 F 2 4 7 1 Z M
3 F 2 5 8 2 Z L
3 F 2 5 8 3 Z M
If I use the table function on this data like:
table(df$Category, df$Mode)
It will show me under each mode which category has how many observations. It's like counting the number of items in each category under each mode.
But what if I want the table to show under each Category which Mode earned how much Profit (sum or mean) and not the total count?
Is there any way to do this with the table function or another function in R?
You can sum and count this way: tmp = aggregate(df$Category, by=list(Category=df$Mode), FUN=sum) or tmp = aggregate(df$Category, by=list(Category=df$Mode), FUN=NROW) (notice "sum" is lowercase and "NROW" is all caps).
sum() function in R Programming Language return the addition of the values passed as arguments to the function.
We can calculate the sum of multiple columns by using rowSums() and c() Function. we simply have to pass the name of the columns.
We can use xtabs from base R. By default, the xtabs gets the sum
xtabs(Profit~Category+Mode, df)
# Mode
#Category K L M
# X 36 11 11
# Y 17 26 28
# Z 0 8 15
Or another base R option that is more flexible to apply different FUN is tapply.
with(df, tapply(Profit, list(Category, Mode), FUN=sum))
# K L M
#X 36 11 11
#Y 17 26 28
#Z NA 8 15
Or we can use dcast to convert from 'long' to 'wide' format. It is more flexible as we can specify the fun.aggregate to sum, mean, median etc.
library(reshape2)
dcast(df, Category~Mode, value.var='Profit', sum)
# Category K L M
#1 X 36 11 11
#2 Y 17 26 28
#3 Z 0 8 15
If you need it in the 'long' format, here is one option with data.table. We convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'Category' and 'Mode', we get the sum of 'Profit'.
library(data.table)
setDT(df)[, list(Profit= sum(Profit)) , by = .(Category, Mode)]
Another possibility consists in using the aggregate() function:
profit_dat <- aggregate(Profit ~ Category + Mode, data=df, sum)
#> profit_dat
# Category Mode Profit
#1 X K 36
#2 Y K 17
#3 X L 11
#4 Y L 26
#5 Z L 8
#6 X M 11
#7 Y M 28
#8 Z M 15
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With