Suppose I have data in an R table which looks like this:
Id Name Price sales Profit Month Category Mode
1 A 2 5 8 1 X K
1 A 2 6 9 2 X K
1 A 2 5 8 3 X K
1 B 2 4 6 1 Y L
1 B 2 3 4 2 Y L
1 B 2 5 7 3 Y L
2 C 2 5 11 1 X M
2 C 2 5 11 2 X L
2 C 2 5 11 3 X K
2 D 2 8 10 1 Y M
2 D 2 8 10 2 Y K
2 D 2 5 7 3 Y K
3 E 2 5 9 1 Y M
3 E 2 5 9 2 Y L
3 E 2 5 9 3 Y M
3 F 2 4 7 1 Z M
3 F 2 5 8 2 Z L
3 F 2 5 8 3 Z M
If I use the table
function on this data like:
table(df$Category, df$Mode)
It will show me under each mode which category has how many observations. It's like counting the number of items in each category under each mode.
But what if I want the table to show under each Category
which Mode
earned how much Profit
(sum or mean) and not the total count?
Is there any way to do this with the table
function or another function in R?
You can sum and count this way: tmp = aggregate(df$Category, by=list(Category=df$Mode), FUN=sum) or tmp = aggregate(df$Category, by=list(Category=df$Mode), FUN=NROW) (notice "sum" is lowercase and "NROW" is all caps).
sum() function in R Programming Language return the addition of the values passed as arguments to the function.
We can calculate the sum of multiple columns by using rowSums() and c() Function. we simply have to pass the name of the columns.
We can use xtabs
from base R
. By default, the xtabs
gets the sum
xtabs(Profit~Category+Mode, df)
# Mode
#Category K L M
# X 36 11 11
# Y 17 26 28
# Z 0 8 15
Or another base R
option that is more flexible to apply different FUN
is tapply
.
with(df, tapply(Profit, list(Category, Mode), FUN=sum))
# K L M
#X 36 11 11
#Y 17 26 28
#Z NA 8 15
Or we can use dcast
to convert from 'long' to 'wide' format. It is more flexible as we can specify the fun.aggregate
to sum
, mean
, median
etc.
library(reshape2)
dcast(df, Category~Mode, value.var='Profit', sum)
# Category K L M
#1 X 36 11 11
#2 Y 17 26 28
#3 Z 0 8 15
If you need it in the 'long' format, here is one option with data.table
. We convert the 'data.frame' to 'data.table' (setDT(df)
), grouped by 'Category' and 'Mode', we get the sum
of 'Profit'.
library(data.table)
setDT(df)[, list(Profit= sum(Profit)) , by = .(Category, Mode)]
Another possibility consists in using the aggregate()
function:
profit_dat <- aggregate(Profit ~ Category + Mode, data=df, sum)
#> profit_dat
# Category Mode Profit
#1 X K 36
#2 Y K 17
#3 X L 11
#4 Y L 26
#5 Z L 8
#6 X M 11
#7 Y M 28
#8 Z M 15
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With