Possible Duplicate:
R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate vs.
I'm using R and would love some help with a problem I'm having:
I have a dataframe (df) with a column ID and a column Emotion. Each value in ID corresponds to 40-300 values in Emotion (so it's not a set number). I need to calculate the mean of all i's in Emotion for each j in ID. So this is what the data looks like:
df$ID = c(1, 1, 1, 1, 2, 2, 3)
df$Emotion = c(2, 4, 6, 4, 1, 1, 8)
so the vector of means should look like this: (4, 1, 8)
Any help would be greatly appreciated!
You can use aggregate
ID = c(1, 1, 1, 1, 2, 2, 3)
Emotion = c(2, 4, 6, 4, 1, 1, 8)
df <- data.frame(ID, Emotion)
aggregate(.~ID, data=df, mean)
  ID Emotion
1  1       4
2  2       1
3  3       8
sapply could also be useful (this other solution will give you a vector):
sapply(split(df$Emotion, df$ID), mean)
1 2 3
4 1 8
There are a lot of ways to do it, including ddply from the plyr package, the data.table package, other combinations of split and lapply, and dcast from the reshape2 package. See this question for further solutions.
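As a minimal sketch of the split and lapply combination mentioned above, using only base R and the example data from the question (the names groups and means are just illustrative, not from the original answer):

```r
# Example data from the question
df <- data.frame(ID = c(1, 1, 1, 1, 2, 2, 3),
                 Emotion = c(2, 4, 6, 4, 1, 1, 8))

# Split Emotion into one vector per ID, then take the mean of each piece
groups <- split(df$Emotion, df$ID)
means <- lapply(groups, mean)   # a named list, one element per ID

# Flatten the list into a named numeric vector if that is more convenient
unlist(means)
```

lapply always returns a list, which is handy when the per-group result is not a single number; for a single mean per ID, unlist (or sapply directly) gives the vector the question asks for.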
This is precisely the job tapply was designed to do.
tapply(df$Emotion, df$ID, mean)
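A short sketch of tapply on the question's example data may help with the argument order: the values to summarise go first and the grouping variable second. The name group_means is only illustrative.

```r
# Example data from the question
df <- data.frame(ID = c(1, 1, 1, 1, 2, 2, 3),
                 Emotion = c(2, 4, 6, 4, 1, 1, 8))

# First argument: values to summarise; second argument: grouping variable
group_means <- tapply(df$Emotion, df$ID, mean)

group_means             # one-dimensional array, named by ID
as.vector(group_means)  # plain numeric vector without the ID names
```

Swapping the first two arguments would instead average the ID values within each distinct Emotion value, which is not what the question asks for.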