I have a data frame recording how much money each customer spends, in detail, like the following:
custid, value
1, 1
1, 3
1, 2
1, 5
1, 4
1, 1
2, 1
2, 10
3, 1
3, 2
3, 5
How can I calculate characteristics such as mean, max, min, median, std, etc. for each customer, like the following? Should I use some apply function? And how?
custid, mean, max,min,median,std
1, ....
2,....
3,....
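To find the median of grouped data: find n by adding up the frequencies. Find the median class, which is the class where the cumulative frequency first reaches n/2. Find the lower limit l of that class and the cumulative frequency c of the classes before it. Then apply the formula for the median of grouped data: Median = l + [(n/2−c)/f] × h, where f is the frequency of the median class and h is the class width.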
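As a rough illustration of the grouped-data formula, here is a minimal R sketch. The class limits and frequencies are made up for the example, and the variable names (lower, freq, c_prev, grouped_median) are hypothetical.

```r
# Hypothetical grouped data: classes 0-10, 10-20, 20-30, 30-40
lower <- c(0, 10, 20, 30)   # lower limit of each class
freq  <- c(5, 8, 12, 5)     # frequency of each class
h     <- 10                 # class width (assumed equal for all classes)

n   <- sum(freq)            # total frequency
cum <- cumsum(freq)         # cumulative frequencies

# Median class: first class whose cumulative frequency reaches n/2
k <- which(cum >= n / 2)[1]

# Cumulative frequency of the classes before the median class
c_prev <- if (k > 1) cum[k - 1] else 0

grouped_median <- lower[k] + ((n / 2 - c_prev) / freq[k]) * h
grouped_median
```

With these numbers, n = 30, the median class is 20-30, and the formula gives 20 + (15 − 13)/12 × 10.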
To find the mean of multiple columns based on multiple grouping columns in an R data frame, we can use the summarise_at function with the mean function.
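A small sketch of what that could look like, assuming dplyr is installed. The data frame df and its columns g1, g2, x and y are invented for illustration. Note that summarise_at is superseded by across() in recent dplyr versions, although it still works.

```r
library(dplyr)

# Hypothetical data: two grouping columns (g1, g2) and two numeric columns (x, y)
df <- data.frame(g1 = c("a", "a", "b", "b"),
                 g2 = c("x", "x", "y", "y"),
                 x  = c(1, 3, 5, 7),
                 y  = c(10, 20, 30, 50))

# Mean of x and y within each (g1, g2) combination
res <- df %>%
  group_by(g1, g2) %>%
  summarise_at(vars(x, y), mean)
res
```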
To find the median of all columns, we can use the apply function. For example, if we have a data frame df that contains only numerical columns, then the median of every column can be calculated as apply(df,2,median).
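For instance, a minimal sketch with a made-up data frame (the columns a and b are hypothetical). Since apply converts the data frame to a matrix first, this assumes every column is numeric.

```r
# Hypothetical all-numeric data frame
df <- data.frame(a = c(1, 3, 2),
                 b = c(10, 20, 40))

# MARGIN = 2 applies median() to each column
col_medians <- apply(df, 2, median)
col_medians
```

The result is a named numeric vector with one entry per column.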
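library(dplyr)
# dat holds the sample data from the question
dat <- data.frame(custid = c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3),
                  value  = c(1, 3, 2, 5, 4, 1, 1, 10, 1, 2, 5))
dat %>%
  group_by(custid) %>%
  summarise(Mean = mean(value), Max = max(value), Min = min(value),
            Median = median(value), Std = sd(value))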
#   custid     Mean Max Min Median      Std
# 1      1 2.666667   5   1    2.5 1.632993
# 2      2 5.500000  10   1    5.5 6.363961
# 3      3 2.666667   5   1    2.0 2.081666
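For bigger datasets, data.table would be faster. The as.numeric() around median() keeps the Median column the same type in every group, which matters when value is stored as integer, because median() can then return an integer for some groups and a double for others.
library(data.table)
setDT(dat)[, list(Mean = mean(value), Max = max(value), Min = min(value), Median = as.numeric(median(value)), Std = sd(value)), by = custid]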
#    custid     Mean Max Min Median      Std
# 1:      1 2.666667   5   1    2.5 1.632993
# 2:      2 5.500000  10   1    5.5 6.363961
# 3:      3 2.666667   5   1    2.0 2.081666
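The question also asks whether an apply function could do this. As a rough base R sketch (no extra packages), the values can be split by custid and summarised with sapply. Here dat is rebuilt from the sample data in the question, and the name stats is only illustrative.

```r
# Sample data from the question
dat <- data.frame(custid = c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 3),
                  value  = c(1, 3, 2, 5, 4, 1, 1, 10, 1, 2, 5))

# Split value into one vector per customer, then compute each statistic;
# sapply returns one column per customer, so t() turns customers into rows
stats <- t(sapply(split(dat$value, dat$custid), function(v) {
  c(Mean = mean(v), Max = max(v), Min = min(v),
    Median = median(v), Std = sd(v))
}))
stats
```

The result is a numeric matrix with one row per customer, where the row names hold the custid values.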