
Plotting summary statistics

For the following data set,

Genre   Amount
Comedy  10
Drama   30
Comedy  20
Action  20
Comedy  20
Drama   20

I want to construct a ggplot2 line graph, where the x-axis is Genre and the y-axis is the sum of all amounts (conditional on the Genre).

I have tried the following:

p = ggplot(test, aes(factor(Genre), Amount)) + geom_point()
p = ggplot(test, aes(factor(Genre), Amount)) + geom_line()
p = ggplot(test, aes(factor(Genre), sum(Amount))) + geom_line()

but to no avail.

Julio Diaz asked Mar 07 '11 08:03

1 Answer

If you don't want to compute a new data frame before plotting, you can use stat_summary in ggplot2. For example, if your data set looks like this:

R> df <- data.frame(Genre=c("Comedy","Drama","Action","Comedy","Drama"),
R+                  Amount=c(10,30,40,10,20))
R> df
   Genre Amount
1 Comedy     10
2  Drama     30
3 Action     40
4 Comedy     10
5  Drama     20
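
For comparison, the route that the first sentence sets aside (computing a new data frame before plotting) might look like the sketch below. It is only an illustration, not part of the original answer: the name `totals` is made up here, and base R's `aggregate` is just one of several ways to get per-genre sums.

```r
library(ggplot2)

# Example data from the answer
df <- data.frame(Genre  = c("Comedy", "Drama", "Action", "Comedy", "Drama"),
                 Amount = c(10, 30, 40, 10, 20))

# Sum Amount within each Genre; the result has one row per genre
# (Action 40, Comedy 20, Drama 50)
totals <- aggregate(Amount ~ Genre, data = df, FUN = sum)

# Plot the precomputed sums. group = 1 puts all genres in one group
# so that geom_line can connect them along the discrete x-axis.
ggplot(totals, aes(x = Genre, y = Amount, group = 1)) + geom_line()
```

The stat_summary approach shown next gives the same sums without the intermediate data frame.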

You can use either qplot with a stat="summary" argument:

R> qplot(Genre, Amount, data=df, stat="summary", fun.y="sum")

Or add a stat_summary to a base ggplot graphic:

R> ggplot(df, aes(x=Genre, y=Amount)) + stat_summary(fun.y="sum", geom="point")
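
The question asked for a line rather than points, so here is a hedged sketch of the same idea with geom="line", using the question's data under the name `test`. Two assumptions to flag: it targets ggplot2 3.3.0 or later, where the `fun.y` argument was renamed `fun`, and it sets `group = 1` so the discrete genres are joined into a single line.

```r
library(ggplot2)

# Sample data from the question
test <- data.frame(
  Genre  = c("Comedy", "Drama", "Comedy", "Action", "Comedy", "Drama"),
  Amount = c(10, 30, 20, 20, 20, 20)
)

# Sum Amount within each Genre and connect the sums with a line.
# Without group = 1 each genre is its own one-point group, so
# geom_line has nothing to connect.
p <- ggplot(test, aes(x = Genre, y = Amount, group = 1)) +
  stat_summary(fun = sum, geom = "line") +
  stat_summary(fun = sum, geom = "point")
p
```

On older ggplot2 versions, replacing `fun = sum` with `fun.y = sum` should behave the same way.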

juba answered Sep 19 '22 02:09
