
How do I use the following R code to reproduce the following plot with the ggplot2 package?

Tags: graph, r, ggplot2

With my data:

Row   Year   SchoolID   SchoolName   BudgetArea   PaymentPerStudent
001   2011   ABC        PS #1        Staff        12000
002   2012   ABC        PS #1        Staff        10000
003   2011   ABC        PS #1        Lunch        22000
004   2012   ABC        PS #1        Lunch        18000
005   2011   DEF        PS #2        Staff        80000
006   2012   DEF        PS #2        Staff        65000
007   2013   DEF        PS #2        Staff        50000
008   2011   DEF        PS #2        Lunch        23000
009   2012   DEF        PS #2        Lunch        34000
010   2013   DEF        PS #2        Lunch        28000
011   2011   GHI        PS #3        Staff         9000
012   2012   GHI        PS #3        Staff        10000
013   2013   GHI        PS #3        Staff        12000
014   2011   GHI        PS #3        Lunch        22000
015   2012   GHI        PS #3        Lunch        17000
016   2013   GHI        PS #3        Lunch        18000

I want to reproduce the following:

Desired ggplot2

Source for plot and R code

Where:

1) The Grade A.....Grade N values are replaced by the "SchoolName" values

2) The Group values (Apples, Bananas, etc.) are replaced by the "BudgetArea" values (Staff, Lunch, etc.)

3) The Proportion Tasty values are replaced by "PaymentPerStudent" values.

Edit (04/09/2014): I've tried the following, with Jaap's input (see below):

ggplot(data=Rates_2, aes(x=factor(Year), y=PaymentPerStudent/max(PaymentPerStudent),
                         group=BudgetArea, shape=BudgetArea, color=BudgetArea)) +
  geom_line() +
  geom_point() +
  labs(title = "Pay rate per student by year, budget area, and school") +
  scale_x_discrete("Year") +
  scale_y_continuous("PaymentPerStudent", limits=c(0,1)) +
  facet_grid(.~SchoolID)

However, it produces the following "condensed" plot, with every school squeezed into a single row of facets:

Condensed Plot

I would like to split the schools across several pages of output (perhaps 9 schools per page) so that the individual plots are readable.

Please note:

1) The data frame has just under 2,000 rows of data, with over 400 schools represented.

2) The time period, in years, is from 2001-2004.

3) The PaymentPerStudent variable ranges from 10,000 to 100,000. I would like to rescale the variable (to lie between 0 and 1) in order to accomplish my goal of producing these plots.

ealfons1 asked Oct 02 '22 02:10


1 Answer

You forgot the + before facet_grid. The sample data you provided cover the years 2011, 2012 and 2013, so I kept it that way. This code:

ggplot(data=Rates_2, aes(x=factor(Year), y=PaymentPerStudent/max(PaymentPerStudent), 
                         group=BudgetArea, shape=BudgetArea, color=BudgetArea)) + 
  geom_line() + 
  geom_point() +
  labs(title = "Pay rate per student by year, budget area, and school") +
  scale_x_discrete("Year") +
  scale_y_continuous("PaymentPerStudent", limits=c(0,1)) +
  facet_grid(.~SchoolID)

gives me this result: (plot image)
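As a side note on the 0 to 1 rescaling requested in the question: dividing by max(PaymentPerStudent), as the aes() call above does, puts the largest value at 1 but does not put the smallest at 0. A minimal base-R sketch of the difference, using a hypothetical vector `pay` made of a few values from the sample data (min-max scaling is shown only as an alternative, it is not what the code above uses):

```r
# Hypothetical subset of PaymentPerStudent values from the sample data
pay <- c(12000, 10000, 22000, 18000, 80000, 65000, 50000)

# Divide-by-max scaling (what the aes() call does): largest value becomes 1
scaled_by_max <- pay / max(pay)

# Min-max scaling (alternative): smallest value becomes 0, largest becomes 1
scaled_min_max <- (pay - min(pay)) / (max(pay) - min(pay))

range(scaled_by_max)   # 0.125 1.000
range(scaled_min_max)  # 0 1
```

Either version fits inside limits=c(0,1); which one to use depends on whether 0 should mean "no payment" or "lowest payment in the data".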


To get a separate plot for each school, you can use:

# get the overall max value so every plot shares the same scale and can be compared
# (80000 in the sample data)
max_pay <- max(Rates_2$PaymentPerStudent)

# split the dataframe into a list of dataframes, one per school
dfs <- split(Rates_2, Rates_2$SchoolID)

# make a plot for each school, scaled by the overall max rather than a hard-coded value
lapply(dfs, function(df) ggplot(df, aes(x=factor(Year), y=PaymentPerStudent/max_pay,
                                        group=BudgetArea, shape=BudgetArea, color=BudgetArea)) +
         geom_line() + geom_point() +
         labs(title = "Pay rate per student by year, budget area, and school") +
         scale_x_discrete("Year") +
         scale_y_continuous("PaymentPerStudent", limits=c(0,1))
)
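
If you would rather have about 9 schools per page than one plot per school, one possible approach is to group the school IDs into chunks of 9, draw each chunk with facet_wrap, and print the plots to a multi-page PDF. This is only a sketch: the `Rates_2` below is a made-up stand-in with 20 hypothetical schools and random payments (your real data are not shown), and the file name `schools_by_page.pdf` is just an example.

```r
library(ggplot2)

# Hypothetical stand-in for Rates_2: 20 schools, 3 years, 2 budget areas
set.seed(1)
Rates_2 <- expand.grid(Year = 2011:2013,
                       SchoolID = sprintf("S%02d", 1:20),
                       BudgetArea = c("Staff", "Lunch"),
                       stringsAsFactors = FALSE)
Rates_2$PaymentPerStudent <- runif(nrow(Rates_2), 10000, 100000)

# Overall max, so every page uses the same 0-1 scale
max_pay <- max(Rates_2$PaymentPerStudent)

# Group the school IDs into pages of at most 9 schools each
schools <- sort(unique(Rates_2$SchoolID))
pages <- split(schools, ceiling(seq_along(schools) / 9))

# Build one faceted plot per page (3 x 3 grid of schools)
plots <- lapply(pages, function(ids) {
  ggplot(Rates_2[Rates_2$SchoolID %in% ids, ],
         aes(x = factor(Year), y = PaymentPerStudent / max_pay,
             group = BudgetArea, shape = BudgetArea, color = BudgetArea)) +
    geom_line() +
    geom_point() +
    labs(title = "Pay rate per student by year, budget area, and school") +
    scale_x_discrete("Year") +
    scale_y_continuous("PaymentPerStudent", limits = c(0, 1)) +
    facet_wrap(~ SchoolID, ncol = 3)
})

# Write each plot to its own page of a single PDF
pdf("schools_by_page.pdf", width = 10, height = 8)
invisible(lapply(plots, print))
dev.off()
```

With over 400 schools this gives roughly 45 pages; change the 9 and ncol = 3 together if you want a different grid.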
Jaap answered Oct 03 '22 14:10