
Easier way to plot the cumulative frequency distribution in ggplot?

Tags:

r

ggplot2

I'm looking for an easier way to draw the cumulative distribution line in ggplot.

I have some data whose histogram I can immediately display with

qplot(mydata, binwidth = 1)

I found a way to do it at http://www.r-tutor.com/elementary-statistics/quantitative-data/cumulative-frequency-graph, but it involves several steps, which is time-consuming when exploring data.

Is there a more straightforward way to do it in ggplot, similar to how trend lines and confidence intervals can be added by specifying options?

wishihadabettername asked Aug 23 '10 00:08


People also ask

How do you plot a cumulative graph in R?

To create a cumulative sum plot in base R, we can simply use the plot function. To plot cumulative sums, apply the cumsum function to the variable that is to be accumulated and pass the result to plot.

How do you calculate cumulative frequency in R?

The cumulative frequency table can be calculated from the frequency table using the cumsum() function. This function returns a vector whose elements are the running (cumulative) sums of the input.
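A minimal base R sketch of that calculation. The vector `x` holds made-up values chosen purely for illustration; it is not data from the question.

```r
# Hypothetical sample data, for illustration only
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)

# Frequency table: how many times each distinct value occurs
freq <- table(x)

# Running total of those counts gives the cumulative frequency
cum_freq <- cumsum(freq)

# Dividing by the sample size gives the cumulative relative frequency
cum_rel_freq <- cum_freq / length(x)

print(cum_freq)
print(cum_rel_freq)
```

The last element of `cum_freq` equals the sample size, which is a quick sanity check when exploring data.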

How do you make a cumulative frequency histogram in R?

If we want to convert our histogram to a cumulative histogram, we can use the cumsum function within the geom_histogram function as shown below:

ggplot(data, aes(x)) +                           # Draw cumulative ggplot2 histogram
  geom_histogram(aes(y = cumsum(..count..)))
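The `..count..` notation is deprecated in ggplot2 3.3.0 and later in favour of `after_stat()`. The following is a sketch of the same cumulative histogram in the newer notation, assuming a reasonably recent ggplot2 is installed. The data frame `data`, the seed, and the bin width are illustrative choices.

```r
library(ggplot2)

# Hypothetical sample data, for illustration only
set.seed(1)
data <- data.frame(x = rnorm(200))

# after_stat() gives access to the per-bin count computed by stat_bin;
# cumsum() turns those counts into a running total across bins
p <- ggplot(data, aes(x)) +
  geom_histogram(aes(y = after_stat(cumsum(count))), binwidth = 0.5)

print(p)
```

The height of the last bar should equal the number of observations, here 200.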


1 Answer

The new version of ggplot2 (0.9.2.1) has a built-in stat_ecdf() function which lets you plot cumulative distributions very easily.

qplot(rnorm(1000), stat = "ecdf", geom = "step") 

Or

df <- data.frame(x = c(rnorm(100, 0, 3), rnorm(100, 0, 10)),
                 g = gl(2, 100))
ggplot(df, aes(x, colour = g)) + stat_ecdf()

Code samples from ggplot2 documentation.
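Note that in later ggplot2 releases qplot() is deprecated, as is its `stat` argument, so the one-line qplot() form may not behave as shown on a current installation. A sketch of an equivalent call using ggplot() directly follows, assuming a recent ggplot2. The data frame name, the seed, and the axis label are illustrative choices, not part of the documentation samples.

```r
library(ggplot2)

# Hypothetical sample data, for illustration only
set.seed(42)
df1 <- data.frame(x = rnorm(1000))

# stat_ecdf() computes the empirical cumulative distribution function;
# geom = "step" draws it as a step function
p <- ggplot(df1, aes(x)) +
  stat_ecdf(geom = "step") +
  labs(y = "Cumulative proportion")

print(p)
```

The curve rises from 0 to 1, so the y axis reads as the proportion of observations at or below each x value rather than as a raw count.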

Chris answered Oct 05 '22 18:10