
ggplot: Plotting cutout of CDF while maintaining normalization according to whole data set

Tags: r, ggplot2

I have a data frame of the following structure:

   x                   series
11.1     "without restraints"
 9.8     "without restraints"
 7.0             "restraints"
 ...

I want to plot the cumulative distribution function of the data, grouped by series. In general, this works fine with the command

ggplot(data = df, aes(x = x, col = series)) + stat_ecdf(geom = "smooth") + scale_x_continuous(limits=c(min_x, max_x))

The x values range from 3.7 to around 20. If I set the limits to 3 and 25, the output looks like http://i40.tinypic.com/2crm5xc.jpg. But if I set the limits to 3 and 10, the output is http://i42.tinypic.com/24viudg.jpg, and the fraction/density is now calculated from only the data in the range 3 to 10. Is there a way to plot this cutout on the scale of the whole data set, so that the fraction is given relative to the complete data set (it should therefore be around 0.13 at an x value of 10)?

Thanks for any help.

Axel Fischer asked Mar 22 '23 13:03


1 Answer

You can use coord_cartesian:

+ coord_cartesian(xlim = c(3, 10))

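Limits specified in scale_x_continuous discard out-of-range values before the statistic is computed, so the ECDF is normalized to the remaining data only. In contrast, coord_cartesian only zooms the view, so the ECDF is still computed from the whole dataset.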

From ?coord_cartesian:

Setting limits on the coordinate system will zoom the plot (like you're looking at it with a magnifying glass), and will not change the underlying data like setting limits on a scale will.
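
To see the difference numerically, here is a minimal sketch in base R. The simulated vector `x` and the cutoff of 10 are made up for illustration and are not the asker's data. Subsetting to the range 3 to 10 mimics what scale limits do by default before the ECDF is computed, while the full ECDF corresponds to what coord_cartesian leaves untouched.

```r
# Hypothetical data for illustration only (not the asker's data set)
set.seed(1)
x <- runif(1000, min = 3.7, max = 20)

# ECDF over the whole data set: this is what stays in place
# when the plot is only zoomed with coord_cartesian()
full_ecdf <- ecdf(x)

# ECDF over only the values inside the limits: roughly what happens
# when scale limits drop out-of-range values before the statistic runs
cut_ecdf <- ecdf(x[x >= 3 & x <= 10])

full_at_10 <- full_ecdf(10)  # fraction of ALL values that are <= 10
cut_at_10  <- cut_ecdf(10)   # fraction of the truncated values that are <= 10

print(c(full = full_at_10, truncated = cut_at_10))
```

The truncated ECDF necessarily reaches 1 at the upper limit, because every remaining value lies at or below it. That is the renormalization seen in the second screenshot of the question.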

The whole code:

ggplot(data = df, aes(x = x, col = series)) + 
 stat_ecdf(geom = "smooth") + 
 coord_cartesian(xlim = c(min_x, max_x))
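
For anyone who wants to check this without the original data, the following is a small reproducible sketch. The data frame `df_demo` is simulated and its column names merely mirror the question; `geom = "step"` and `pad = FALSE` are choices made here to keep the computed values easy to inspect, not part of the original answer. It assumes ggplot2 is installed and uses `layer_data()` to read the ECDF values that each plot would draw.

```r
library(ggplot2)

# Simulated stand-in for the asker's data frame (hypothetical values)
set.seed(42)
df_demo <- data.frame(
  x = c(runif(200, min = 3.7, max = 20), runif(200, min = 5, max = 20)),
  series = rep(c("restraints", "without restraints"), each = 200)
)

base_plot <- ggplot(data = df_demo, aes(x = x, col = series)) +
  stat_ecdf(geom = "step", pad = FALSE)

# Variant 1: scale limits remove values outside 3..10 before the ECDF is computed
p_scale <- base_plot + scale_x_continuous(limits = c(3, 10))

# Variant 2: coord_cartesian only zooms; the ECDF still uses every value
p_coord <- base_plot + coord_cartesian(xlim = c(3, 10))

# Extract the computed ECDF values (the scale variant warns about removed rows)
d_scale <- suppressWarnings(layer_data(p_scale))
d_coord <- layer_data(p_coord)

# Highest ECDF value that falls inside the visible window of each variant
top_scale <- max(d_scale$y)
top_coord <- max(d_coord$y[d_coord$x <= 10])
print(c(scale_limits = top_scale, coord_cartesian = top_coord))
```

Printing `p_scale` and `p_coord` draws the two plots. In the scale variant the curves climb to 1 at the right edge, whereas in the coord_cartesian variant they stop at the fraction of the full data set that lies below the upper limit.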

Sven Hohenstein answered Apr 05 '23 22:04