Cumulative frequency by factor

Tags: r, ggplot2

I need to compute the cumulative frequency, expressed as a fraction of the total, of a continuous variable by factor. For example:

data <- data.frame(n = sample(1:12),
                d = seq(10, 120, by = 10),
                Site = rep(c("FirstSite", "SecondSite"), 6), 
                Plot = rep(c("Plot1", "Plot1", "Plot2", "Plot2"), 3)
                )

data <- with(data, data[order(Site,Plot),])
data <- transform(data, G = ((pi * (d/2)^2) * n) / 10000)

data
    n   d       Site  Plot           G
1   7  10  FirstSite Plot1  0.05497787
5   9  50  FirstSite Plot1  1.76714587
9  12  90  FirstSite Plot1  7.63407015
3  10  30  FirstSite Plot2  0.70685835
7   5  70  FirstSite Plot2  1.92422550
11  1 110  FirstSite Plot2  0.95033178
2   3  20 SecondSite Plot1  0.09424778
6   8  60 SecondSite Plot1  2.26194671
10  6 100 SecondSite Plot1  4.71238898
4   4  40 SecondSite Plot2  0.50265482
8   2  80 SecondSite Plot2  1.00530965
12 11 120 SecondSite Plot2 12.44070691

I need the cumulative frequency of column G by the factors Plot~Site in order to draw a geom_step ggplot of G against d for each plot and site.
I have managed to compute the cumulative sum of G by factor with:

data.ss <- by(data[, "G"], data[,c("Plot", "Site")], function(x) cumsum(x))
# Gtot: total G per Plot/Site group (the largest cumulative value)
(data.ss.tot <- sapply(data.ss, max))
[1]  9.456194  3.581416  7.068583 13.948671

Now I need to express each Plot's G in the range [0, 1], where 1 corresponds to Gtot for that Plot. I imagine I should divide G by its Plot's Gtot and then apply a new cumsum to the result. How can I do this?
Please note that I have to plot this cumulative frequency against d, not against G itself, so it is not a proper ecdf.
Thank you.

mbask asked Jan 16 '12 22:01


1 Answer

I usually use ddply and transform to do this type of thing:

> library(plyr)     # provides ddply
> library(ggplot2)  # provides qplot
> data = ddply(data, c('Site', 'Plot'), transform, Gsum=cumsum(G), Gtot=sum(G))
> qplot(x=d, y=Gsum/Gtot, facets=Plot~Site, geom='step', data=data)

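For anyone who would rather not load plyr, here is a minimal base-R sketch of the same computation using ave(). It is not part of the original answer. The column names Gsum, Gtot and Gfrac are just illustrative, and n is hard-coded to the values printed in the question's table (the question itself used sample(1:12), so a fresh run would give different numbers).

```r
# Rebuild the question's data; n is fixed to the values shown in the
# printed table (assumption: the original came from a random sample()).
data <- data.frame(n = c(7, 3, 10, 4, 9, 8, 5, 2, 12, 6, 1, 11),
                   d = seq(10, 120, by = 10),
                   Site = rep(c("FirstSite", "SecondSite"), 6),
                   Plot = rep(c("Plot1", "Plot1", "Plot2", "Plot2"), 3))
data$G <- (pi * (data$d / 2)^2 * data$n) / 10000

# Sort by d within each group so the cumulative sum runs along d.
data <- data[order(data$Site, data$Plot, data$d), ]

# ave() applies FUN within each Site x Plot group and returns a vector
# aligned with the rows of data.
data$Gsum  <- ave(data$G, data$Site, data$Plot, FUN = cumsum)
data$Gtot  <- ave(data$G, data$Site, data$Plot, FUN = sum)

# Cumulative fraction in (0, 1]; the last row of each group reaches 1.
data$Gfrac <- data$Gsum / data$Gtot
```

Because Gtot is constant within a group, cumsum(G / Gtot) and cumsum(G) / Gtot give the same result, so either order of operations from the question works. The Gfrac column can then be plotted against d, for example with ggplot2's geom_step() and facet_grid(Plot ~ Site).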

John Colby answered Sep 19 '22 22:09