I am trying to compute the percentage of each factor level in one variable, conditional on another variable.
For example, I have data like this:
State Ideology
CO Liberal
CO Liberal
CO Liberal
CO Conservative
CO Conservative
CO Independent
DC Independent
DC Conservative
DC Liberal
I am trying to find the percentage of Liberal, Conservative, and Independent respondents within each state.
I tried to use ddply like this:
liberal_per<-ddply(data,.(State), summarize,total=table(Ideology)[1]/sum(Ideology))
But it doesn't work. How can I find the percentage of each factor level within each State?
Thank you!
Because State comes first in the data frame, table will use it as the row variable. You can therefore divide the results of table by the row sums to get proportions, and scale those to percentages if needed.
The table:
> table(x)
     Ideology
State Conservative Independent Liberal
   CO            2           1       3
   DC            1           1       1
Use prop.table with margin 1 to scale each row, which gives the proportions per state:
> prop.table(table(x), 1)
     Ideology
State Conservative Independent   Liberal
   CO    0.3333333   0.1666667 0.5000000
   DC    0.3333333   0.3333333 0.3333333
This is equivalent to table(x)/rowSums(table(x)).
You can multiply the result by 100 to get percent values if needed.
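The steps above can be put together in one self-contained sketch. The data frame `x` is rebuilt here from the rows shown in the question; the names `counts`, `pct`, and `pct_long`, and the choice to round to one decimal place, are assumptions made for illustration.

```r
# Rebuild the example data from the question (State must be the first column
# so that table() uses it as the row variable).
x <- data.frame(
  State = c("CO", "CO", "CO", "CO", "CO", "CO", "DC", "DC", "DC"),
  Ideology = c("Liberal", "Liberal", "Liberal", "Conservative",
               "Conservative", "Independent", "Independent",
               "Conservative", "Liberal")
)

# State x Ideology contingency table of counts
counts <- table(x)

# Row-wise proportions (margin = 1), scaled to percentages
pct <- round(100 * prop.table(counts, margin = 1), 1)
print(pct)

# Optional: long format, one row per State/Ideology pair
pct_long <- as.data.frame(pct, responseName = "Percent")
print(pct_long)
```

The long format may be more convenient if the percentages are going to be merged back onto other data or plotted.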
You could modify your ddply code to:
ddply(data,.(State),
function(x) with(x,
data.frame(100*round(table(Ideology)/length(Ideology),2))))
#   State     Ideology Freq
#1     CO Conservative   33
#2     CO  Independent   17
#3     CO      Liberal   50
#4     DC Conservative   33
#5     DC  Independent   33
#6     DC      Liberal   33
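For readers without plyr installed, the same per-group logic can be sketched in base R. This is only an illustration of the split/apply/combine idea, not the code from the answer: the names `pieces`, `result`, and `Percent` are made up here, and the data frame is rebuilt from the question. It also shows why the original attempt fails: sum() is not defined for factor or character values, so the group size is taken from the number of rows instead.

```r
# Example data from the question; Ideology is stored as a factor so that
# every level is reported for every state, even if its count is zero.
data <- data.frame(
  State = c("CO", "CO", "CO", "CO", "CO", "CO", "DC", "DC", "DC"),
  Ideology = factor(c("Liberal", "Liberal", "Liberal", "Conservative",
                      "Conservative", "Independent", "Independent",
                      "Conservative", "Liberal"))
)

# Split by state, compute the percentage of each ideology within the piece,
# then bind the pieces back together.
pieces <- lapply(split(data, data$State), function(d) {
  share <- table(d$Ideology) / nrow(d)   # nrow(d) replaces the failing sum(Ideology)
  data.frame(State = d$State[1],
             Ideology = names(share),
             Percent = as.vector(round(100 * share)))
})
result <- do.call(rbind, pieces)
rownames(result) <- NULL
print(result)
```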