
In R, how do I compute the percentage of each factor level conditional on another variable? [duplicate]

Tags: r, percentage

I am trying to compute the percentage of each factor level in one variable, conditional on another variable.

For example, I have data like this.

State Ideology
CO    Liberal
CO    Liberal
CO    Liberal
CO    Conservative
CO    Conservative
CO    Independent
DC    Independent
DC    Conservative
DC    Liberal

I am trying to find the percentage of Liberal, Conservative, and Independent within each state.

I tried using ddply like this:

liberal_per<-ddply(data,.(State), summarize,total=table(Ideology)[1]/sum(Ideology))

But it doesn't work. How can I find the percentage of each factor level within each State?

Thank you!

user3077008 asked Dec 20 '22 13:12


2 Answers

Because State is the first column of the data frame (called x here), table uses it as the row dimension. You can therefore divide the table by its row sums to get per-state proportions, then scale those to percentages.

The table:

> table(x)
     Ideology
State Conservative Independent Liberal
   CO            2           1       3
   DC            1           1       1

Using prop.table to do the scaling, to get values per-state:

> prop.table(table(x), 1)
     Ideology
State Conservative Independent   Liberal
   CO    0.3333333   0.1666667 0.5000000
   DC    0.3333333   0.3333333 0.3333333

This is equivalent to table(x)/rowSums(table(x)).

You can multiply by 100 to get percent values if needed.
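
As a minimal, self-contained sketch of the steps above, the following rebuilds the question's nine rows and computes per-state percentages with base R only. The data frame name `votes` and the choice to round to one decimal place are illustrative assumptions, not part of the original answer.

```r
# Rebuild the example data from the question (the name `votes` is illustrative).
votes <- data.frame(
  State = c("CO", "CO", "CO", "CO", "CO", "CO", "DC", "DC", "DC"),
  Ideology = c("Liberal", "Liberal", "Liberal", "Conservative",
               "Conservative", "Independent", "Independent",
               "Conservative", "Liberal")
)

# Cross-tabulate: rows are State, columns are Ideology.
counts <- table(votes)

# margin = 1 scales each row (state) so that it sums to 1; multiply by 100 for percent.
pct <- round(100 * prop.table(counts, 1), 1)
print(pct)

# Dividing by the row sums should give the same proportions as prop.table(counts, 1).
same <- all.equal(prop.table(counts, 1), counts / rowSums(counts),
                  check.attributes = FALSE)
print(same)
```

Passing 2 instead of 1 as the margin would scale by column instead, giving the share of each state within an ideology rather than the share of each ideology within a state.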

Matthew Lundberg answered Apr 30 '23 19:04


You could modify your ddply code to:

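 library(plyr)  # provides ddply
 # For each State, tabulate Ideology and convert the counts to rounded percentages
 ddply(data,.(State), 
    function(x) with(x,
      data.frame(100*round(table(Ideology)/length(Ideology),2))))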

 #    State     Ideology Freq
 #1    CO Conservative   33
 #2    CO  Independent   17
 #3    CO      Liberal   50
 #4    DC Conservative   33
 #5    DC  Independent   33
 #6    DC      Liberal   33
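
If plyr is not available, a similar long-format result can be sketched in base R by converting the row-scaled table to a data frame. This is an alternative technique, not the method used in the answer above; the names `votes`, `long`, and the column name `Percent` are illustrative assumptions.

```r
# Same example data as in the question (the name `votes` is illustrative).
votes <- data.frame(
  State = c("CO", "CO", "CO", "CO", "CO", "CO", "DC", "DC", "DC"),
  Ideology = c("Liberal", "Liberal", "Liberal", "Conservative",
               "Conservative", "Independent", "Independent",
               "Conservative", "Liberal")
)

# Row-wise proportions, reshaped to one row per State/Ideology pair.
long <- as.data.frame(prop.table(table(votes), 1), responseName = "Percent")

# Convert proportions to whole-number percentages.
long$Percent <- round(100 * long$Percent)

# Sort by State, then Ideology, and reset the row names.
long <- long[order(long$State, long$Ideology), ]
rownames(long) <- NULL
print(long)
```

The long layout is convenient when the percentages feed into plotting or a later merge, while the wide table from prop.table is easier to read at a glance.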
akrun answered Apr 30 '23 21:04