
R frequency table based on presence / absence samples

Tags:

r

I wasn't quite sure how to search for this topic, so I apologize in advance if this question has already been asked. The existing questions about frequency tables didn't answer it.

I have the following df, where 1 indicates a positive result and 2 a negative one:

d1 <- data.frame( Household = c(1:5), State = c("AL","AL","AL","MI","MI"), Electricity = c(1,1,1,2,2),
Fuelwood = c(2,2,1,1,1))

I want to produce a frequency table showing, for each state, the percentage of households using only Electricity, only Fuelwood, and both Electricity and Fuelwood, such as d2:

d2 <- data.frame (State = c("AL", "MI"), Electricity = c(66.6,0), Fuelwood = c(0,100), ElectricityANDFuelwood = c(33.3,0))

Please consider that my real df has approximately 42,000 households, 5 energy sources and 27 states.

Gil33 asked Dec 03 '25 14:12



1 Answer

We can first find the rows in d1 where both Electricity and Fuelwood are positive (1). Using that logical index, we set Electricity and Fuelwood to negative (2) in those rows, so that households using both sources are not also counted under the single sources. Then we create an additional column, ElectricityANDFuelwood, from the same index. Finally, we reshape from wide to long form with melt, keep only the State and variable columns for the positive rows, and use table and prop.table to calculate the frequencies and relative frequencies.

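# TRUE for households that use both sources
indx <- with(d1, Electricity==1 & Fuelwood==1)

# Mark those households as negative (2) in the single-source columns,
# so that each household is counted in at most one category
d1[indx,3:4] <- 2
# Add the combined column (1 = uses both) and drop Household
dT <- transform(d1, ElectricityANDFuelwood= (indx)+0)[-1]

library(reshape2)
# Long form, keeping only State and variable for the positive (1) rows
dT1 <- subset(melt(dT, id.vars='State'), value==1, select=1:2)
# Row-wise percentages per State
round(100*prop.table(table(dT1), margin=1),2)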
#      variable
#State Electricity Fuelwood ElectricityANDFuelwood
#  AL       66.67     0.00                  33.33
#  MI        0.00   100.00                   0.00
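
For readers who prefer not to load reshape2, the same steps can be sketched in base R. This is only an illustration under the sample data from the question; the names work, categories, counts and res are introduced here and are not part of the original answer. It works on a copy, so d1 itself is left unchanged.

```r
# Sample data from the question (1 = positive, 2 = negative)
d1 <- data.frame(Household = 1:5,
                 State = c("AL", "AL", "AL", "MI", "MI"),
                 Electricity = c(1, 1, 1, 2, 2),
                 Fuelwood = c(2, 2, 1, 1, 1))

# TRUE for households that use both sources
indx <- with(d1, Electricity == 1 & Fuelwood == 1)

# Work on a copy so the original d1 stays intact
work <- d1
work[indx, c("Electricity", "Fuelwood")] <- 2
work$ElectricityANDFuelwood <- as.integer(indx)

# Count positive (1) entries per State for each category
categories <- c("Electricity", "Fuelwood", "ElectricityANDFuelwood")
counts <- rowsum((work[categories] == 1) + 0, group = work$State)

# Row-wise percentages, mirroring prop.table(..., margin = 1)
res <- round(100 * counts / rowSums(counts), 2)
res
```

As in the melt version, the denominator is the number of positive entries per state, so a household using neither source would not be counted.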

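Or a data.table solution contributed by @David Arenburg. Run it on the original d1, before the in-place change above; otherwise no row has both columns equal to 1 any more: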

library(data.table)
d2 <- as.data.table(d1[-1])[, ElectricityANDFuelwood := 
             (Electricity == 1 & Fuelwood == 1)]
d2[(ElectricityANDFuelwood), (2:3) := 2]
d2[, lapply(.SD, function(x) 100*sum(x == 1)/.N), by = State]  
#   State Electricity Fuelwood ElectricityANDFuelwood
#1:    AL    66.66667        0               33.33333
#2:    MI     0.00000      100                0.00000
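
Since the real data has 5 energy sources, here is a rough sketch of how the idea might be generalized to any number of columns by labelling each household with the combination of sources it uses. The names sources, used, combo and res are assumptions made for this illustration, as is the "None" label for households using no source; only the two columns from the sample data are used.

```r
# Sample data from the question (1 = positive, 2 = negative)
d1 <- data.frame(Household = 1:5,
                 State = c("AL", "AL", "AL", "MI", "MI"),
                 Electricity = c(1, 1, 1, 2, 2),
                 Fuelwood = c(2, 2, 1, 1, 1))

# Columns holding the energy sources; extend this vector for the real data
sources <- c("Electricity", "Fuelwood")

# Logical matrix: TRUE where a household uses a source
used <- d1[sources] == 1

# One label per household, e.g. "ElectricityANDFuelwood";
# "None" is a hypothetical label for households using no source
combo <- apply(used, 1, function(u) {
  if (any(u)) paste(sources[u], collapse = "AND") else "None"
})

# Percentage of households per combination within each State
res <- round(100 * prop.table(table(State = d1$State, Combination = combo),
                              margin = 1), 2)
res
```

Only combinations that actually occur appear as columns, and here the denominator is every household in the state, including any labelled "None".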
akrun answered Dec 05 '25 06:12



