
An efficient way to indicate multiple indicator variables per row?

Given an "empty" indicator dataframe:

Index    Ind_A    Ind_B
  1        0        0
  2        0        0
  3        0        0
  4        0        0

and a dataframe of values:

Index    Indicators
  1         Ind_A
  3         Ind_A
  3         Ind_B
  4         Ind_A

I want to end up with:

Index    Ind_A    Ind_B
  1        1        0
  2        0        0
  3        1        1
  4        1        0

Is there a way to do this without a for loop?

lapolonio asked Oct 20 '22 11:10



1 Answer

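I would do this directly with table(). Converting Index to a factor that spans the full range first ensures that indices with no indicators (here, 2) still get a row of zeros: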

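# Make Index a factor over min..max so that missing indices are kept as levels
df = transform(df, Index=factor(Index, levels=min(Index):max(Index)))
# Cross-tabulate Index by Indicators and convert the table to a data frame
as.data.frame.matrix(table(df))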
#  Ind_A Ind_B
#1     1     0
#2     0     0
#3     1     1
#4     1     0

Data:

df = structure(list(Index = c(1, 3, 3, 4), Indicators = c("Ind_A", 
"Ind_A", "Ind_B", "Ind_A")), .Names = c("Index", "Indicators"
), row.names = c(NA, -4L), class = "data.frame")
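
If you would rather fill in the existing "empty" indicator dataframe than rebuild it from a table, matrix indexing also avoids the for loop. This is a minimal sketch, assuming the two dataframes are called ind and vals (names chosen here for illustration, not taken from the question) and that every Index and Indicators value in vals appears in ind:

```r
# Hypothetical names: `ind` is the "empty" indicator frame, `vals` the values frame
ind <- data.frame(Index = 1:4, Ind_A = 0, Ind_B = 0)
vals <- data.frame(Index = c(1, 3, 3, 4),
                   Indicators = c("Ind_A", "Ind_A", "Ind_B", "Ind_A"),
                   stringsAsFactors = FALSE)

# Work on the indicator columns as a plain numeric matrix
ind_cols <- setdiff(names(ind), "Index")
m <- as.matrix(ind[ind_cols])

# One (row, column) position per row of `vals`
pos <- cbind(match(vals$Index, ind$Index),
             match(vals$Indicators, ind_cols))

# Set all matched cells to 1 in a single vectorised assignment
m[pos] <- 1

# Write the filled matrix back into the data frame
ind[ind_cols] <- m
ind
```

The result should match the desired table in the question. Unlike the table() approach, this leaves ind's own Index column and column order untouched, and it would just set a cell to 1 again if the same pair appeared twice in vals instead of counting it.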
Colonel Beauvel answered Oct 26 '22 23:10
