Create categorical variable from mutually exclusive dummy variables [duplicate]

Question

How can I create a categorical variable from mutually exclusive dummy variables (taking values 0/1)?

Basically, I am looking for the exact opposite of this solution: https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781787124479/1/01lvl1sec22/creating-dummies-for-categorical-variables

I would appreciate a base R solution.

For example, I have the following data:

dummy.df <- structure(c(1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 
                        0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 
                        0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L), 
            .Dim = c(10L, 4L), 
            .Dimnames = list(NULL, c("State.NJ", "State.NY", "State.TX", "State.VA")))

          State.NJ State.NY State.TX State.VA
     [1,]        1        0        0        0
     [2,]        0        1        0        0
     [3,]        1        0        0        0
     [4,]        0        0        0        1
     [5,]        0        1        0        0
     [6,]        0        0        1        0
     [7,]        1        0        0        0
     [8,]        0        0        0        1
     [9,]        0        0        1        0
    [10,]        0        0        0        1

I would like to get the following result:

   state
1     NJ
2     NY
3     NJ
4     VA
5     NY
6     TX
7     NJ
8     VA
9     TX
10    VA

cat.var <- structure(list(state = structure(c(1L, 2L, 1L, 4L, 2L, 3L, 1L, 
4L, 3L, 4L), .Label = c("NJ", "NY", "TX", "VA"), class = "factor")), 
                    class = "data.frame", row.names = c(NA, -10L))

ulfelder · Accepted Answer

# toy data
df <- data.frame(a = c(1,0,0,0,0), b = c(0,1,0,1,0), c = c(0,0,1,0,1))

df$cat <- apply(df, 1, function(i) names(df)[which(i == 1)])

Result:

> df
  a b c cat
1 1 0 0   a
2 0 1 0   b
3 0 0 1   c
4 0 1 0   b
5 0 0 1   c

To generalize, you'll need to adjust the df and names(df) parts to match your own data, but you get the idea. One option is to wrap the logic in a function, e.g.,

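catmaker <- function(data, varnames, catname) {
  # For each row, pick the name of the dummy column that equals 1.
  # drop = FALSE keeps a data frame even if only one column is selected.
  data[, catname] <- apply(data[, varnames, drop = FALSE], 1,
                           function(i) varnames[which(i == 1)])
  return(data)
}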

newdf <- catmaker(data = df, varnames = c("a", "b", "c"), catname = "newcat")

One nice aspect of the functional approach is that it is robust to variations in the order of names in the vector of column names you feed into it. I.e., varnames = c("c", "a", "b") produces the same result as varnames = c("a", "b", "c").
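
The order-independence claim can be checked directly. The following is a minimal sketch that reuses the toy data and the catmaker function from this answer; the names res_abc and res_cab are introduced here only for illustration.

```r
# Toy data from the answer
df <- data.frame(a = c(1,0,0,0,0), b = c(0,1,0,1,0), c = c(0,0,1,0,1))

# catmaker as defined in the answer
catmaker <- function(data, varnames, catname) {
  data[, catname] <- apply(data[, varnames], 1, function(i) varnames[which(i == 1)])
  return(data)
}

# Same columns, supplied in two different orders (illustrative names)
res_abc <- catmaker(data = df, varnames = c("a", "b", "c"), catname = "newcat")
res_cab <- catmaker(data = df, varnames = c("c", "a", "b"), catname = "newcat")

# Compare the resulting category columns
identical(res_abc$newcat, res_cab$newcat)
```

The comparison works because the columns are selected by name and the same vector of names is then indexed by position, so the two always stay aligned.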

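P.S. You added some example data after I posted this. The function works on your example as long as you convert dummy.df to a data frame first, e.g., catmaker(data = as.data.frame(dummy.df), varnames = colnames(dummy.df), catname = "State"). Note that the new column will hold the full column names ("State.NJ", "State.NY", ...), so strip the prefix if you want "NJ", "NY", etc.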
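
As an alternative to the apply-based function, here is a hedged base R sketch that works on the matrix from the question without converting it to a data frame. It swaps in max.col() to find the position of the 1 in each row, which is a different technique from the one in this answer, and it assumes every row contains exactly one 1 (the check below stops otherwise). The names idx and labels are illustrative.

```r
# Example data from the question (a 10 x 4 integer matrix)
dummy.df <- structure(c(1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L,
                        0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L,
                        0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L),
            .Dim = c(10L, 4L),
            .Dimnames = list(NULL, c("State.NJ", "State.NY", "State.TX", "State.VA")))

# Assumption: the dummies are mutually exclusive and exhaustive,
# i.e. each row has exactly one 1. Stop early if that does not hold.
stopifnot(all(rowSums(dummy.df == 1) == 1))

# Column index of the (single) maximum in each row
idx <- max.col(dummy.df, ties.method = "first")

# Strip the "State." prefix to get the category labels
labels <- sub("^State\\.", "", colnames(dummy.df))

# Build the factor column in the same shape as the desired output
cat.var <- data.frame(state = factor(labels[idx], levels = labels))
```

Setting ties.method = "first" makes the result deterministic; the default for max.col() breaks ties at random, which would matter only if the exclusivity check were removed.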

Tags:

dataframe

r

categorical-data

dummy-variable

ECII
