I would like to convert my data frame into a matrix that expands a single factor column into multiple columns and assigns a 1/0 depending on the factor level. For example:
C1 C2 C3
A 3 5
B 3 4
A 1 1
should turn into something like:
C1_A C1_B C2 C3
1 0 3 5
0 1 3 4
1 0 1 1
How can I do this in R? I tried data.matrix and as.matrix, neither of which returned what I wanted. They assign an "integer" value to the single factor column; there is no expansion.
Assuming dat is your data frame:
cbind(dat, model.matrix( ~ 0 + C1, dat))
C1 C2 C3 C1A C1B
1 A 3 5 1 0
2 B 3 4 0 1
3 A 1 1 1 0
This solution works with any number of factor levels and without manually specifying column names.
If you want to exclude the column C1, you could use this command:
cbind(dat[-1], model.matrix( ~ 0 + C1, dat))
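Note that model.matrix names the new columns C1A and C1B, not C1_A and C1_B as in the question. A minimal sketch of one way to get the requested names and column order, assuming the example data from the question (the names dummies and result are made up for illustration):

```r
# Example data from the question
dat <- data.frame(C1 = factor(c("A", "B", "A")),
                  C2 = c(3, 3, 1),
                  C3 = c(5, 4, 1))

# "~ 0 + C1" drops the intercept, so every level gets its own 0/1 column
dummies <- model.matrix(~ 0 + C1, dat)

# Insert an underscore after the variable name: C1A -> C1_A, C1B -> C1_B
colnames(dummies) <- sub("^C1", "C1_", colnames(dummies))

# Indicator columns first, then the remaining columns (C1 itself dropped)
result <- cbind(dummies, dat[-1])
print(result)
```

Since every remaining column is numeric, as.matrix(result) should then give a numeric matrix rather than a character one.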
Let's call your data.frame df:
library(reshape2)
dcast(df, C2 + C3 ~ C1, fun.aggregate = length, fill = 0)
C2 C3 A B
1 1 1 1 0
2 3 4 0 1
3 3 5 1 0
dat <- read.table(text = 'C1 C2 C3
A 3 5
B 3 4
A 1 1', header = TRUE)
Using transform:
transform(dat,C1_A =ifelse(C1=='A',1,0),C1_B =ifelse(C1=='B',1,0))[,-1]
C2 C3 C1_A C1_B
1 3 5 1 0
2 3 4 0 1
3 1 1 1 0
Or, for more flexibility, with within:
within(dat,{
C1_A =ifelse(C1=='A',1,0)
C1_B =ifelse(C1=='B',1,0)})
C1 C2 C3 C1_B C1_A
1 A 3 5 0 1
2 B 3 4 1 0
3 A 1 1 0 1
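Both variants above hard-code the levels A and B. As a rough sketch, the same comparison can be applied to every level at once in base R, so no level has to be typed by hand (lv, ind and out are illustrative names, not part of the original answer):

```r
dat <- read.table(text = "C1 C2 C3
A 3 5
B 3 4
A 1 1", header = TRUE)

# Distinct levels of C1; factor() makes this work whether C1 was read
# as a character column or as a factor
lv <- levels(factor(dat$C1))

# Compare every row with every level: a rows x levels logical matrix,
# turned into 0/1 integers by multiplying with 1L
ind <- outer(as.character(dat$C1), lv, "==") * 1L
colnames(ind) <- paste0("C1_", lv)

# Drop C1 and append the indicator columns
out <- cbind(dat[-1], ind)
print(out)
```

This does the same job as the ifelse calls, only looped over whatever levels happen to be present.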