Suppose I have a factor (in a data.frame) which represents years:
year
1 2012
2 2012
3 2012
4 2013
5 2013
6 2013
7 2014
8 2014
9 2014
I would like to create (in this case) three new columns in the data.frame and end up with:
y2012 y2013 y2014
1 1 0 0
2 1 0 0
3 1 0 0
4 0 1 0
5 0 1 0
6 0 1 0
7 0 0 1
8 0 0 1
9 0 0 1
I can of course write a bunch of ifelse statements, but that seems very cumbersome.
We can use mtabulate from qdapTools:
library(qdapTools)
mtabulate(df1$year)
# 2012 2013 2014
#1 1 0 0
#2 1 0 0
#3 1 0 0
#4 0 1 0
#5 0 1 0
#6 0 1 0
#7 0 0 1
#8 0 0 1
#9 0 0 1
Or we can use some options in base R.
model.matrix. We convert the 'year' column to factor class and use that in model.matrix to get the binary columns.
model.matrix(~0+factor(year), df1)
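To get the column names from the question (y2012, y2013, y2014) and attach the result to the data.frame, something along these lines should work. This is only a minimal sketch, assuming df1 has a single factor column 'year' as in the question; the name `dummies` is just for illustration.

```r
# Example data as in the question (assumed: one factor column 'year')
df1 <- data.frame(year = factor(rep(2012:2014, each = 3)))

# One indicator column per factor level; ~ 0 drops the intercept
dummies <- model.matrix(~ 0 + year, data = df1)

# Rename the columns to y2012, y2013, y2014
colnames(dummies) <- paste0("y", levels(df1$year))

# Bind the indicator columns onto the original data.frame
df1 <- cbind(df1, dummies)
```

Note that model.matrix returns numeric (double) columns, so convert them with as.integer if integer 0/1 values are needed.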
table. We can get the expected output using table on the sequence of rows of df1 and the column 'year'.
table(1:nrow(df1), df1$year)
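The table call returns a 'table' object rather than a data.frame. If plain columns are wanted, one possible way is to convert it with as.data.frame.matrix. Again a sketch, assuming df1 holds only the factor column 'year'; the names `tab` and `dummies` are made up for the example.

```r
# Example data as in the question (assumed: one factor column 'year')
df1 <- data.frame(year = factor(rep(2012:2014, each = 3)))

# Cross-tabulate row index against year: one row per observation
tab <- table(seq_len(nrow(df1)), df1$year)

# Convert the contingency table into a wide data.frame of counts
dummies <- as.data.frame.matrix(tab)

# Rename the columns to y2012, y2013, y2014
names(dummies) <- paste0("y", levels(df1$year))
```

Using seq_len(nrow(df1)) instead of 1:nrow(df1) also behaves sensibly when the data.frame has zero rows.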
Also maybe
library(dplyr)
library(tidyr)
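df %>%
  # spread() needs a column that uniquely identifies each row,
  # otherwise rows sharing the same year count as duplicates
  mutate(row = row_number(), id = 1L) %>%
  spread(year, id, fill = 0L)
# row 2012 2013 2014
# 1 1 1 0 0
# 2 2 1 0 0
# 3 3 1 0 0
# 4 4 0 1 0
# 5 5 0 1 0
# 6 6 0 1 0
# 7 7 0 0 1
# 8 8 0 0 1
# 9 9 0 0 1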
Maybe this too (as I can't think of a better way)
library(data.table)
# indx identifies each row; indx2 is the value that fills the cells
dcast(setDT(df)[, `:=`(indx = .I, indx2 = 1L)], indx ~ year, value.var = "indx2", fill = 0L)
# indx 2012 2013 2014
# 1: 1 1 0 0
# 2: 2 1 0 0
# 3: 3 1 0 0
# 4: 4 0 1 0
# 5: 5 0 1 0
# 6: 6 0 1 0
# 7: 7 0 0 1
# 8: 8 0 0 1
# 9: 9 0 0 1