I want to re-fill a data frame according to matches between its column names and the information given in another column.
Here is a hypothetical dataframe:
> mat.data = data.frame(A = c(rep(1,2),rep(0,2)), B = c(0,rep(1,2),0) ,
+ C = rep(0,4), D = c(rep(0,3),1), cat = c(rep("A",2),"C","B"))
> mat.data
A B C D cat
1 0 0 0 A
1 1 0 0 A
0 1 0 0 C
0 0 0 1 B
I managed to extract the matching positions with the match function (e.g. match(mat.data[,5], colnames(mat.data[1:4]))). However, I couldn't get the output I wanted in a reasonable amount of time.
I want to re-fill the 0-1 values based on a true match between the column names of the data and the 5th column (so when the 5th column is A for a given row, I want "1" under the column named "A" and "0" under the others).
To illustrate, the desired output is:
> mat.data
A B C D cat
1 0 0 0 A
1 0 0 0 A
0 0 1 0 C
0 1 0 0 B
Any suggestions to make it clean and less complicated would be great.
One possible approach would be to recreate the matrix using model.matrix, but first ensure that the cat variable has levels corresponding to the column names of the original matrix:
mat.data$cat <- factor(mat.data$cat, levels = head(names(mat.data), -1))
new.mat <- data.frame(model.matrix( ~ mat.data$cat - 1))
names(new.mat) <- levels(mat.data$cat)
new.mat
A B C D
1 1 0 0 0
2 1 0 0 0
3 0 0 1 0
4 0 1 0 0
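A note on why the levels are set first: model.matrix produces one indicator column per factor level, so a column name that never occurs in cat (D in this example) only survives if it is declared as a level. A minimal sketch of the difference, where cat_raw, f and m are illustrative names that are not part of the answer above:

```r
# Same category values as the cat column in the question
cat_raw <- c("A", "A", "C", "B")

# Levels inferred from the data: only A, B and C get a column
m_default <- model.matrix(~ factor(cat_raw) - 1)
ncol(m_default)

# Levels declared explicitly: D is kept as an all-zero column
f <- factor(cat_raw, levels = c("A", "B", "C", "D"))
m <- model.matrix(~ f - 1)
colnames(m)
```

The column names carry the variable name as a prefix, which is why the answer overwrites them with levels(mat.data$cat).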
Another option uses data.table::dcast:
library(data.table)
setDT(mat.data)
mat.data[, cat := factor(cat, levels = names(mat.data)[1:4])]
res <- dcast(mat.data, cat + seq_along(cat) ~ cat, fun.aggregate = length, fill = 0, drop = c(TRUE, FALSE))
res[, cat_1 := NULL]
# > res
# cat A B C D
# 1: A 1 0 0 0
# 2: A 1 0 0 0
# 3: B 0 1 0 0
# 4: C 0 0 1 0
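The match call from the question can also be used directly, which keeps the original row order. This is only a base R sketch, assuming every value of cat names one of the indicator columns; cols, idx and m are illustrative names, not part of either answer:

```r
# Recreate the example data from the question
mat.data <- data.frame(A = c(1, 1, 0, 0), B = c(0, 1, 1, 0),
                       C = rep(0, 4), D = c(0, 0, 0, 1),
                       cat = c("A", "A", "C", "B"))

# Indicator columns are everything except cat
cols <- setdiff(names(mat.data), "cat")

# Column position that each row's cat value points to
idx <- match(mat.data$cat, cols)

# Start from all zeros, then set one cell per row via (row, column) pairs
m <- matrix(0, nrow = nrow(mat.data), ncol = length(cols),
            dimnames = list(NULL, cols))
m[cbind(seq_len(nrow(m)), idx)] <- 1

# Overwrite the indicator columns, leaving cat untouched
mat.data[cols] <- as.data.frame(m)
mat.data
```

Indexing a matrix with a two-column matrix of (row, column) pairs sets exactly one cell per row, so no loop over rows is needed.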