I need to reorganize data from a CSV file that contains mostly repeating data. I have imported the data into R as a data frame, but I am having trouble reshaping it. The data looks like this:
ID Language Author Keyword
12 eng Rob COLOR=Red
12 eng Rob SIZE=Large
12 eng Rob DD=1
15 eng John COLOR=Red
15 eng John SIZE=Medium
15 eng John DD=2
What I need to do is transform this into one row per ID, with each keyword in a separate column:
ID Language Author COLOR SIZE DD
12 eng Rob Red Large 1
Any ideas?
Using the reshape2 package this is straightforward. With tt defined as in Gary's answer:
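library("reshape2")
# Split each "NAME=value" keyword into separate Name and Value columns
tt <- cbind(tt, colsplit(tt$Keyword, "=", c("Name", "Value")))
# Cast to wide format: one row per ID/Language/Author, one column per Name
tt_new <- dcast(tt, ID + Language + Author ~ Name, value.var="Value")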
which gives
> tt_new
  ID Language Author COLOR DD   SIZE
1 12      eng    Rob   Red  1  Large
2 15      eng   John   Red  2 Medium
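If you would rather not depend on reshape2, something similar should be possible in base R with reshape(). This is only a sketch: tt is rebuilt here from the sample rows in the question (an assumption, since Gary's definition is not shown), and tt_wide is just an illustrative name.

```r
# Sample data rebuilt from the question (assumed to match Gary's tt)
tt <- data.frame(
  ID       = c(12, 12, 12, 15, 15, 15),
  Language = "eng",
  Author   = c("Rob", "Rob", "Rob", "John", "John", "John"),
  Keyword  = c("COLOR=Red", "SIZE=Large", "DD=1",
               "COLOR=Red", "SIZE=Medium", "DD=2"),
  stringsAsFactors = FALSE
)

# Split "NAME=value" at the first "=" into a name part and a value part
tt$Name  <- sub("=.*$", "", tt$Keyword)
tt$Value <- sub("^[^=]*=", "", tt$Keyword)

# Long to wide: one row per ID/Language/Author, one column per keyword name
tt_wide <- reshape(
  tt[, c("ID", "Language", "Author", "Name", "Value")],
  idvar     = c("ID", "Language", "Author"),
  timevar   = "Name",
  direction = "wide"
)

# reshape() names the new columns "Value.<Name>"; drop that prefix
names(tt_wide) <- sub("^Value\\.", "", names(tt_wide))
rownames(tt_wide) <- NULL
tt_wide
```

Note that the values come out as character strings here (DD included), so convert any numeric columns afterwards if you need them as numbers.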