Suppose I have a data frame like this:
v1 v2 v3
a 1 a
a 2 b
a 6 c
b 3 a
b 4 b
b 5 c
Here v1 is a factor and v3 is a character vector. I want to apply some function to the data frame such that v2 is split across the levels of v1, with each piece added to the data frame as a new column:
v1 v2 v3 v4 v5
a 1 a 1 NA
a 2 b 2 NA
a 6 c 6 NA
b 3 a NA 3
b 4 b NA 4
b 5 c NA 5
The solutions I have been able to work out are very convoluted. Is there an elegant way of doing this?
(Note: v3 exists because any solution needs to be able to deal with the existence of other non-numeric vectors in the data frame that should be ignored.)
1) transform / ifelse A simple approach, if v1 has a small, known number of values, is to generate each new column manually:
transform(DF, a = ifelse(v1 == "a", v2, NA),
b = ifelse(v1 == "b", v2, NA))
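The code above refers to a data frame DF that is never defined in the thread. As a minimal sketch, assuming DF matches the table in the question (the construction below is my reconstruction, not the asker's code), the call can be run end to end like this:

```r
# Assumed reconstruction of the question's data: v1 is a factor,
# v3 is kept as character via stringsAsFactors = FALSE.
DF <- data.frame(
  v1 = factor(c("a", "a", "a", "b", "b", "b")),
  v2 = c(1, 2, 6, 3, 4, 5),
  v3 = c("a", "b", "c", "a", "b", "c"),
  stringsAsFactors = FALSE
)

# One new column per value of v1; rows belonging to the other value get NA.
out <- transform(DF,
                 a = ifelse(v1 == "a", v2, NA),
                 b = ifelse(v1 == "b", v2, NA))
print(out)
```

Note that the new columns are named a and b after the values of v1 rather than v4 and v5 as in the question; they can be renamed afterwards with names() if that matters.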
2) tapply A more general approach would be:
cbind(DF, tapply(DF$v2, list(1:nrow(DF), DF$v1), identity))
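To see what the tapply call produces, here is a small self-contained sketch. It assumes DF is built to match the question's table (my reconstruction), and the renaming step to v4 and v5 is an addition of mine to match the desired output, not part of the original one-liner:

```r
# Assumed reconstruction of the question's data.
DF <- data.frame(
  v1 = factor(c("a", "a", "a", "b", "b", "b")),
  v2 = c(1, 2, 6, 3, 4, 5),
  v3 = c("a", "b", "c", "a", "b", "c"),
  stringsAsFactors = FALSE
)

# Cross the row index with v1: every (row, level) cell holds at most one
# value of v2, so identity() returns it unchanged and empty cells become NA.
wide <- tapply(DF$v2, list(seq_len(nrow(DF)), DF$v1), identity)

# Bind the resulting matrix (one column per level of v1) onto DF.
out <- cbind(DF, wide)

# Optional: rename the new columns to continue the v1, v2, v3 numbering.
names(out)[-seq_len(ncol(DF))] <- paste0("v", ncol(DF) + seq_len(ncol(wide)))
print(out)
```

Because the columns come from the levels of v1, this works for any number of levels without listing them by hand.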
The solutions above do not require any addon packages.
3) data.table. This solution assumes that v1 is a factor and that the rows of DF are unique (as is the case in the question):
# devtools::install_github("Rdatatable/datatable") # 1.9.3
library(data.table)
DT <- data.table(DF)
DT[, split(v2, v1), by = DT]
If the rows of DT might not be unique then (based on discussion with Arun) this would work:
DT[, c(.SD, split(v2, v1)), by = 1:nrow(DT)][, -1, with = FALSE]