I have a CSV file that looks like this:
Deamon,Host,1:2:4,aaa.03
Pixe,Paradigm,1:3:5,11.us
I need to read this into a data frame for analysis, but the third column of my data is separated by : and needs to be read as a set or list, i.e. split on : so that it returns (1,2,4). Is it possible to have a column of class list in R? Or how do you think I can best approach this problem?
You can use strsplit to split a character vector into a list of components:
x <- c("1:2:4", "1:3:5")
strsplit(x, split=":")
[[1]]
[1] "1" "2" "4"
[[2]]
[1] "1" "3" "5"
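To address the list-column part of the question directly: a data frame column can itself be a list, so the result of strsplit can be stored next to the other columns. The following is a minimal sketch, assuming the two sample rows from the question; the column name parts is made up for illustration, and the conversion to integer assumes every piece is a whole number.

```r
# Read the sample rows from the question (inline text stands in for the file)
dat <- read.csv(
  text = "Deamon,Host,1:2:4,aaa.03\nPixe,Paradigm,1:3:5,11.us",
  header = FALSE, stringsAsFactors = FALSE
)

# Split the third column on ":" and convert each piece to integer.
# 'parts' is a hypothetical column name; each cell holds an integer vector.
dat$parts <- lapply(strsplit(dat$V3, ":", fixed = TRUE), as.integer)

# Each row can now be queried like a set
dat$parts[[1]]          # the pieces for the first row
3L %in% dat$parts[[2]]  # membership test for the second row
```

A list column keeps rows of different lengths without padding, but many modelling and summary functions expect atomic columns, so it may still be worth splitting into separate columns for analysis.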
As noted above, the answer depends on whether the number of separators in the column is consistent. It is more straightforward if that number is consistent. Here's one way to do that, building on Andrie's strsplit answer:
dat <- read.csv("yourData.csv", header=FALSE, stringsAsFactors = FALSE)
#If always going to be a consistent number of separators
dat <- cbind(dat, do.call("rbind", strsplit(dat[, 3], ":")))
      V1       V2      V3     V4 1  2  3
1 Deamon     Host 1:02:04 aaa.03 1 02 04
2   Pixe Paradigm 1:03:05  11.us 1 03 05
Note that the above is essentially how colsplit.character from package reshape is implemented, and that function may be a better option for you as it forces you to give proper names.
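If you would rather stay in base R, the new columns can be named by hand before binding them on. This is only a sketch under the same assumption of a consistent number of separators; the names a, b and c are placeholders, and the integer conversion assumes numeric pieces.

```r
# Sample rows from the question (inline text stands in for the file)
dat <- read.csv(
  text = "Deamon,Host,1:2:4,aaa.03\nPixe,Paradigm,1:3:5,11.us",
  header = FALSE, stringsAsFactors = FALSE
)

# One matrix row per input row; requires the same number of pieces in each row
parts <- do.call(rbind, strsplit(dat$V3, ":", fixed = TRUE))
parts <- as.data.frame(parts, stringsAsFactors = FALSE)

# Placeholder names; replace with names that mean something in your data
names(parts) <- c("a", "b", "c")

# Convert the character pieces to integers, keeping the data frame structure
parts[] <- lapply(parts, as.integer)

dat <- cbind(dat, parts)
```

Naming the columns up front avoids the numeric headers 1, 2, 3 that cbind produces from an unnamed matrix.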
If the number of separators varies, then rbind.fill from package plyr is an option. rbind.fill expects data.frames, which was a bit annoying, and I couldn't figure out how to get a one-row data.frame without first converting to a matrix, so I imagine this can be made more efficient, but here's the basic idea:
library(plyr)
x <- c("1:2:4", "1:3:5:6:7")
rbind.fill(
lapply(
lapply(strsplit(x, ":"), matrix, nrow = 1)
, as.data.frame)
)
  V1 V2 V3   V4   V5
1  1  2  4 <NA> <NA>
2  1  3  5    6    7
The result can then be bound onto the original data with cbind, as shown above.
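For the ragged case, a base R alternative that avoids the one-row data.frame detour is to pad every split vector with NA up to the longest length and then stack the results. This is a sketch rather than a drop-in replacement for rbind.fill; it assumes the pieces are whole numbers, and the helper names n and padded are made up for illustration.

```r
x <- c("1:2:4", "1:3:5:6:7")
parts <- strsplit(x, ":", fixed = TRUE)

# Length of the longest split, used as the common width
n <- max(lengths(parts))

# Extending a vector with `length<-` fills the new positions with NA
padded <- t(vapply(
  parts,
  function(p) as.integer(`length<-`(p, n)),
  integer(n)
))

# One row per input string, one column per position (V1, V2, ...)
padded <- as.data.frame(padded)
```

Because the pieces are converted to integer here, the missing positions come out as NA rather than the <NA> strings shown in the plyr output.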