
Sets in R DataFrame

Tags: r

I have a CSV file that looks like this:

 Deamon,Host,1:2:4,aaa.03
 Pixe,Paradigm,1:3:5,11.us

I need to read this into a data frame for analysis, but the third column in my data is separated by : and needs to be read as a set or list, i.e. split on : so that it returns (1,2,4). Is it possible to have a column of class list in R? If not, how do you think I can best approach this problem?

damola asked Jan 18 '23 23:01


2 Answers

You can use strsplit to split a character vector into a list of components:

x <- c("1:2:4", "1:3:5")
strsplit(x, split=":")
[[1]]
[1] "1" "2" "4"

[[2]]
[1] "1" "3" "5"
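
On the list-column part of the question: a data frame column can itself be a list, so the result of strsplit can be stored directly. The following is a minimal sketch, assuming the two sample rows from the question are supplied as inline text instead of a file; the column name `V3_set` is made up for illustration.

```r
# Read the two sample rows from the question; inline text stands in for the file
dat <- read.csv(text = "Deamon,Host,1:2:4,aaa.03
Pixe,Paradigm,1:3:5,11.us",
                header = FALSE, stringsAsFactors = FALSE)

# Split the third column on ":" and convert each piece to integer.
# Assigning a list with one element per row creates a list column.
dat$V3_set <- lapply(strsplit(dat$V3, ":", fixed = TRUE), as.integer)

class(dat$V3_set)   # "list"
dat$V3_set[[1]]     # 1 2 4
```

Each cell of `V3_set` then holds an integer vector, so set operations such as `intersect(dat$V3_set[[1]], dat$V3_set[[2]])` work on individual rows.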
Andrie answered Jan 21 '23 13:01


The answer depends on whether the number of separators in the column is consistent. It is more straightforward if that number is consistent. Here's one way to do that, building on Andrie's strsplit answer:

dat <- read.csv("yourData.csv", header=FALSE, stringsAsFactors = FALSE)

# If the number of separators is always consistent
dat <- cbind(dat, do.call("rbind", strsplit(dat[, 3], ":")))

       V1       V2      V3     V4 1  2  3
1  Deamon     Host 1:02:04 aaa.03 1 02 04
2    Pixe Paradigm 1:03:05  11.us 1 03 05

Note that the above is essentially how colsplit.character from package reshape is implemented. That function may be a better option for you, as it forces you to give the new columns proper names.
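
If you would rather stay in base R, the same split can be given explicit names by hand. This is only a sketch under the assumption that every row has the same number of pieces; the names `part1`, `part2`, `part3` are invented for the example, and the sample rows are read from inline text.

```r
# Sample rows from the question, read from inline text instead of a file
dat <- read.csv(text = "Deamon,Host,1:2:4,aaa.03
Pixe,Paradigm,1:3:5,11.us",
                header = FALSE, stringsAsFactors = FALSE)

# One matrix row per input row, one column per piece
# (this relies on every row having the same number of pieces)
parts <- do.call(rbind, strsplit(dat$V3, ":", fixed = TRUE))

# Hypothetical column names, chosen only for this example
colnames(parts) <- paste0("part", seq_len(ncol(parts)))

# Convert to a data frame and turn the character pieces into integers
parts <- as.data.frame(parts, stringsAsFactors = FALSE)
parts[] <- lapply(parts, as.integer)

dat <- cbind(dat, parts)
names(dat)
```

Naming the columns before the cbind avoids the bare `1 2 3` headers seen in the output above.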

If the number of separators varies, then rbind.fill from package plyr is an option. rbind.fill expects data.frames, which was a bit annoying, and I couldn't figure out how to get a one-row data.frame without first converting to a matrix, so I imagine this can be made more efficient, but here's the basic idea:

library(plyr)
x <- c("1:2:4", "1:3:5:6:7")
rbind.fill(
  lapply(
    lapply(strsplit(x, ":"), matrix, nrow = 1)
  , as.data.frame)
)

  V1 V2 V3   V4   V5
1  1  2  4 <NA> <NA>
2  1  3  5    6    7

The result can then be added to the original data with cbind, as shown above.
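
For comparison, a base R variant without plyr is possible by padding each split vector with NA up to the longest length. This is a sketch of a swapped-in technique, not what rbind.fill does internally; `pieces`, `padded` and `res` are names made up for the example.

```r
x <- c("1:2:4", "1:3:5:6:7")
pieces <- strsplit(x, ":", fixed = TRUE)

# The longest row determines the number of columns
n <- max(lengths(pieces))

# Extending a vector with `length<-` pads it with NA
padded <- lapply(pieces, function(p) {
  length(p) <- n
  p
})

# Stack the padded vectors into a matrix, then convert to a data frame
res <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
res
```

The columns stay character here, as in the rbind.fill output; apply `as.integer` to each column if numeric values are needed.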

Chase answered Jan 21 '23 12:01