Data with two delimiters in R

Tags: dataframe, r

I have a text file that uses more than one delimiter. This is a sample of the data:

12 ->3 4 5
14->2 1
1->3 5 6

Is there a simple way to get the data into the following format?

12 3
12 4
12 5
14 2
14 1
 1 3
 1 5
 1 6
Camilla asked Dec 14 '22 09:12



1 Answer

I tried to reproduce your situation using cat, and I hope it matches what you actually have. Let's say this is your file:

# Write the sample to test.txt, one record per line (no stray indentation)
cat("12 ->3 4 5", "14->2 1", "1->3 5 6",
    sep = "\n",
    file = "test.txt")

Using data.table, I read the file with a separator that never occurs in the data (a comma here), so each line lands whole in a single column, V1:

library(data.table)
dt <- fread("test.txt", 
            sep = ",", 
            header = FALSE)

The next step is a double split: first split each line on the arrow (->) into the number on the left and the values on the right, then split the values on whitespace, grouped by the left-hand number:

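# 1) tstrsplit() splits each line on "->" into V1 (left side) and V2 (right side)
# 2) strsplit() then splits V2 on whitespace within each V1 group
dt[, tstrsplit(V1, "\\s*->\\s*", type.convert = TRUE)
   ][, strsplit(V2, "\\s+"), by = .(indx = V1)]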
#    indx V1
# 1:   12  3
# 2:   12  4
# 3:   12  5
# 4:   14  2
# 5:   14  1
# 6:    1  3
# 7:    1  5
# 8:    1  6
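
If you'd rather not depend on data.table, the same double split can be sketched in base R. This is only a minimal sketch, not the approach used above: the names lines, parts, keys, vals and result are made up for illustration, and the sample lines are typed in directly (with a real file you would presumably start from readLines("test.txt") instead).

```r
# Sample lines as in the question (assumption: one record per line);
# for a real file, something like lines <- readLines("test.txt")
lines <- c("12 ->3 4 5", "14->2 1", "1->3 5 6")

# First split: left side of "->" is the key, right side holds the values
parts <- strsplit(trimws(lines), "\\s*->\\s*")
keys  <- vapply(parts, `[`, character(1), 1)
vals  <- strsplit(vapply(parts, `[`, character(1), 2), "\\s+")

# Second split done above; now repeat each key once per value
result <- data.frame(
  indx  = as.integer(rep(keys, lengths(vals))),
  value = as.integer(unlist(vals))
)
print(result)
```

Unlike the data.table result, both columns here are converted to integer explicitly, so drop the as.integer() calls if your values are not all whole numbers.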
David Arenburg answered Jan 01 '23 17:01
