
Extract vectors from strsplit list without using a loop

Tags:

r

Considering the following vector:

 [1] "1-1694429" "2-1546669" "3-928598"  "4-834486"  "5-802353"  "6-659439"  "7-552850"  "8-516804"  "9-364061"
[10] "10-354181" "11-335154" "12-257915" "13-251310" "14-232313" "15-217628" "16-216569"

I am trying to generate two vectors, each of them containing the values obtained by splitting each element of the vector by the delimiter "-".

I used:

f <- function(s) strsplit(s, "-")
cc<-sapply(names.reads, f)

> head(cc)
$`1-1694429`
[1] "1"       "1694429"

$`2-1546669`
[1] "2"       "1546669"

I know I can access them like:

> cc[[1]][1]
[1] "1"

> cc[[1]][2]
[1] "1694429"

I would like two vectors, one containing the values stored at cc[[i]][1] and the other those stored at cc[[i]][2]. Can I do that without using a loop? (I have over 1 million elements.)

agatha asked Jan 24 '12 23:01


3 Answers

Using mathematical.coffee's suggestion, the following code avoids both loops and sapply:

names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
              "6-659439",  "7-552850",  "8-516804", "9-364061", "10-354181",
              "11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
              "16-216569")

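cc    <- strsplit(names.reads, "-")
flat  <- unlist(cc)                            # "1" "1694429" "2" "1546669" ...
part1 <- flat[2 * seq_along(names.reads) - 1]  # odd positions: text before "-"
part2 <- flat[2 * seq_along(names.reads)]      # even positions: text after "-"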

produces

> part1
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
[16] "16"
> part2
 [1] "1694429" "1546669" "928598"  "834486"  "802353"  "659439"  "552850" 
 [8] "516804"  "364061"  "354181"  "335154"  "257915"  "251310"  "232313" 
[15] "217628"  "216569"

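though it does require every original value to contain exactly one "-"; any element that splits into more or fewer than two pieces throws the odd/even indexing out of step.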
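The format requirement can be checked before indexing. What follows is only a minimal sketch, assuming R 3.2.0 or later (for lengths()) and a shortened input; the names ok and flat are illustrative and not part of the answer above.

```r
names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486")

cc <- strsplit(names.reads, "-", fixed = TRUE)

# Each element must split into exactly two pieces; otherwise the
# odd/even indexing below would silently misalign the two vectors.
ok <- lengths(cc) == 2
stopifnot(all(ok))

flat  <- unlist(cc)
part1 <- flat[2 * seq_along(cc) - 1]  # odd positions: text before "-"
part2 <- flat[2 * seq_along(cc)]      # even positions: text after "-"
```

If the check fails, which(!ok) gives the positions of the offending elements.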

Henry answered Oct 30 '22 05:10


Using sapply() (for completeness' sake):

y <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353", "6-659439", "7-552850", "8-516804", "9-364061", "10-354181", "11-335154", "12-257915", "13-251310", "14-232313", "15-217628", "16-216569")

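As @Bird pointed out in the comments, USE.NAMES=FALSE stops sapply() from attaching the input strings as names. The result is a two-row character matrix with one column per element, so each row is one of the wanted vectors: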

x <- sapply(y, function(x) strsplit(x, "-")[[1]], USE.NAMES=FALSE)

a <- x[1,]

b <- x[2,]
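A possible variation, offered as a sketch rather than as part of the answer above: vapply() does the same job as sapply() but also declares the expected shape of each result, so an element that does not split into exactly two strings raises an error instead of changing the return type. The input here is shortened for illustration.

```r
y <- c("1-1694429", "2-1546669", "3-928598", "4-834486")

# FUN.VALUE = character(2) asserts that every split yields two strings;
# the result is a 2 x length(y) character matrix.
x <- vapply(strsplit(y, "-", fixed = TRUE),
            function(p) p,
            FUN.VALUE = character(2))

a <- x[1, ]  # parts before the "-"
b <- x[2, ]  # parts after the "-"
```

Because strsplit() is called once on the whole vector, the splitting itself stays vectorised and vapply() only assembles the matrix.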

like image 10
pedrostrusso Avatar answered Oct 30 '22 06:10

pedrostrusso


Another approach:

names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486", "5-802353",
              "6-659439",  "7-552850",  "8-516804", "9-364061", "10-354181",
              "11-335154", "12-257915", "13-251310", "14-232313", "15-217628",
              "16-216569")

library(reshape2)
colsplit(string=names.reads, pattern="-", names=c("Part1", "Part2"))

   Part1   Part2
1      1 1694429
2      2 1546669
3      3  928598
4      4  834486
5      5  802353
6      6  659439
7      7  552850
8      8  516804
9      9  364061
10    10  354181
11    11  335154
12    12  257915
13    13  251310
14    14  232313
15    15  217628
16    16  216569
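If adding a package is not an option, a similar two-column data frame can be sketched in base R with read.table(). This is an alternative technique, not what colsplit() does internally; the name parts is illustrative and the input is shortened.

```r
names.reads <- c("1-1694429", "2-1546669", "3-928598", "4-834486")

# Treat each string as one line of "-"-separated text.
parts <- read.table(text = names.reads, sep = "-",
                    col.names = c("Part1", "Part2"))

part1 <- parts$Part1
part2 <- parts$Part2
```

Note that read.table() converts both columns to integers here; pass colClasses = "character" to keep them as strings.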
MYaseen208 answered Oct 30 '22 04:10