Access n-th element after string splitting

I have a character vector that looks like:

string <- c("A,1,some text,200", "B,2,some other text,300", "A,3,yet another one,100")

Each element of the vector is itself a comma-separated list of fields. I want to extract only the field at a given position from every element, for example the field before the first comma or the field after the second comma.

The following code does what I want:

sapply(strsplit(string, ","), function(x){return(x[[1]])})
# [1] "A" "B" "A"
sapply(strsplit(string, ","), function(x){return(x[[3]])})
# [1] "some text" "some other text" "yet another one"

However, this code seems fairly complicated to me given the simplicity of the task. Are there more concise ways to achieve this?

asked Feb 05 '19 15:02 by symbolrush



1 Answer

1) data.frame: Convert the strings to a data frame; it is then easy to pick off a column or a subset of columns:

DF <- read.table(text = string, sep = ",", as.is = TRUE)

DF[[1]]
## [1] "A" "B" "A"

DF[[3]]
## [1] "some text"       "some other text" "yet another one"

DF[-1]
##   V2              V3  V4
## 1  1       some text 200
## 2  2 some other text 300
## 3  3 yet another one 100

DF[2:3]
##   V2              V3
## 1  1       some text
## 2  2 some other text
## 3  3 yet another one
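One point worth keeping in mind with (1): by default read.table converts columns that look numeric, so V2 and V4 above come back as integers rather than strings. The following is a minimal sketch, assuming you want every field kept as text; the name DF_chr is only illustrative, and the quote and comment.char settings are an extra precaution in case the text fields contain quotes or #.

```r
string <- c("A,1,some text,200", "B,2,some other text,300", "A,3,yet another one,100")

# colClasses = "character" keeps every field as text (no conversion to integer);
# quote = "" and comment.char = "" stop ', " and # from being treated specially.
DF_chr <- read.table(text = string, sep = ",", colClasses = "character",
                     quote = "", comment.char = "")

DF_chr[[3]]   # third field of every element
DF_chr[[4]]   # fourth field, still character rather than integer
```

Whether this is preferable depends on what you do next: if the numeric fields are going to be used as numbers, the default conversion is probably what you want.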

2) data.table::transpose: The data.table package has a function to transpose lists, so that if stringt is the transposed list then stringt[[3]] is the vector of third fields, in a similar way to (1). Even more compact is data.table's tstrsplit, mentioned by @Henrik, or the same package's fread, mentioned by @akrun.

library(data.table)

stringt <- transpose(strsplit(string, ","))

# or
stringt <- tstrsplit(string, ",")

stringt[[1]]
## [1] "A" "B" "A"

stringt[[3]]
## [1] "some text"       "some other text" "yet another one"

stringt[-1]
## [[1]]
## [1] "1" "2" "3"
##
## [[2]]
## [1] "some text"       "some other text" "yet another one"
##
## [[3]]
## [1] "200" "300" "100"

stringt[2:3]
## [[1]]
## [1] "1" "2" "3"
##
## [[2]]
## [1] "some text"       "some other text" "yet another one"
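If only some positions are needed, tstrsplit can return just those fields through its keep argument, which avoids subsetting afterwards. This is a small sketch rather than part of the original answer; the variable name kept is illustrative, and fixed = TRUE is an optional extra that is passed on to strsplit.

```r
library(data.table)

string <- c("A,1,some text,200", "B,2,some other text,300", "A,3,yet another one,100")

# keep = c(1L, 3L) asks tstrsplit to return only the first and third fields;
# fixed = TRUE is passed on to strsplit so "," is matched literally.
kept <- tstrsplit(string, ",", fixed = TRUE, keep = c(1L, 3L))

kept[[1]]   # first fields
kept[[2]]   # third fields (the second entry of the kept list)
```

Note that the result is indexed by position within keep, not by the original field number.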

purrr also has a transpose function but

library(purrr)
transpose(strsplit(string, ","))

produces a list of lists rather than a list of character vectors.
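If no extra package is wanted, the strsplit approach from the question can also be shortened in base R, because `[` is an ordinary function and can be passed to sapply or vapply directly. The helper name nth_field below is hypothetical and only there for illustration.

```r
string <- c("A,1,some text,200", "B,2,some other text,300", "A,3,yet another one,100")

parts <- strsplit(string, ",", fixed = TRUE)

# `[` is an ordinary function, so it can be handed to vapply together with
# the position n; character(1) states that each result is a single string.
nth_field <- function(x, n) vapply(x, `[`, character(1), n)

first <- nth_field(parts, 1)   # fields before the first comma
third <- nth_field(parts, 3)   # fields after the second comma
```

Using `[` rather than `[[` means that an element with fewer than n fields yields NA instead of an error, which may or may not be the behaviour you want.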

answered Sep 22 '22 02:09 by G. Grothendieck