Finding second space after each comma

Tags: regex, r, strsplit

This is a follow-up to this question: Concatenate previous and latter words to a word that match a condition in R

I am looking for a regex that splits a string at the second space after each comma. Look at the example below:

vector <- c("Paulsen", "Kehr,", "Diego", 
            "Schalper", "Sepúlveda,", "Alejandro",
            "Von Housen", "Kush,", "Terry")

X <- paste(vector, collapse = " ")
X

## this is the string I am looking to split:
"Paulsen Kehr, Diego Schalper Sepúlveda, Alejandro Von Housen Kush, Terry"

The second space after each comma is the split criterion, so the expected output is:

"Paulsen Kehr, Diego"
"Schalper Sepúlveda, Alejandro"
"Von Housen Kush, Terry"

I came up with a pattern but it is not quite working.

[^ ]+ [^ ]+, [^ ]+( )

Using it with strsplit removes all the words instead of splitting only at group 1 (i.e. [^ ]+ [^ ]+, [^ ]+(group-1)). I think I just need to exclude the full match and match only the space that follows it.

strsplit(X, "[^ ]+ [^ ]+, [^ ]+( )")

# [1] "" [2] "" [3] "Von Housen Kush, Terry"
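The empty strings can be explained by inspecting what the pattern consumes: strsplit treats the entire regex match as the delimiter and ignores capture groups, so every matched name is dropped along with the space. The following is a minimal sketch of that check; the names `pattern`, `matches`, and `result` are illustrative and not part of the original post.

```r
# Rebuild the input string from the question so the sketch runs on its own.
vector <- c("Paulsen", "Kehr,", "Diego",
            "Schalper", "Sepúlveda,", "Alejandro",
            "Von Housen", "Kush,", "Terry")
X <- paste(vector, collapse = " ")

# The attempted pattern (illustrative variable name).
pattern <- "[^ ]+ [^ ]+, [^ ]+( )"

# List every substring the pattern matches. strsplit() deletes each of
# these in full, because the whole match, not just group 1, is the delimiter.
matches <- regmatches(X, gregexpr(pattern, X))[[1]]
print(matches)

# Splitting therefore leaves empty strings where the first two names were.
result <- strsplit(X, pattern)[[1]]
print(result)
```

The last name survives only because no space follows "Terry", so the pattern cannot match there.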

Can anyone think of a regex for finding the second space after each comma?

asked Oct 25 '19 14:10

M--

1 Answer

You may use

> strsplit(X, ",\\s+\\S+\\K\\s+", perl=TRUE)
[[1]]
[1] "Paulsen Kehr, Diego"           "Schalper Sepúlveda, Alejandro" "Von Housen Kush, Terry"

Details

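, - a comma
\s+ - one or more whitespace characters
\S+ - one or more non-whitespace characters
\K - match reset operator: discards all text matched so far from the reported match, so only what follows is treated as the delimiter (a PCRE feature, hence perl=TRUE)
\s+ - one or more whitespace characters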
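To make the effect of \K concrete, the sketch below compares what the regex reports as its match with and without it. It rebuilds X from the question so that it runs on its own; the names `without_k`, `with_k`, `parts`, `marked`, and `alt` are illustrative. The last step shows one possible alternative that avoids \K, under the assumption that a newline never occurs in the input: capture the text to keep, put it back followed by a newline marker, then split on that marker.

```r
# Rebuild the input string from the question.
vector <- c("Paulsen", "Kehr,", "Diego",
            "Schalper", "Sepúlveda,", "Alejandro",
            "Von Housen", "Kush,", "Terry")
X <- paste(vector, collapse = " ")

# Without \K the match spans the comma, the first name, and the trailing
# whitespace, so strsplit() would delete all of that text.
without_k <- regmatches(X, gregexpr(",\\s+\\S+\\s+", X, perl = TRUE))[[1]]
print(without_k)

# With \K the reported match starts after the reset point, so only the
# whitespace following the first name is treated as the delimiter.
with_k <- regmatches(X, gregexpr(",\\s+\\S+\\K\\s+", X, perl = TRUE))[[1]]
print(with_k)

# The split from the answer.
parts <- strsplit(X, ",\\s+\\S+\\K\\s+", perl = TRUE)[[1]]
print(parts)

# Alternative without \K (assumes "\n" never appears in X):
# keep group 1, replace the whitespace after it with a newline, then split.
marked <- gsub("(,\\s+\\S+)\\s+", "\\1\n", X, perl = TRUE)
alt <- strsplit(marked, "\n", fixed = TRUE)[[1]]
print(alt)
```

Both approaches should give the same three names; the \K version is shorter, while the marker version also works with regex engines that lack \K.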

answered Oct 10 '22 04:10

Wiktor Stribiżew
