Using R to parse and return text in parentheses

Question

Let's say I have a string:

x <- "This is a string (Yay, string!)"

I'd like to parse the string and return "Yay, string!"

How do I do that?

I tried a bunch of grep/grepl/gsub/sub/etc but couldn't find the right combination of regex or arguments. Sigh. I need to work on the regex skills.

Andrie · Accepted Answer

Here are two ways of doing it:

One: Match the entire string, capture the bit you want, and replace the whole string with the captured bit (known as backreferencing):

gsub(".*\\((.*)\\).*", "\\1", x)
[1] "Yay, string!"

This works because:

You use a backreference \\1 to refer to the text matched by the capture group (.*).
Since you want to exclude the parentheses themselves from the result, you match them literally by escaping them as \\( and \\). The backslash is doubled because it sits inside an R string literal.
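To make the escaping and the backreference concrete, here is a minimal sketch of the same idea. The variable names (pattern, escaped_len, inner, unchanged) are only for illustration, and sub() is used instead of gsub() on the assumption that the pattern consumes the whole string, so one replacement is enough.

```r
x <- "This is a string (Yay, string!)"
pattern <- ".*\\((.*)\\).*"

# In R source, "\\(" is a two-character string: a backslash and a parenthesis.
escaped_len <- nchar("\\(")

# Group 1 captures the text between the literal parentheses.
inner <- sub(pattern, "\\1", x)

# Edge case: with no parentheses the pattern does not match,
# so the input comes back unchanged rather than as an empty string.
unchanged <- sub(pattern, "\\1", "No parentheses here")

print(inner)
```

Note the last case: if you need to tell "no match" apart from a real result, this approach alone won't do it.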

Two: Replace all the bits you don't want with empty strings:

gsub(".*\\(|\\).*", "", x)
[1] "Yay, string!"

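This works because | acts as an OR: the pattern matches either everything up to and including the opening parenthesis, or the closing parenthesis and everything after it, and gsub() replaces both matches with an empty string.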
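One way to see what the alternation does is to apply each side separately. This is just a sketch with made-up variable names; it also shows why gsub() rather than sub() is needed here.

```r
x <- "This is a string (Yay, string!)"

# Left-hand alternative: drop everything up to and including "(".
left_trimmed <- sub(".*\\(", "", x)

# Right-hand alternative: drop ")" and everything after it.
both_trimmed <- sub("\\).*", "", left_trimmed)

# gsub() applies both alternatives in one pass ...
one_pass <- gsub(".*\\(|\\).*", "", x)

# ... whereas sub() stops after the first match and leaves the ")" behind.
first_only <- sub(".*\\(|\\).*", "", x)

print(one_pass)
print(first_only)
```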

Josh O'Brien · Answer

Also, if some of your strings might contain several parenthesized substrings, all of which you want to extract, use the regex power-tools gregexpr() and regmatches():

x <- "This is (a) string (Yay, string!)"
pat <- "(?<=\\()([^()]*)(?=\\))"
regmatches(x, gregexpr(pat, x, perl=TRUE))
# [[1]]
# [1] "a"            "Yay, string!"
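The same call should also work on a whole character vector, returning one list element per input string. A small sketch, assuming non-nested parentheses and using an illustrative vector xs (the capture group is dropped here since only the full match is used):

```r
xs <- c("This is (a) string (Yay, string!)",
        "No parentheses",
        "One (match)")

# Lookbehind/lookahead keep the parentheses out of the match; needs perl = TRUE.
pat <- "(?<=\\()[^()]*(?=\\))"

# One list element per input string; strings without a match give character(0).
res <- regmatches(xs, gregexpr(pat, xs, perl = TRUE))

# Number of parenthesized substrings found in each string.
counts <- lengths(res)

print(res)
```

Strings with nothing in parentheses come back as character(0), so lengths(res) is a quick way to check which inputs had matches.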

Tags:

r

Brandon Bertelsen
