
R regular expression: isolate a string between quotes

Tags: regex, r, quotes

I have a string myFunction(arg1=\"hop\",arg2=TRUE). I want to isolate what is between the quotes (\"hop\" in this example).

I have tried so far with no success:

gsub(pattern="(myFunction)(\\({1}))(.*)(\\\"{1}.*\\\"{1})(.*)(\\){1})",replacement="//4",x="myFunction(arg1=\"hop\",arg2=TRUE)")

Any help by a regex guru would be welcome!

RockScience asked Apr 08 '15 07:04


2 Answers

Try

 sub('[^\"]+\"([^\"]+).*', '\\1', x)
 #[1] "hop"

Or

 sub('[^\"]+(\"[^\"]+.).*', '\\1', x)
 #[1] "\"hop\""

The \" escape is not needed here: inside a single-quoted string, a plain " works too.

 sub('[^"]*("[^"]*.).*', '\\1', x)
 #[1] "\"hop\""
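To illustrate the point about escaping, here is a small sketch (the variable names are only for illustration) checking that the escaped and unescaped spellings denote the same pattern and give the same result on the question's string:

```r
x <- "myFunction(arg1=\"hop\",arg2=TRUE)"

# Inside single quotes, \" and " produce the same character,
# so both spellings yield the identical pattern string.
same_string <- identical('[^"]+', "[^\"]+")

# Both calls capture the text after the first double quote
# up to (not including) the next one.
with_escape    <- sub('[^\"]+\"([^\"]+).*', '\\1', x)
without_escape <- sub('[^"]+"([^"]+).*', '\\1', x)

print(same_string)
print(with_escape)
print(without_escape)
```

The escape only becomes necessary when the pattern itself is written inside double quotes.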

If there are multiple matches, as @AvinashRaj mentioned in his post, sub is of limited use, because the pattern above captures only the first quoted string. An option using stringi would be

 library(stringi)
 stri_extract_all_regex(x1, '"[^"]*"')[[1]]
 #[1] "\"hop\""  "\"hop2\""
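For comparison, a quick base R sketch of why sub falls short on x1 (first_only and n_quoted are illustrative names): the capture group grabs the first quoted string and the trailing .* swallows the rest, even though the string contains two quoted values.

```r
x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"

# sub() performs a single replacement; the trailing .* consumes
# everything after the first quoted string, so "hop2" is lost.
first_only <- sub('[^"]+"([^"]+).*', '\\1', x1)

# Count how many quoted strings are actually present.
n_quoted <- lengths(regmatches(x1, gregexpr('"[^"]*"', x1)))

print(first_only)  # "hop"
print(n_quoted)    # 2
```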

data

 x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
 x1 <- "myFunction(arg1=\"hop\",arg2=TRUE arg3=\"hop2\", arg4=TRUE)"
akrun answered Sep 22 '22 02:09


You could also use the regmatches function. sub or gsub only works for a particular input; for the general case you should extract the matches rather than remove everything around them.

> x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
> regmatches(x, gregexpr('"[^"]*"', x))[[1]]
[1] "\"hop\""

To get only the text inside the quotes, pass the result of the call above to gsub, which removes the quote characters.

> x <- "myFunction(arg1=\"hop\",arg2=TRUE)"
> gsub('"', '', regmatches(x, gregexpr('"([^"]*)"', x))[[1]])
[1] "hop"
> x <- "myFunction(arg1=\"hop\",arg2=\"TRUE\")"
> gsub('"', '', regmatches(x, gregexpr('"([^"]*)"', x))[[1]])
[1] "hop"  "TRUE"
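If you need this for several strings at once, the two steps can be wrapped in a small helper. This is only a sketch: extract_quoted is a hypothetical name, not a base R function, and it assumes the quoted text never contains an escaped double quote.

```r
# Hypothetical helper: for each input string, return the text found
# between each pair of double quotes (character(0) if there is none).
extract_quoted <- function(s) {
  # One element per input string, each holding all "..." matches.
  matches <- regmatches(s, gregexpr('"[^"]*"', s))
  # Strip the surrounding quote characters from every match.
  lapply(matches, function(m) gsub('"', '', m, fixed = TRUE))
}

inputs <- c(
  "myFunction(arg1=\"hop\",arg2=TRUE)",
  "myFunction(arg1=\"hop\",arg2=\"TRUE\")",
  "noQuotes(arg1=1)"
)
result <- extract_quoted(inputs)
print(result)
```

Returning a list keeps one entry per input, so strings without any quoted text simply give an empty character vector instead of an error.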
Avinash Raj answered Sep 25 '22 02:09