 

Forming and using regular expressions in R

Tags:

regex

r

So, I am new to R and I was learning how to form regular expressions,

e.g. something like this: "(\\2.\\3)". What are these? I mean, what do these numbers and this notation represent? Can anyone explain in plain layman's terms what this means? Or something like this, (\2.\4)(\2.\4): what does it mean? Thanks for any help!

chan chong asked Oct 21 '14 19:10



1 Answer

They are called backreferences, which recall what was matched by a capturing group. A capturing group is created by placing the characters to be grouped inside a set of parentheses ( ). Groups are numbered from left to right by their opening parenthesis. A backreference is written as a backslash (\) followed by a digit indicating the number of the group to be recalled. In an R string literal the backslash itself has to be escaped, so you type two backslashes (\\) followed by the digit.
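To make the group numbering concrete, here is a minimal sketch that swaps two words. The input string and the variable names are made up for illustration and are not from the question.

```r
# Hypothetical input, chosen only for illustration
x <- "hello world"

# Group 1 captures the first word, group 2 captures the second;
# the replacement recalls them in reverse order
swapped <- sub("(\\S+) (\\S+)", "\\2 \\1", x)
print(swapped)

# The R string "\\1" holds only two characters: a backslash and the digit 1
print(nchar("\\1"))
```

The second call is there to show that the doubled backslash is only a matter of how R string literals are typed; the regex engine sees a single backslash followed by the digit.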

Below is an example of a replacement that uses backreferences to recall what was matched by capturing groups #2 and #3:

x <- 'foo bar baz quz'
sub('(\\S+) (\\S+) (\\S+) (\\S+)', '(\\2.\\3)', x)
# [1] "(bar.baz)"

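Note: The opening and closing parentheses in the replacement, along with the dot, are literal characters. They only create groups or match any character when they appear in the pattern, not in the replacement.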
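The second replacement from the question works the same way. As a sketch, assuming it is applied to the same four-word string and the same four-group pattern as in the example above, it recalls groups #2 and #4 twice:

```r
x <- 'foo bar baz quz'

# Same four-group pattern; the replacement uses groups 2 and 4, written out twice.
# The parentheses and dots in the replacement are copied through literally.
result <- sub('(\\S+) (\\S+) (\\S+) (\\S+)', '(\\2.\\4)(\\2.\\4)', x)
print(result)
# [1] "(bar.quz)(bar.quz)"
```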

hwnd answered Sep 30 '22 21:09
