
Explain this R regex

Tags:

regex

r

Recently, mrdwab answered an R question here using a regex that was pretty cool (LINK). I liked the response but can't generalize it because I don't understand what's happening (I tried changing the numeric values being supplied, but that didn't yield anything useful). Could someone break the regex down piece by piece and explain what's happening?

x <- c("WorkerId", "pio_1_1", "pio_1_2", "pio_1_3", "pio_1_4", "pio_2_1", 
"pio_2_2", "pio_2_3", "pio_2_4")

gsub("([a-z])_([0-9])_([0-9])", "\\1_\\3\\.\\2", x)  # Please explain this

Thank you in advance.

Tyler Rinker asked Dec 20 '22 23:12



1 Answer

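Anywhere the string contains a lowercase letter followed by two single digits, each separated by an underscore (e.g., a_1_2), the regex matches it, and each parenthesized part of the pattern captures its piece of the match. Note that [a-z] matches only one letter, so in pio_1_2 the match is o_1_2 and the leading "pi" is left untouched. In the replacement string, \\1, \\2, and \\3 refer back to the captured groups, numbered left to right: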

\\1 <- a
\\2 <- 1
\\3 <- 2
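
To see the groups for yourself, here is a minimal sketch using base R's regexec and regmatches on one of your strings. The variable names pattern and m are just illustrative.

```r
# Show the full match and the three capture groups for one input string.
pattern <- "([a-z])_([0-9])_([0-9])"
input <- "pio_1_2"

# regexec() records the positions of the match and of each group;
# regmatches() extracts the corresponding substrings.
m <- regmatches(input, regexec(pattern, input))[[1]]
print(m)
# m[1] is the full match ("o_1_2");
# m[2], m[3], m[4] are groups 1, 2, 3 ("o", "1", "2").
```

The element numbering is offset by one because the first element is always the whole match, so \\1 corresponds to m[2], and so on.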

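Running gsub as written finds each match and rebuilds it with the two digits swapped and the second underscore replaced by a period; the rest of the string is left alone. So, for example, a_1_2 becomes a_2.1, and in your vector pio_1_2 becomes pio_2.1, while WorkerId, which contains no match, is returned unchanged.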

"\\1_\\3\\.\\2"
#  a_  2  .  1
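
Putting it together, here is a sketch that runs the original call on your vector and then tries one possible generalization. The multi-digit input "pio_10_12" and the [0-9]+ variant are my own assumptions for illustration, not part of the original question.

```r
x <- c("WorkerId", "pio_1_1", "pio_1_2", "pio_1_3", "pio_1_4", "pio_2_1",
       "pio_2_2", "pio_2_3", "pio_2_4")

# Original call: swap the two digits and join them with a period.
res <- gsub("([a-z])_([0-9])_([0-9])", "\\1_\\3\\.\\2", x)
print(res)
# "WorkerId" "pio_1.1" "pio_2.1" "pio_3.1" "pio_4.1"
# "pio_1.2"  "pio_2.2" "pio_3.2" "pio_4.2"

# Hypothetical input with multi-digit numbers: [0-9] matches exactly one
# digit, so the original pattern finds no match and returns it unchanged.
single <- gsub("([a-z])_([0-9])_([0-9])", "\\1_\\3\\.\\2", "pio_10_12")

# Adding + lets each group capture one or more digits.
multi <- gsub("([a-z])_([0-9]+)_([0-9]+)", "\\1_\\3\\.\\2", "pio_10_12")
print(c(single, multi))
# "pio_10_12" "pio_12.10"
```

To generalize further, change what each parenthesized group matches, or reorder \\1, \\2, and \\3 in the replacement to rearrange the captured pieces.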
rjz answered Dec 24 '22 01:12
