
Search and replace with Regex with replace variables

Tags:

regex

My company uses a third party for its mobile sites, and they give us a console for updating some of the code and controlling things through them. One of its features is a search and replace that can update the code for the site. The only thing is, it relies on a lot of complex regex, and I can't seem to find a good tutorial on the complex stuff. Here is the example I was given, which grabs the paragraph tag and puts it in the link:

Search

(#d6d6d4.+?>.+?<p><a.+?>.+?)</a>(.+?)</td>

Replace With

$1$2</a></td>

What do $1 and $2 represent? I know it probably has something to do with one of the .+? parts, but I am unsure which one. If anyone knows, please help me. I have added the regex again below, with a number next to each .+? in it:

(#d6d6d4.+?**[1]**>.+?**[2]**<p><a.+?**[3]**>.+?**[4]**)</a>(.+?**[5]**)</td>
user1566783 asked Dec 27 '12 18:12


1 Answer

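$1 and $2 represent the text matched by the capturing groups in the regex. A capturing group is whatever sits inside a pair of parentheses, and groups are numbered left to right by their opening parenthesis. In your numbered version, $1 holds everything matched by [1] through [4] together with the literal text between them, and $2 holds what [5] matched.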

 (        // start first capture group
 #d6d6d4  // match #d6d6d4
 .+?>     // one or more characters, non-greedy, up to the next '>'
 .+?<p>   // one or more characters, non-greedy, up to '<p>'
 <a.+?>   // an <a ...> tag, consuming everything up to its '>'
 .+?      // the link text, non-greedy, up to the '</a>'
 )        // close the first capture group before the '</a>'
 </a>     // literal '</a>' 
 (        // start second capture group
 .+?      // match all, non-greedy up to '</td>'
 )        // close capture group before '</td>'
 </td>    // literal '</td>'
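
The same breakdown can be written as a commented pattern in an engine that supports free-spacing mode. This is a minimal sketch in Python using `re.VERBOSE`, not the syntax of your console. The sample string is made up for illustration, and note that `#` has to be escaped in this mode because it otherwise starts a comment.

```python
import re

# Free-spacing version of the pattern; whitespace is ignored and
# '#' starts a comment, so the literal '#' in the colour is escaped.
pattern = re.compile(
    r"""
    (              # start first capture group ($1)
      \#d6d6d4     # literal '#d6d6d4'
      .+?>         # rest of the tag, up to the next '>'
      .+?<p>       # text up to '<p>'
      <a.+?>       # opening <a ...> tag
      .+?          # link text
    )              # end first capture group
    </a>           # literal '</a>' (not captured)
    (.+?)          # second capture group ($2): text after the link
    </td>          # literal '</td>' (not captured)
    """,
    re.VERBOSE,
)

# Hypothetical input, only here to show what each group captures.
sample = "<td bgcolor=#d6d6d4 x=1>Intro<p><a href=/home>Home</a> page</td>"
match = pattern.search(sample)
first_group = match.group(1)
second_group = match.group(2)
print(repr(first_group))
print(repr(second_group))
```
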

So if you have this string: <td color=#d6d6d4 foo=bar>Hello, world<p><a href=http://foo.com>foo link</a>some more text</td>

$1 matches: #d6d6d4 foo=bar>Hello, world<p><a href=http://foo.com>foo link

$2 matches: some more text

So the string is transformed into: <td color=#d6d6d4 foo=bar>Hello, world<p><a href=http://foo.com>foo linksome more text</a></td>

Which basically means the </a> tag is moved after some more text (or just before the </td> if you prefer)
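
If you want to try this outside the console, here is a small sketch in Python that runs the same search and replace on the example string. It assumes Python's `re` module, where the replacement is written `\g<1>\g<2>` rather than `$1$2`; your console's engine may differ in other details.

```python
import re

# The search pattern from the question, unchanged.
pattern = re.compile(r"(#d6d6d4.+?>.+?<p><a.+?>.+?)</a>(.+?)</td>")

sample = (
    "<td color=#d6d6d4 foo=bar>Hello, world<p>"
    "<a href=http://foo.com>foo link</a>some more text</td>"
)

match = pattern.search(sample)
group1 = match.group(1)  # what $1 refers to
group2 = match.group(2)  # what $2 refers to

# Python spells the replacement "$1$2</a></td>" as "\g<1>\g<2></a></td>".
result = pattern.sub(r"\g<1>\g<2></a></td>", sample)

print(group1)
print(group2)
print(result)
```

The text before `#d6d6d4` (here `<td color=`) is not part of the match, so the replacement leaves it in place.
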

alan answered Oct 05 '22 08:10