Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regex expression to back reference more than 9 values in a replace

I have a regex expression that traverses a string and pulls out 40 values, it looks sort if like the query below, but much larger and more complicated

est(.*)/test>test>(.*)<test><test>(.*)test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test>

My question is how do I use these expressions with the replace command when the number exceeds 9. It seems as if whenever I use \10 it returns the value for \1 and then appends a 0 to the end.

Any help would be much appreciated thanks :)

Also I am using UEStudio, but if a different program does it better then no biggie :)

like image 937
Dustin Gamester Avatar asked Jul 21 '10 22:07

Dustin Gamester


4 Answers

As pointed out by psycho brm: Use $10 instead of \10 I am using notepad++ and it works beautifull.

like image 187
Sandokas Avatar answered Nov 18 '22 13:11

Sandokas


Try using named groups; so instead of the tenth:

(.*)

use:

(?<group10>.*)

and then use the following replace string:

${group10}

(That's of course in the absence of a better solution using looping, and remember that there might be different regex syntax flavours depending on your environment.)

like image 21
Andrew Jens Avatar answered Oct 21 '22 21:10

Andrew Jens


Most of the simple Regex engines used by editors aren't equipped to handle more than 10 matching groups; it doesn't seem like UltraEdit can. I just tried Notepad++ and it won't even match a regex with 10 groups.

Your best bet, I think, is to write something fast in a quick language with a decent regex parser. but that wouldn't answer the question as asked

Here's something in Python:

import re

pattern = re.compile('(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)')
with open('input.txt', 'r') as f:
    for line in f:
        m = pattern.match(line)
        print m.groups()

Note that Python allows backreferences such as \20: in order to have a backreference to group 2 followed by a literal 0, you need to use \g<2>0, which is unambiguous.

Edit: Most flavors of regex, and editors which include a regex engine, should follow the replace syntax as follows:

abcdefghijklmnop
search: (.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(?<name>.)(.)
note:    1  2  3  4  5  6  7  8  9  10 11 12 13
value:   a  b  c  d  e  f  g  h  i  j  k  l  m
replace result:
    \11      k1      i.e.: match 1, then the character "1"
    ${12}    l       most should support this
    ${name}  l       few support named references, but use them where you can.

Named references are usually only possible in very specific flavor of regex libraries, test your tool to know for sure.

like image 5
Chris B. Avatar answered Nov 18 '22 15:11

Chris B.


put a $ in front of the double digit subgroup: e.g. \1\2\3\4\5\6\7\8\9$10 It worked for me.

like image 3
pepita96 Avatar answered Nov 18 '22 14:11

pepita96