I have a regular expression that traverses a string and pulls out 40 values. It looks sort of like the pattern below, but much larger and more complicated:
est(.*)/test>test>(.*)<test><test>(.*)test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test>
My question is: how do I use these capture groups with the replace command when the group number exceeds 9? It seems that whenever I use \10, it returns the value for \1 and then appends a 0 to the end.
Any help would be much appreciated thanks :)
Also I am using UEStudio, but if a different program does it better then no biggie :)
As pointed out by psycho brm: use $10 instead of \10. I am using Notepad++ and it works beautifully.
Try using named groups; so instead of the tenth:
(.*)
use:
(?<group10>.*)
and then use the following replace string:
${group10}
(That's of course in the absence of a better solution using looping, and remember that there might be different regex syntax flavours depending on your environment.)
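For anyone scripting this instead, here is a minimal sketch of the same named-group idea in Python. Note that Python spells the group as (?P<name>...) and the replacement reference as \g<name>, not (?<name>...) and ${name}. The sample text and the group names first and second are made up for illustration.

```python
import re

# Hypothetical sample data; the tag layout is assumed for illustration only.
text = "<test>alpha</test><test>beta</test>"

# Python's named-group syntax is (?P<name>...), not (?<name>...).
pattern = re.compile(r"<test>(?P<first>.*?)</test><test>(?P<second>.*?)</test>")

# In the replacement string, \g<name> refers to the named group.
swapped = pattern.sub(r"\g<second>,\g<first>", text)
print(swapped)
```

Because the groups are referenced by name, it does not matter whether a group happens to be the 10th or the 40th in the pattern.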
Most of the simple Regex engines used by editors aren't equipped to handle more than 10 matching groups; it doesn't seem like UltraEdit can. I just tried Notepad++ and it won't even match a regex with 10 groups.
Your best bet, I think, is to write something fast in a quick language with a decent regex parser, but that wouldn't answer the question as asked.
Here's something in Python:
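import re

# Fifteen capture groups, each matching a single character
pattern = re.compile(r'(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)')
with open('input.txt', 'r') as f:
    for line in f:
        m = pattern.match(line)
        if m:  # lines shorter than 15 characters do not match
            print(m.groups())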
Note that Python allows backreferences such as \20: in order to have a backreference to group 2 followed by a literal 0, you need to use \g<2>0, which is unambiguous.
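As a rough illustration of that note, the sketch below runs three replacement strings through Python's re.sub. The 16-letter sample string and the square brackets in the replacements are only there to make the result easy to read; they are not part of the original problem.

```python
import re

text = "abcdefghijklmnop"
# Thirteen single-character groups: group 1 is "a", group 10 is "j".
pattern = re.compile(r"(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)")

# \g<10> explicitly means group 10.
group_ten = pattern.sub(r"[\g<10>]", text)

# \g<1>0 means group 1 followed by a literal "0".
group_one_then_zero = pattern.sub(r"[\g<1>0]", text)

# A bare \10 is read by Python as group 10, not group 1 plus "0".
bare_ten = pattern.sub(r"[\10]", text)

print(group_ten, group_one_then_zero, bare_ten)
```

The pattern consumes the first 13 characters, so the trailing "nop" is left untouched in each result.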
Edit: Most flavors of regex, and editors which include a regex engine, should follow the replace syntax as follows:
input:  abcdefghijklmnop
search: (.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(?<name>.)(.)
note:    1  2  3  4  5  6  7  8  9 10 11 12        13
value:   a  b  c  d  e  f  g  h  i  j  k  l         m
replace   result
\11       a1       i.e.: match 1, then the character "1"
${12}     l        most should support this
${name}   l        few support named references, but use them where you can.
Named references are usually only supported by specific flavors of regex libraries; test your tool to know for sure.
Put a $ in front of the double-digit subgroup, e.g. \1\2\3\4\5\6\7\8\9$10. It worked for me.