I'm using Python 3.3.
re.sub("(.)(.)", r"\2\1\g<0>", "ab") returns baab
BUT
re.sub("(.)(.)", r"\2\1\0", "ab") returns ba
Is this a bug in the sub method, or does sub deliberately not recognize \0 for some reason?
As written on this page, the \0 is interpreted as the null character (\x00), and group numbers start at 1 in Python (according to the re module documentation):
\number
Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, (.+) \1 matches 'the the' or '55 55', but not 'thethe' (note the space after the group). This special sequence can only be used to match one of the first 99 groups. If the first digit of number is 0, or number is 3 octal digits long, it will not be interpreted as a group match, but as the character with octal value number. Inside the '[' and ']' of a character class, all numeric escapes are treated as characters.
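Also, according to the page previously linked, this is not a bug but the intended behaviour, since it is documented. The second call therefore returns ba followed by a null character, which is usually invisible when printed. To insert the whole match in a replacement string, use \g<0>, as the first call in the question does.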
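A minimal sketch to check this yourself, assuming a standard CPython 3 interpreter. The variable names are only illustrative. Using repr() and len() makes the otherwise invisible null character visible, and the last two lines reproduce the (.+) \1 example from the quoted documentation.

```python
import re

pattern = r"(.)(.)"

# \g<0> refers to the entire match, so the swapped pair is followed by "ab".
with_whole_match = re.sub(pattern, r"\2\1\g<0>", "ab")

# \0 is read as an octal escape (the null character), not as group 0.
with_null = re.sub(pattern, r"\2\1\0", "ab")

print(repr(with_whole_match))  # 'baab'
print(repr(with_null))         # 'ba\x00'
print(len(with_null))          # 3

# In a pattern, \1 is a backreference to group 1 (groups start at 1).
spaced = re.search(r"(.+) \1", "the the")
unspaced = re.search(r"(.+) \1", "thethe")
print(spaced.group(0))  # the the
print(unspaced)         # None
```

If the replacement logic gets more involved, passing a function to re.sub and calling m.group(0) on the match object is another way to reach the whole match.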