Is there a way using a regex to match a repeating set of characters? For example:
ABCABCABCABCABC
ABC{5}
I know that's wrong. But is there anything to match that effect?
Update:
Can you use nested capture groups? Something like (?<cap>(ABC){5})?
A repeat is an expression that is repeated an arbitrary number of times. An expression followed by '*' can be repeated any number of times, including zero. An expression followed by '+' can be repeated any number of times, but at least once.
The (?!n) construct is a negative lookahead rather than a quantifier: it matches at a position that is not immediately followed by the string n.
The match-zero-or-more operator (*): this operator repeats the smallest possible preceding regular expression as many times as necessary (including zero) to match the pattern. For example, 'o*' matches any string made up of zero or more 'o's.
Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d", "o", and "g".
Enclose the regex you want to repeat in parentheses. For instance, if you want 5 repetitions of ABC:
(ABC){5}
Or if you want any number of repetitions (0 or more):
(ABC)*
Or one or more repetitions:
(ABC)+
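As a quick check of the three forms above, here is a minimal Python sketch. The helper name matches_whole is made up for illustration; re.fullmatch is used so that the pattern has to cover the entire string rather than just a prefix.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the entire text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Exactly five repetitions of the group.
five = matches_whole(r'(ABC){5}', 'ABC' * 5)
four = matches_whole(r'(ABC){5}', 'ABC' * 4)

# Zero or more repetitions: the empty string is accepted.
empty_star = matches_whole(r'(ABC)*', '')

# One or more repetitions: the empty string is rejected.
empty_plus = matches_whole(r'(ABC)+', '')

print(five, four, empty_star, empty_plus)
```

With re.search instead of re.fullmatch, a pattern like (ABC)* would also "match" inside unrelated text, since it is allowed to match zero characters.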
Edit to respond to the update:
Parentheses in regular expressions do two things: they group together a sequence of items, so that you can apply an operator to the entire sequence instead of just the last item, and they capture the contents of that group, so you can extract the substring that was matched by that subexpression.
You can nest parentheses; they are counted from the first opening paren. For instance:
>>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(0)
'123 ABCDEF'
>>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(1)
'ABCDEF'
>>> re.search('[0-9]* (ABC(...))', '123 ABCDEF 456').group(2)
'DEF'
If you would like to avoid capturing when you are grouping, you can use (?:...). This is helpful when you are using parentheses only to group a sequence so that an operator applies to it, and you don't want those parentheses to change the numbering of your captured groups. It is also faster.
>>> re.search('[0-9]* (?:ABC(...))', '123 ABCDEF 456').group(1)
'DEF'
So to answer your update, yes, you can use nested capture groups, or even avoid capturing with the inner group at all:
>>> re.search('((?:ABC){5})(DEF)', 'ABCABCABCABCABCDEF').group(1)
'ABCABCABCABCABC'
>>> re.search('((?:ABC){5})(DEF)', 'ABCABCABCABCABCDEF').group(2)
'DEF'
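For the named form from the question, here is a small sketch in Python. One assumption to flag: Python's re module spells named groups (?P<cap>...), while the (?<cap>...) spelling in the question belongs to other engines such as .NET and PCRE. The input string is invented for the example.

```python
import re

# Python's re module spells named groups (?P<name>...); the question's
# (?<cap>...) spelling is used by other engines such as .NET and PCRE.
pattern = re.compile(r'(?P<cap>(ABC){5})')

# Sample input made up for this example.
match = pattern.search('xxABCABCABCABCABCyy')

outer = match.group('cap')   # whole run captured by the named outer group
inner = match.group(2)       # a repeated inner group keeps only its last repetition
span = match.span('cap')     # (start, end) offsets of the named group

print(outer, inner, span)
```

Note that the inner capturing group does not give you a list of all five repetitions, only the last one. If you only need the whole run, the non-capturing (?:ABC) form is the simpler choice.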
ABC{5} matches ABCCCCC. To match 5 ABC's, you should use (ABC){5}. Parentheses are used to group a set of characters. You can also set an interval for occurrences like (ABC){3,5} which matches ABCABCABC, ABCABCABCABC, and ABCABCABCABCABC.
(ABC){1,} means 1 or more repetitions, which is exactly the same as (ABC)+.
(ABC){0,} means 0 or more repetitions, which is exactly the same as (ABC)*.
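The points in this answer can be checked with a short Python sketch. The helper whole is a name invented here, and the equivalence checks only cover the handful of repetition counts actually tried, so treat them as a spot check rather than a proof.

```python
import re

def whole(pattern, text):
    """Return True if the pattern matches the entire text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Without parentheses, {5} applies only to the preceding "C".
last_char_only = whole(r'ABC{5}', 'ABCCCCC')
not_grouped = whole(r'ABC{5}', 'ABC' * 5)

# With a group, {3,5} accepts 3, 4 or 5 copies of "ABC".
accepted_counts = [n for n in range(8) if whole(r'(ABC){3,5}', 'ABC' * n)]

# {1,} agrees with + and {0,} agrees with * for every count tried here.
plus_same = all(
    whole(r'(ABC){1,}', 'ABC' * n) == whole(r'(ABC)+', 'ABC' * n)
    for n in range(5)
)
star_same = all(
    whole(r'(ABC){0,}', 'ABC' * n) == whole(r'(ABC)*', 'ABC' * n)
    for n in range(5)
)

print(last_char_only, not_grouped, accepted_counts, plus_same, star_same)
```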