I want to search for a regex match in a larger string, starting from a certain position, without using string slices.
Background: I want to search through a string iteratively for matches of various regexes. A natural solution in Python would be to keep track of the current position within the string and use e.g.
re.match(regex, largeString[pos:])
in a loop. But for really large strings (~1 MB), string slicing as in largeString[pos:] becomes expensive. I'm looking for a way to get around that.
Side note: Funnily, a niche of the Python documentation mentions an optional pos parameter to the match function (which would be exactly what I want), but it is not to be found with the functions themselves :-).
The variants with pos and endpos parameters only exist as methods of compiled regular expression objects. Try this:
import re
pattern = re.compile("match here")
text = "don't match here, but do match here"
start = text.find(",")  # begin searching at the comma
print(pattern.search(text, start).span())
... outputs (25, 35)
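Applied to the loop from the question, the same idea could look roughly like the sketch below. The token names, the patterns, and the scan function are made up for illustration; the only thing taken from the answer is passing the current position to the compiled pattern's match method instead of slicing.

```python
import re

# Hypothetical token patterns, compiled once up front.
TOKEN_PATTERNS = [
    ("NUMBER", re.compile(r"\d+")),
    ("NAME", re.compile(r"[A-Za-z_]\w*")),
    ("SPACE", re.compile(r"\s+")),
]

def scan(text):
    """Walk through text, trying each pattern at the current position."""
    pos = 0
    tokens = []
    while pos < len(text):
        for kind, pattern in TOKEN_PATTERNS:
            # match(text, pos) starts matching at index pos; no slice is made.
            m = pattern.match(text, pos)
            if m:
                if kind != "SPACE":
                    tokens.append((kind, m.group()))
                pos = m.end()  # m.end() is an index into the full string
                break
        else:
            raise ValueError(f"no pattern matches at position {pos}")
    return tokens

print(scan("width 42 height 7"))
```

Because the match object's positions refer to the whole string, m.end() can be used directly as the next pos.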
The pos keyword is only available in the method versions. For example,
re.match("e+", "eee3", pos=1)
is invalid (it raises a TypeError), but
pattern = re.compile("e+")
pattern.match("eee3", pos=1)
works.
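One caveat worth knowing, shown here as a small sketch: according to the re documentation, passing pos is not completely equivalent to slicing. Spans are reported relative to the whole string, and ^ still refers to the real beginning of the string rather than to pos. The variable names are just for illustration.

```python
import re

pattern = re.compile("e+")
m = pattern.match("eee3", 1)
print(m.span())  # (1, 3): indices into the full string, not into a slice

anchored = re.compile("^e+")
# With pos=1, ^ cannot match because index 1 is not the real start of the string.
print(anchored.match("eee3", 1))    # None
# On a slice, index 0 of the new string is a real start, so ^ matches.
print(anchored.match("eee3"[1:]).group())  # ee
```

So patterns that rely on ^ may behave differently once a slice is replaced by pos.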