I'm using Access VBA to parse a string with regex. Here's my regex function:
Function regexSearch(pattern As String, source As String) As String
    Dim re As RegExp
    Dim matches As MatchCollection
    Dim match As match
    Set re = New RegExp
    re.IgnoreCase = True
    re.pattern = pattern
    Set matches = re.Execute(source)
    If matches.Count > 0 Then
        regexSearch = matches(0).Value
    Else
        regexSearch = ""
    End If
End Function
When I test it with:
regexSearch("^.+(?=[ _-]+mp)", "153 - MP 13.61 to MP 17.65")
I'm expecting to get:
153
because the only characters between it and the first instance of 'MP' are the ones in the class specified in the lookahead.
But my actual return value is:
153 - MP 13.61 to
Why is it capturing up to the second 'MP'?
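Because .+ is greedy by default. The .+ first consumes every character up to a line break or the end of the input. It then backtracks one character at a time until the lookahead succeeds, which first happens just before the last MP (the second one in your case).
What you want is a non-greedy (lazy) match. You get that by placing a ? after .+ so that it expands one character at a time and stops at the first position where the lookahead matches: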
regexSearch("^.+?(?=[ _-]+MP)", "153 - MP 13.61 to MP 17.65")
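To see the difference side by side, here is a minimal sketch you could paste into a standard module. FirstMatch and CompareGreedyAndLazy are hypothetical names, not part of your code; the helper uses late binding through CreateObject("VBScript.RegExp") on the assumption that you may not have the regex library reference set, and otherwise does the same thing as your regexSearch.

```vba
' Hypothetical helper: returns the first match of pat in src, or "" if there is none.
' Late binding means no project reference to the VBScript regex library is needed.
Function FirstMatch(ByVal pat As String, ByVal src As String) As String
    Dim re As Object
    Dim matches As Object

    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True
    re.Pattern = pat

    Set matches = re.Execute(src)
    If matches.Count > 0 Then FirstMatch = matches(0).Value
End Function

' Prints the greedy and the lazy result for the sample string to the Immediate window.
Sub CompareGreedyAndLazy()
    Const SAMPLE As String = "153 - MP 13.61 to MP 17.65"

    ' Greedy: .+ runs to the end, then backtracks to just before the last " MP".
    Debug.Print "[" & FirstMatch("^.+(?=[ _-]+MP)", SAMPLE) & "]"    ' [153 - MP 13.61 to]

    ' Lazy: .+? grows one character at a time and stops before the first " - MP".
    Debug.Print "[" & FirstMatch("^.+?(?=[ _-]+MP)", SAMPLE) & "]"   ' [153]
End Sub
```

Run CompareGreedyAndLazy and check the Immediate window (Ctrl+G). The square brackets are only there to make any leading or trailing spaces in the match visible.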