
VBA positive lookahead is too greedy

Tags: regex, vba

I'm using Access VBA to parse a string with regex. Here's my regex function:

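' Returns the first match of pattern in source, or "" if there is none.
' Early binding: needs a reference to "Microsoft VBScript Regular Expressions 5.5".
Function regexSearch(pattern As String, source As String) As String

    Dim re As RegExp
    Dim matches As MatchCollection

    Set re = New RegExp
    re.IgnoreCase = True        ' so "mp" in the pattern also matches "MP"
    re.Pattern = pattern

    Set matches = re.Execute(source)

    If matches.Count > 0 Then
        regexSearch = matches(0).Value
    Else
        regexSearch = ""
    End If

End Function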

When I test it with:

regexSearch("^.+(?=[ _-]+mp)", "153 - MP 13.61 to MP 17.65")

I'm expecting to get:

153

because the only characters between it and the first instance of 'MP' are the ones in the character class inside the lookahead.

But my actual return value is:

153 - MP 13.61 to

Why is it capturing up to the second 'MP'?

sigil asked Sep 28 '11 19:09


1 Answer

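Because .+ is greedy by default. The .+ first gobbles up every character until it encounters a line break char or the end-of-input. Only then does it backtrack, giving up one character at a time until the lookahead succeeds. Working backwards from the end, the first position where the lookahead matches is in front of the space before the last MP (the second one in your case), so that is where the match ends.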

What you want is a lazy (non-greedy) match, which stops at the first position where the lookahead succeeds. This can be done by placing a ? after .+:

regexSearch("^.+?(?=[ _-]+MP)", "153 - MP 13.61 to MP 17.65")
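To see the difference side by side, here is a minimal sketch that runs both patterns against the string from the question. The Sub name CompareQuantifiers is hypothetical, and late binding through CreateObject is an assumption made only so that the snippet runs without adding a project reference.

```vba
' Sketch: compare the greedy and lazy quantifier on the question's input.
' CompareQuantifiers is a hypothetical name; late binding is assumed so
' that no reference to the regular expressions library has to be set.
Sub CompareQuantifiers()
    Dim re As Object
    Dim source As String

    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True
    source = "153 - MP 13.61 to MP 17.65"

    ' Greedy: .+ runs to the end of the input, then backs off one character
    ' at a time until the lookahead succeeds (in front of the last " MP").
    re.Pattern = "^.+(?=[ _-]+mp)"
    Debug.Print "[" & re.Execute(source)(0).Value & "]"   ' [153 - MP 13.61 to]

    ' Lazy: .+? starts with a single character and grows only until the
    ' lookahead succeeds for the first time (in front of " - MP").
    re.Pattern = "^.+?(?=[ _-]+mp)"
    Debug.Print "[" & re.Execute(source)(0).Value & "]"   ' [153]
End Sub
```

Run it from the Immediate window with CompareQuantifiers. The brackets in the output make it easy to confirm that neither result carries a trailing space.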
Bart Kiers answered Oct 12 '22 23:10