I'm using a regular expression in Excel VBA to parse the results of a swim meet. The code reads a row of text that was copied from a PDF and outputs the important data into individual cells. Since the format of the string varies throughout the source PDF, the regular expression is quite complicated. Still, I'm able to parse 95% of the data at this point.
Some of the rows that are not being parsed are confusing me, though. VBA is clearly not able to find a match with the regular expression, but when I copy the exact same regex and string into this website, JavaScript is able to find a match without a problem. Is there something different in the way VBA and JavaScript handle regular expressions that might account for this?
Here's the string that VBA refuses to match:
12. NUNEZ CHENG, Walter 74 Club Tennis Las Terr 3:44.57 123
Here's the function I'm using in Excel (mostly successfully):
Function singleLineResults(SourceString As String) As Variant
Dim cSubmatches As Variant
Dim collectionArray(11) As String
Dim cnt As Integer
Dim oMatches As MatchCollection
With New RegExp
.MultiLine = MultiLine
.IgnoreCase = IgnoreCase
.Global = False
'1. JAROSOVA, Lenka 27 Swimmpower Prague 2:26.65 605 34.45 37.70 37.79 36.71
.Pattern = "(\d*)\.?\s?([^,]+),\s([^\d]+)\s?(\d+)\s((?:[A-Z]{3})?)\s?((?:(?!\d\:\d).)*)\s?((?:\d+:)?\d+\.\d+)(?:\s(\d+))?(?:\s((?:\d+:)?\d+.\d+))?(?:\s((?:\d+:)?\d+.\d+))?(?:\s((?:\d+:)?\d+.\d+))?(?:\s((?:\d+:)?\d+.\d+))?(?:Splash Meet Manager 11, Build \d{5} Registered to [\w\s]+ 2014-08-\d+ \d+:\d+ - Page \d+)?$"
Set oMatches = .Execute(SourceString)
If oMatches.Count > 0 Then
For Each submatch In oMatches(0).SubMatches
collectionArray(cnt) = submatch '.Value
cnt = cnt + 1
Next
Else
singleLineResults = Null
End If
End With
singleLineResults = collectionArray()
End Function
RegEx stands for “Regular Expression” in VBA Excel and is a sequence of characters that defines the search pattern for finding a specific pattern of characters in a string of values. So, in a simple word, we can create a regular expression pattern and use it to search for the string of that pattern.
In JavaScript, you can write RegExp patterns using simple patterns, special characters, and flags. In this section, we'll explore the different ways to write regular expressions while focusing on simple patterns, special characters, and flags.
How to start using RegEx. To start using RegEx in VBA, you need to select and activate the RegEx object reference library from the references list. Start by going to the VBA IDE, select Tools –> References, then select Microsoft VBScript Regular Expressions 5.5, then click OK.
Could you add more examples to what actually matches? E.g. the surrounding lines that matches, and better yet, examples that are not supposed to match if any?
I've tried "cleaning" up a bit in the regex, removing groups that are not used to match that particular line, to make the error more obvious, and changed how one of the groups works, which might actually fix the issue:
(\d*)
\.?\s?
([^,]+)
,\s
([^\d]+)
\s?
(\d+)
\s
(
(?:[A-Z]{3})?
)
\s?
(
# OLD SOLUTION
# (?:
# (?!\d\:\d)
# .
# )*
# NEW SOLUTION
.*?
)
\s?
(
(?:\d+:)?
\d+\.\d+
)
(?:
\s
(\d+)
)?
$
See example on regex101.
The group that puzzles me the most, however, is this one:
(?:[A-Z]{3})?
Why the 3 character limit, when it only matches the first 3 letters in the street name?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With