I have this [nasty] regex to capture a VBA procedure signature with all the parts in a bucket:
public static string ProcedureSyntax
{
    get
    {
        return
            @"(?:(?<accessibility>Friend|Private|Public)\s)?(?:(?<kind>Sub|Function|Property\s(Get|Let|Set)))\s(?<identifier>(?:[a-zA-Z][a-zA-Z0-9_]*)|(?:\[[a-zA-Z0-9_]*\]))\((?<parameters>.*)?\)(?:\sAs\s(?<reference>(((?<library>[a-zA-Z][a-zA-Z0-9_]*))\.)?(?<identifier>([a-zA-Z][a-zA-Z0-9_]*)|\[[a-zA-Z0-9_]*\]))(?<array>\((?<size>(([0-9]+)\,?\s?)*|([0-9]+\sTo\s[0-9]+\,?\s?)+)\))?)?";
    }
}
Part of it is overkill and will match illegal array syntaxes (in the context of a procedure's signature), but that's not my concern right now.
The problem is that this part:
\((?<parameters>.*)?\)
breaks when a function (or property getter) returns an array, because then the signature will look something like this:
Public Function GetSomeArray() As Variant()
Or like this:
Public Function GetSomeArray(ByVal foo As Integer) As Variant()
And that makes the function's return type completely borked, because the parameters capture group will pick up this:
ByVal foo As Integer) As Variant(
I know why it's happening: the .* in my regex is greedy, so the last closing parenthesis on the line ends up being the one that delimits the parameters capture group.
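For illustration, here is a minimal repro of that greedy capture. It is only a sketch: it isolates the parameters fragment from the full pattern, and the class and method names (GreedyParametersDemo, CaptureParameters) are hypothetical, not part of my actual code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyParametersDemo
{
    // Only the parameters fragment of the full pattern, isolated for illustration.
    private const string GreedyFragment = @"\((?<parameters>.*)?\)";

    // Hypothetical helper: returns whatever the parameters group captured.
    public static string CaptureParameters(string signature)
    {
        // .* is greedy: it runs to the end of the input, then backtracks
        // only as far as the last ')' it can find.
        return Regex.Match(signature, GreedyFragment).Groups["parameters"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(CaptureParameters(
            "Public Function GetSomeArray(ByVal foo As Integer) As Variant()"));
        // prints: ByVal foo As Integer) As Variant(
    }
}
```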
Is there a way to fix my regex to change that, without impacting performance too much?
The catch is that this is a valid signature:
Public Function DoSomething(foo As Integer, ParamArray bar()) As Variant()
I have another separate regex to handle individual parameters, and it would work great... if this one didn't get confused with array return types.
What I need is a parameters group that doesn't include the ) As Variant( part, which is already the case when the return type isn't an array.
Here you go....
(?:(?<accessibility>Friend|Private|Public)\s)?(?:(?<kind>Sub|Function|Property\s(Get|Let|Set)))\s(?<identifier>(?:[a-zA-Z][a-zA-Z0-9_]*)|(?:\[[a-zA-Z0-9_]*\]))\((?<parameters>(?:\(\)|[^()])*)?\)(?:\sAs\s(?<reference>(((?<library>[a-zA-Z][a-zA-Z0-9_]*))\.)?(?<identifier1>([a-zA-Z][a-zA-Z0-9_]*)|\[[a-zA-Z0-9_]*\]))(?<array>\((?<size>(([0-9]+)\,?\s?)*|([0-9]+\sTo\s[0-9]+\,?\s?)+)\))?)?
What are the changes made in your original regex?
I just changed the \((?<parameters>.*)?\) part of your original regex to \((?<parameters>(?:\(\)|[^()])*)?\). The .* in your pattern greedily matches up to the last ) symbol, whereas (?:\(\)|[^()])* matches, zero or more times, either a literal () pair or any single character other than ( or ). So it matches strings like foo or foo()bar. (Note that the second identifier group is also named identifier1 in the pattern above.)
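As a quick check, here is a minimal sketch that runs only the changed fragment against the three signatures from the question. The class and method names (BalancedParametersDemo, CaptureParameters) are made up for the example, and the fragment is isolated from the full pattern on purpose.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedParametersDemo
{
    // Parameters fragment from the fixed pattern: either an empty "()" pair
    // or any single character that is not a parenthesis, repeated.
    private const string FixedFragment = @"\((?<parameters>(?:\(\)|[^()])*)?\)";

    // Hypothetical helper: returns whatever the parameters group captured.
    public static string CaptureParameters(string signature)
    {
        return Regex.Match(signature, FixedFragment).Groups["parameters"].Value;
    }

    public static void Main()
    {
        string[] signatures =
        {
            "Public Function GetSomeArray() As Variant()",
            "Public Function GetSomeArray(ByVal foo As Integer) As Variant()",
            "Public Function DoSomething(foo As Integer, ParamArray bar()) As Variant()"
        };

        foreach (string signature in signatures)
        {
            // Brackets make an empty capture visible in the output.
            Console.WriteLine("[" + CaptureParameters(signature) + "]");
        }
    }
}
```

One thing to keep in mind: the fragment only tolerates empty () pairs inside the parameter list, so a parenthesis with anything between it and its partner (or a stray parenthesis inside a string literal default value) would stop the parameters group from matching the whole list.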