I am surprised that I could not easily find a similar, answered question on SO. I would like to match everything inside certain function calls. The idea is to remove the useless function wrappers and keep only their arguments:
foo(some (content)) --> some (content)
So I am trying to match everything inside the function call, which can itself include parentheses. Here is my PCRE regex:
(?<name>\w+)\s*\(\K
(?<e>
[^()]+
|
[^()]*
\((?&e)\)
[^()]*
)*
(?=\))
https://regex101.com/r/gfMAIM/1
Unfortunately it doesn't work and I don't really understand why.
Your Group e pattern does not do the right job. Currently, inside a nested (...) it can match at most one further (...) substring, because the * quantifier sits outside Group e and is therefore not applied when (?&e) recurses the pattern. It needs to match as many (...) substrings as are present, so the subroutine call needs to be inside a * or + quantified group, and the pattern can even be "simplified" to (?<e>[^()]*(?:\((?&e)\)[^()]*)*).
Note that your Group e pattern is essentially equal to (?<e>[^()]+|\((?&e)\))*. The [^()]* parts around \((?&e)\) are redundant, since the [^()]+ alternative will consume the chars other than ( and ) on the current depth level.
Also, you quantified the Group e pattern, making it a repeated capturing group that only keeps the text matched during the last iteration.
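To illustrate the point about repeated capturing groups, here is a minimal Python sketch. Python's built-in re module has no recursion, so the pattern below is a flattened, non-recursive stand-in that I made up for the demo (it is not your regex); it only shows how a quantified capturing group behaves.

```python
import re

# Hypothetical, non-recursive stand-in pattern: the named group e is
# quantified with *, so it is overwritten on every iteration.
pattern = re.compile(r"(?P<e>[^()]+|\([^()]*\))*")

m = pattern.match("a(b)c")
whole = m.group(0)    # everything consumed by all iterations
last = m.group("e")   # only the text of the final iteration

print(whole)  # a(b)c
print(last)   # c
```

The group iterates three times (a, then (b), then c), but only the last piece survives in the group, which is why the quantifier has to go inside the group if you want to capture the whole content.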
You may use
(?<name>\w+)\s*\(\K(?<e>[^()]*(?:\((?&e)\)[^()]*)*)(?=\))
See the regex demo
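If it helps to see the recursion spelled out, here is a rough Python sketch of the same idea. It is not the regex itself: Python's built-in re module cannot recurse, so a small recursive function (match_e, a name I made up) mirrors the structure of Group e, and find_calls (also hypothetical) plays the role of the surrounding pattern.

```python
import re

def match_e(text: str, pos: int) -> int:
    """Return the index just past the balanced run starting at pos.

    Mirrors the shape of Group e:  [^()]* ( '(' e ')' [^()]* )*
    """
    n = len(text)
    # [^()]* : skip chars other than ( and )
    while pos < n and text[pos] not in "()":
        pos += 1
    # ( '(' e ')' [^()]* )* : any number of nested groups on this level
    while pos < n and text[pos] == "(":
        inner_end = match_e(text, pos + 1)   # the recursive call
        if inner_end >= n or text[inner_end] != ")":
            break                            # unbalanced: stop before this (
        pos = inner_end + 1
        while pos < n and text[pos] not in "()":
            pos += 1
    return pos

def find_calls(text: str):
    """Return (name, arguments) pairs for calls with balanced parentheses."""
    opener = re.compile(r"(\w+)\s*\(")
    results, pos = [], 0
    while True:
        m = opener.search(text, pos)
        if m is None:
            return results
        start = m.end()
        end = match_e(text, start)
        if end < len(text) and text[end] == ")":   # like the (?=\)) lookahead
            results.append((m.group(1), text[start:end]))
            pos = end + 1
        else:
            pos = start

print(find_calls("foo(some (content))"))  # [('foo', 'some (content)')]
```

The inner while loop is the important part: it is the counterpart of putting the subroutine call inside a * quantified group, so several (...) substrings on the same level are all consumed.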
Details
(?<name>\w+)\s*\(\K - 1+ word chars, 0+ whitespaces and a ( that are omitted from the match
(?<e> - start of Group e
    [^()]* - 0+ chars other than ( and )
    (?: - start of a non-capturing group:
        \( - a ( char
        (?&e) - Group e pattern recursed
        \) - a ) char
        [^()]* - 0+ chars other than ( and )
    )* - end of the non-capturing group, 0 or more repetitions
) - end of Group e
(?=\)) - a ) must be immediately to the right of the current location.

The following regex does the matching without taking extra steps:
(?<name>\w+)\s*(\((?<e>([^()]*+|(?2))+)\))
See live demo here
But that doesn't match the following strings, which contain unbalanced parentheses inside a quoted string:
foo(bar = ')')
foo(bar(john = "(Doe..."))
So what you should look for is:
(?<name>\w+)\s*(\((?<e>([^()'"]*+|"(?>[^"\\]*+|\\.)*"|'(?>[^'\\]*+|\\.)*'|(?2))+)\))
See live demo here
Regex breakdown:
(?<name>\w+)\s* Match function name and trailing spaces
( Start of a cluster (capturing group #2, the target of (?2))
    \( Match a literal (
    (?<e> Start of named capturing group e
        ( Start of an inner capturing group (#4)
            [^()'"]*+ Match anything except ()'"
            | Or
            "(?>[^"\\]*+|\\.)*" Match anything between double quotes
            | Or
            '(?>[^'\\]*+|\\.)*' Match anything between single quotes
            | Or
            (?2) Recurse capturing group #2
        )+ Repeat as much as possible, at least once
    ) End of named capturing group e
    \) Match ) literally
) End of the cluster
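For comparison, the quote-aware idea can also be sketched without regex recursion. The Python below is only an illustration under my own assumptions: it swaps the recursive pattern for a plain depth counter, skips over single- and double-quoted strings with backslash escapes, and uses made-up helper names (find_closing, unwrap).

```python
import re

def find_closing(text: str, open_idx: int) -> int:
    """Return the index of the ) matching the ( at open_idx, or -1."""
    depth = 0
    i, n = open_idx, len(text)
    while i < n:
        ch = text[i]
        if ch in "'\"":
            # Skip a quoted string, honouring backslash escapes.
            quote = ch
            i += 1
            while i < n and text[i] != quote:
                i += 2 if text[i] == "\\" else 1
            if i >= n:
                return -1          # unterminated string
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1

def unwrap(text: str, name: str) -> str:
    """Replace the first call name(...) in text with its arguments."""
    m = re.search(r"\b" + re.escape(name) + r"\s*\(", text)
    if m is None:
        return text
    close = find_closing(text, m.end() - 1)
    if close == -1:
        return text                # unbalanced call: leave untouched
    return text[:m.start()] + text[m.end():close] + text[close + 1:]

print(unwrap("foo(some (content))", "foo"))         # some (content)
print(unwrap("foo(bar = ')')", "foo"))              # bar = ')'
print(unwrap('foo(bar(john = "(Doe..."))', "foo"))  # bar(john = "(Doe...")
```

Parentheses inside quotes never touch the depth counter, which is the same effect the quoted-string alternatives have in the regex.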