Tell RegEx to ignore parenthesis when inside a quote

I have the following RegEx that is used and works:

/\B@(@?\w+(?:::\w+)?)([ \t]*)(\( ( (?>[^()]+) | (?3) )* \))?/x

Where this string @extends('template', 'test') correctly groups and gives me what I need.

The problem is if the string contains an unclosed parenthesis inside the quotes - it will fail:

@extends('template', 'te)st') gives @extends('template', 'te) as the output

How can I tell this RegEx to ignore parentheses that are inside quotes (either ' or ")?

Here is a RegExr demo of the problem: http://regexr.com/v1?396ci

And here is a list of strings that should all be able to pass:

@extends('template', 'test')     // working
@extends('template', $test)      // working
@extends('template', 'te()st')   // working
@extends('template', 'te)st')    // broken 
@extends('template', 'te())st')  // broken
@extends('template', 'te(st')    // broken
@extends('template', 'test)')    // broken
@extends('template', '(test')    // broken

I've narrowed it down - and I think I need to be able to say

(
   \(  <-- only if not inside quotes
     ( 
         (?>[^()]+) | (?3) 
     )* 
   \) <-- only if not inside quotes  
)?

But I can't seem to work out how to apply that rule to these specific parentheses.

Laurence asked Oct 18 '22 12:10


1 Answer

You can use a lookahead for this.

Here's a regex that finds the quoted second argument of each @extends when it contains a parenthesis:

'(?=(\w+\)|\w+\())[\w\)\(]+

Breakdown:

' : Start the match at an opening quote

(?=XXX) : Positive lookahead, which succeeds only if XXX is present ahead

(\w+\)|\w+\() : Look for word characters followed by either a closing or an opening parenthesis

If this lookahead succeeds, we can be sure that we have a quote followed by text containing a parenthesis. Now we can simply write the regex to match those parentheses as ordinary characters

[\w\)\(]+ : Doing just that
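
To see what this fragment picks up on its own, here is a minimal Python sketch. It assumes the pattern reads '(?=(\w+\)|\w+\())[\w\)\(]+ as described in the breakdown; the helper name find_quoted_paren is made up for illustration, and Python's re module stands in for whatever engine you actually run this in.

```python
import re

# Lookahead fragment from the breakdown: a quote, then (checked ahead)
# word characters followed by ")" or "(", then the quoted text itself.
QUOTED_WITH_PAREN = re.compile(r"'(?=(\w+\)|\w+\())[\w\)\(]+")


def find_quoted_paren(text):
    """Return the matched text (opening quote included), or None."""
    match = QUOTED_WITH_PAREN.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    for sample in (
        "@extends('template', 'test')",
        "@extends('template', 'te)st')",
        "@extends('template', 'te(st')",
        "@extends('template', '(test')",
    ):
        print(sample, "->", find_quoted_paren(sample))
```

As written, the lookahead needs at least one word character before the parenthesis, so a string such as '(test' is not picked up by this fragment.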

Now that we can detect a quoted string with a parenthesis inside it, we can use an if-else conditional to apply the appropriate rule in each case

(?(?=regex)then|else)

Here's how I've implemented it:

(?(?='(?=(\w+\)|\w+\())) <- condition, same as above
'[\w\)\(]+' <- We have a match so we ignore parenthesis
|'\w+' <- Here we don't
)
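
Not every engine accepts a lookahead as the condition. Python's built-in re module, for example, only allows a group number or name there. A conditional of the form (?(?=A)B|C) can be rewritten as (?:(?=A)B|(?!A)C), and the sketch below does that for the pattern above. The names COND, ARG and match_arg are made up for illustration.

```python
import re

# Condition: a quote followed by word characters and then ")" or "(".
COND = r"'(?=\w+\)|\w+\()"

# (?(?=COND)then|else) rewritten as (?:(?=COND)then|(?!COND)else).
ARG = re.compile(
    r"(?:"
    r"(?=" + COND + r")'[\w\)\(]+'"   # then: parentheses allowed inside the quotes
    r"|"
    r"(?!" + COND + r")'\w+'"         # else: plain word characters only
    r")"
)


def match_arg(arg):
    """Return arg if the whole string is an accepted quoted argument, else None."""
    match = ARG.fullmatch(arg)
    return match.group(0) if match else None


if __name__ == "__main__":
    for arg in ("'test'", "'te)st'", "'te())st'", "'te st'"):
        print(arg, "->", match_arg(arg))
```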

PS. I didn't follow everything in the rest of your regex (it may be there to cover other cases), so I haven't tried to modify your original. You can simply swap your check for the second parameter with the one above.

Here's my regex, which matches all your cases:

\B@\w+\('[\w+\s]+',\s+(?(?='(?=(\w+\)|\w+\()))'[\w\)\(]+'|('\w+'|\$\w+))\)
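
For comparison, here is a different technique from the conditional above, sketched in Python: treat each quoted string as a single unit inside the argument list, so any parenthesis inside quotes is consumed together with its string. This is only a sketch under assumptions. The name DIRECTIVE and the helper directive_args are made up, escaped quotes inside strings are not handled, and because Python's re has no recursion this version does not match nested unquoted parentheses the way (?3) in the original pattern does.

```python
import re

DIRECTIVE = re.compile(
    r"""
    \B@(\w+)            # directive name, e.g. extends
    [ \t]*
    (\(                 # opening parenthesis of the argument list
        (?:
            '[^']*'     # a single-quoted string, taken as one unit
          | "[^"]*"     # a double-quoted string, taken as one unit
          | [^()'"]     # any other character except parentheses and quotes
        )*
    \))                 # closing parenthesis outside any quote
    """,
    re.VERBOSE,
)


def directive_args(text):
    """Return the parenthesised argument list of the first directive, or None."""
    match = DIRECTIVE.search(text)
    return match.group(2) if match else None


if __name__ == "__main__":
    for sample in (
        "@extends('template', 'test')",
        "@extends('template', $test)",
        "@extends('template', 'te)st')",
        "@extends('template', '(test')",
    ):
        print(sample, "->", directive_args(sample))
```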

You can see the test cases here

PS. Just to show that it actually works, I've added a few failing test cases

Mayank Raj answered Nov 04 '22 20:11