I have my Regular Expression /'(.*)(?:(?:'\s*,\s*)|(?:'\)))/
and my test code ('He said, "You're cool."' , 'Rawr')
(My test code simulates parameters being passed into a function.)
I will explain my Regular Expression as I understand it and hopefully a few of you can shed some light on my problem.
1) /' means the matched string must begin with a ' (the leading / is only the regex delimiter)
2) (.*) means capture any character except \n, 0 or more times
3) (?:(?:4)|(?:5)) means don't capture, but try step 4 and, if that doesn't work, try step 5
4) (?:'\s*,\s*) means don't capture, but there needs to be a ' followed by 0 or more whitespace characters, then a , followed by 0 or more whitespace characters
5) (?:'\)) means don't capture, but there needs to be a ' immediately followed by )
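The five steps above can be written out as a commented pattern. This is only a sketch: it assumes Python's re module and its re.VERBOSE flag, which the question does not mention, and the name pattern is made up for illustration.

```python
import re

# The pattern from the question, split across lines with one comment per step.
# re.VERBOSE makes the engine ignore the layout whitespace and the # comments.
pattern = re.compile(r"""
    '                   # step 1: a literal opening quote
    (.*)                # step 2: capture any characters except a newline, 0 or more
    (?:                 # step 3: non-capturing group with two alternatives
        (?:'\s*,\s*)    # step 4: quote, optional whitespace, comma, optional whitespace
      |
        (?:'\))         # step 5: quote immediately followed by a closing parenthesis
    )
""", re.VERBOSE)

# Only step 2 captures, so the pattern has exactly one group.
single_param = pattern.search("('Rawr')").group(1)
print(pattern.groups)  # 1
print(single_param)    # Rawr
```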
So it seems that it should return this, with the + signs marking where the captured group starts and ends (and this is what I want): '+He said, "You're cool."+' ,
But it returns: '+He said, "You're cool."' , 'Rawr+')
If I change my test code to ('He said, "You're cool."' , 'Rawr' (no end parenthesis), it returns what I want. As soon as I add that last parenthesis, though, my OR operator seems to do whatever it wants. I want it to test first whether there is a comma and stop there if there is one, and only check for a parenthesis if there is no comma.
I've tried switching the spots of step 4 and step 5, but still the OR operator seems to always default to the (?:'\)) side.
How can I match the shortest amount possible?
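I don't think your problem is the OR operator; it is the greediness of the .*. A greedy .* first consumes the rest of the string and then backtracks one character at a time until the rest of the pattern matches. The first position where that succeeds is the final ', where '\) matches, so the result is 'He said, "You're cool."' , 'Rawr+'). Try .*? instead! The lazy version starts with the shortest possible match and extends it only as far as needed, so it stops at the first ' that is followed by a comma or a closing parenthesis.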
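To see the difference side by side, here is a minimal sketch. It assumes Python's re module (the question does not say which language is in use), and the variable names are invented for the example.

```python
import re

# Test string from the question: simulated function parameters.
text = """('He said, "You're cool."' , 'Rawr')"""

# Original pattern: the greedy .* grabs everything, then backtracks from the end.
greedy = re.compile(r"'(.*)(?:(?:'\s*,\s*)|(?:'\)))")

# Same pattern with a lazy quantifier: .*? grows one character at a time.
lazy = re.compile(r"'(.*?)(?:(?:'\s*,\s*)|(?:'\)))")

greedy_capture = greedy.search(text).group(1)
lazy_capture = lazy.search(text).group(1)

print(greedy_capture)  # He said, "You're cool."' , 'Rawr
print(lazy_capture)    # He said, "You're cool."

# With the lazy version, repeated matching yields each parameter in turn.
params = lazy.findall(text)
print(params)
```

The apostrophe in You're does not end the lazy match because it is followed by r, not by a comma or a closing parenthesis.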