I want to split a string by comma:
"a,s".split ',' # => ['a', 's']
I don't want to split a substring if it is wrapped in parentheses:
"a,s(d,f),g,h"
should yield:
['a', 's(d,f)', 'g', 'h']
Any suggestions?
To deal with nested parentheses, you can use:
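txt = "a,s(d,f(4,5)),g,h"
# Group 1 captures each field; group 2 matches balanced (possibly nested) parentheses.
pattern = Regexp.new('((?:[^,(]+|(\((?>[^()]+|\g<-1>)*\)))+)')
# scan returns [group 1, group 2] pairs, so keep only the first element of each.
puts txt.scan(pattern).map(&:first)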
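As a minimal sketch, the pattern above can be wrapped in a small helper and checked against the input from the question. The names NESTED and split_top_level are hypothetical and only used for illustration. Note that, unlike String#split, this scan-based approach drops empty fields.

```ruby
# Hypothetical helper; the pattern itself is the one shown above.
NESTED = Regexp.new('((?:[^,(]+|(\((?>[^()]+|\g<-1>)*\)))+)')

def split_top_level(text)
  # Each scan result is [group 1, group 2]; group 1 is the whole field.
  text.scan(NESTED).map(&:first)
end

p split_top_level("a,s(d,f),g,h")      # => ["a", "s(d,f)", "g", "h"]
p split_top_level("a,s(d,f(4,5)),g,h") # => ["a", "s(d,f(4,5))", "g", "h"]
p split_top_level("a,,b")              # => ["a", "b"] (the empty field is dropped)
```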
Pattern details:
(                   # first capturing group
  (?:               # open a non-capturing group
    [^,(]+          # all characters except , and (
  |                 # OR
    (               # open the second capturing group
      \(            # a literal (
      (?>           # open an atomic group
        [^()]+      # all characters except parentheses
      |             # OR
        \g<-1>      # the last capturing group (you can also write \g<2>)
      )*            # close the atomic group and repeat it zero or more times
      \)            # a literal )
    )               # close the second capturing group
  )+                # close the non-capturing group and repeat it
)                   # close the first capturing group
The second capturing group describes nested parentheses: it can contain characters that are not parentheses, or the capturing group itself. It is a recursive pattern.
Inside the pattern, you can refer to a capturing group by its number (\g<2> for the second capturing group), by its relative position (\g<-1>, the first group to the left of the current position in the pattern), or by its name if you use named capturing groups.
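For illustration, here is a sketch of the same pattern written with named groups. The group names item and paren, the constant NAMED and the method split_named are assumptions made for this example, not part of the original pattern. Ruby does not allow mixing named groups with numbered group calls, so both groups are named here.

```ruby
# Hypothetical names: "item" is the whole field, "paren" is the recursive part.
NAMED = /(?<item>(?:[^,(]+|(?<paren>\((?>[^()]+|\g<paren>)*\)))+)/

def split_named(text)
  # scan yields [item, paren] for each match; keep the item.
  text.scan(NAMED).map(&:first)
end

p split_named("a,s(d,f(4,5)),g,h") # => ["a", "s(d,f(4,5))", "g", "h"]
```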
Note: you can allow unbalanced single parentheses by adding |[()] before the end of the non-capturing group. Then a,b(,c will give you ['a', 'b(', 'c'].
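A short sketch of that variant, with |[()] added as the last alternative of the non-capturing group. The names LENIENT and split_lenient are hypothetical. Balanced input is still handled as before, because the recursive group is tried before the single-parenthesis alternative.

```ruby
# Same pattern as above, plus |[()] to accept a stray ( or ).
LENIENT = Regexp.new('((?:[^,(]+|(\((?>[^()]+|\g<-1>)*\))|[()])+)')

def split_lenient(text)
  text.scan(LENIENT).map(&:first)
end

p split_lenient("a,b(,c")       # => ["a", "b(", "c"]
p split_lenient("a,s(d,f),g,h") # => ["a", "s(d,f)", "g", "h"]
```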