Can anyone help me out with a regular expression for a string such as this:
[Header 1], [Head,er 2], Header 3
so that I can split this into chunks like:
[Header 1]
[Head,er 2]
Header 3
I have gotten as far as this:
(?<=,|^).*?(?=,|$)
Which will give me:
[Header 1]
[Head
er 2]
Header 3
In this case it's easier to split on the delimiters (commas) than to match the tokens (or chunks). Identifying the commas that are delimiters takes a relatively simple lookahead:
,(?=[^\]]*(?:\[|$))
Each time you find a comma, you do a lookahead for one of three things. If you find a closing square bracket first, the comma is inside a pair of brackets, so it's not a delimiter. If you find an opening bracket or the end of the line/string, it's a delimiter.
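As a rough illustration, here is how that split might look in Python. The function name `split_headers` and the whitespace stripping are my own additions, and the sketch assumes the brackets are balanced and never nested.

```python
import re

# A comma counts as a delimiter only if, scanning forward, we reach an
# opening bracket or the end of the string before any closing bracket.
DELIMITER = re.compile(r",(?=[^\]]*(?:\[|$))")

def split_headers(line):
    # Split on delimiter commas, then trim the space left after each comma.
    return [chunk.strip() for chunk in DELIMITER.split(line)]

print(split_headers("[Header 1], [Head,er 2], Header 3"))
# ['[Header 1]', '[Head,er 2]', 'Header 3']
```

The comma inside `[Head,er 2]` is left alone because the lookahead runs into `]` before it can find `[` or the end of the string.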
\[.*?\]
Forget the commas, you don't care about them. :)
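A quick sketch of the matching approach in Python, in case it helps. On the sample string the bare pattern only picks up the bracketed chunks, so the second pattern below adds an alternative for unbracketed ones; that alternation is my own assumption and not part of the answer above.

```python
import re

line = "[Header 1], [Head,er 2], Header 3"

# The pattern as given: matches only the bracketed chunks.
bracketed = re.findall(r"\[.*?\]", line)
print(bracketed)
# ['[Header 1]', '[Head,er 2]']

# Assumed extension: a bracketed chunk, or else a run of non-comma
# characters that does not start with whitespace.
TOKEN = re.compile(r"\[[^\]]*\]|[^,\s][^,]*")
tokens = TOKEN.findall(line)
print(tokens)
# ['[Header 1]', '[Head,er 2]', 'Header 3']
```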
Variations of this question have been discussed before.
For instance:
Short answer: regular expressions are probably not the right tool for this. Write a proper parser. An FSM implementation is easy.
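For what it's worth, a minimal two-state machine along those lines could look like this in Python. The name `split_outside_brackets` is hypothetical, and the sketch assumes brackets are balanced and not nested.

```python
def split_outside_brackets(line):
    """Split on commas, ignoring commas between '[' and ']'."""
    chunks = []
    current = []
    inside = False  # state: are we currently inside a [...] pair?
    for ch in line:
        if ch == "[":
            inside = True
        elif ch == "]":
            inside = False
        if ch == "," and not inside:
            # Delimiter comma: close off the current chunk.
            chunks.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    # Whatever is left after the last delimiter is the final chunk.
    chunks.append("".join(current).strip())
    return chunks

print(split_outside_brackets("[Header 1], [Head,er 2], Header 3"))
# ['[Header 1]', '[Head,er 2]', 'Header 3']
```

Handling nested brackets would only need the boolean swapped for a depth counter.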
(?<=,|^)\s*\[[^]]*\]\s*(?=,|$)
Use the [ and ] delimiters to your advantage.