I have a CSV string in which some of the items may be enclosed in {} and contain commas. I want to collect the values in a list.
What is the most Pythonic way to do this?
Example 1: 'a,b,c', expected output ['a', 'b', 'c']
Example 2: '{aa,ab}, b, c', expected output ['{aa,ab}','b','c']
Example 3: '{aa,ab}, {bb,b}, c', expected output ['{aa,ab}', '{bb,b}', 'c']
I have tried s.split(','). It works for example 1 but breaks examples 2 and 3, because it also splits on the commas inside the braces.
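To make the failure concrete, here is a minimal reproduction using example 2 (the variable names are only for illustration):

```python
s = '{aa,ab}, b, c'

# str.split knows nothing about braces, so the comma inside {aa,ab}
# is treated like any other separator, and the spaces are kept.
parts = s.split(',')
print(parts)  # ['{aa', 'ab}', ' b', ' c']
```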
I believe this question (How to split but ignore separators in quoted strings, in python?) is very similar to my problem, but I can't figure out the proper regex syntax to use.
In fact, the solution is very similar to the one in the linked question:
import re
PATTERN = re.compile(r'''\s*((?:[^,{]|\{[^{]*\})+)\s*''')
data = '{aa,ab}, {bb,b}, c'
print(PATTERN.split(data)[1::2])
will give:
['{aa,ab}', '{bb,b}', 'c']
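In case the [1::2] slice looks like magic: when the pattern contains a capturing group, re.split returns the captured text interleaved with the pieces between matches, so the items sit at the odd indices. A small sketch that prints the intermediate list (the name raw is mine, not part of the answer above), and shows that findall with the same pattern appears to give the same result more directly:

```python
import re

PATTERN = re.compile(r'''\s*((?:[^,{]|\{[^{]*\})+)\s*''')
data = '{aa,ab}, {bb,b}, c'

# With a capturing group, split() keeps the captured items in the result,
# alternating with the text between matches (here: '' and the commas).
raw = PATTERN.split(data)
print(raw)        # ['', '{aa,ab}', ',', '{bb,b}', ',', 'c', '']

# The captured items are at indices 1, 3, 5, ...
print(raw[1::2])  # ['{aa,ab}', '{bb,b}', 'c']

# findall() returns only the captured group for each match.
print(PATTERN.findall(data))  # ['{aa,ab}', '{bb,b}', 'c']
```

Note that the items here come back without leading spaces only because the leading \s* consumes them; trailing spaces before a comma (as in 'a , b') would stay attached to the item.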
A more readable way (at least to me) is to describe what you are looking for: either something between braces { } or a run of word characters (letters, digits and underscores):
import re
examples = [
    'a,b,c',
    '{aa,ab}, b, c',
    '{aa,ab}, {bb,b}, c'
]
for example in examples:
    # match either a {...} group (non-greedy) or a run of word characters
    print(re.findall(r'(\{.+?\}|\w+)', example))
It prints:
['a', 'b', 'c']
['{aa,ab}', 'b', 'c']
['{aa,ab}', '{bb,b}', 'c']
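One caveat with the \w+ approach: it only matches letters, digits and underscores, so an item such as foo-bar would come back as two items, and nested braces are not handled. If that matters for your data, a regex-free alternative is to walk the string and split only on commas that are outside braces. This is just a sketch of that idea, not something from the answers above; split_outside_braces is a hypothetical helper name, and it assumes the braces are balanced:

```python
def split_outside_braces(text):
    """Split text on commas that are not inside {...}, stripping whitespace.

    Hypothetical helper; assumes braces are balanced.
    """
    items = []
    current = []
    depth = 0  # how many unclosed '{' we are currently inside
    for ch in text:
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
        if ch == ',' and depth == 0:
            # top-level comma: finish the current item
            items.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append(''.join(current).strip())
    return items


print(split_outside_braces('{aa,ab}, {bb,b}, c'))  # ['{aa,ab}', '{bb,b}', 'c']
print(split_outside_braces('foo-bar, {x,y}'))      # ['foo-bar', '{x,y}']
```

It is longer than the one-line regex, but it keeps items with punctuation intact and copes with nested braces such as '{a,{b,c}}, d'.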