My string is:
text = "a) Baghdad, Iraq b) United Arab Emirates (possibly)"
I want to split this into a list like:
["Baghdad, Iraq","United Arab Emirates (possibly)"]
The code I have used does not give me the desired result:
re.split('\\s*([a-zA-Z\\d][).]|•)\\s*(?=[A-Z])', text)
Please help me with this.
You could build the desired list for your example using a list comprehension and a second regex:
import re
text = "a) Baghdad, Iraq b) United Arab Emirates (possibly)"
# Split on the "a)"-style markers. The capturing group keeps the markers in the
# result, so a second pattern filters out the markers and any empty strings.
data = [x for x in re.split(r'((?:^\s*[a-zA-Z0-9]\))|(?:\s+[a-zA-Z0-9]\)))\s*', text)
        if x and not re.match(r"\s*[a-zA-Z0-9]\)", x)]
print(data)
Output:
['Baghdad, Iraq', 'United Arab Emirates (possibly)']
See https://regex101.com/r/wxEEQW/1
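As a side note, the second regex is only needed because re.split returns the text matched by a capturing group along with the pieces. A minimal sketch of the same idea with a non-capturing group, so the markers are dropped by the split itself (the names `marker` and `parts` are just illustrative, and the pattern assumes every marker is a single letter or digit followed by ")"):

```python
import re

text = "a) Baghdad, Iraq b) United Arab Emirates (possibly)"

# (?:...) groups without capturing, so re.split does not return the markers.
# A marker must sit at the start of the string or follow whitespace.
marker = r'(?:^|\s+)[a-zA-Z0-9]\)\s*'

# The match at the very start leaves an empty first piece; drop it.
parts = [p for p in re.split(marker, text) if p]
print(parts)
```

For the example string this prints ['Baghdad, Iraq', 'United Arab Emirates (possibly)'].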
Instead of re.findall, you can simply use re.split:
import re
text = "a) Baghdad, Iraq b) United Arab Emirates (possibly)"
countries = list(filter(None, map(str.rstrip, re.split(r'\w\)\s', text))))
print(countries)
Output:
['Baghdad, Iraq', 'United Arab Emirates (possibly)']
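One caveat with this pattern: `\w\)\s` also matches the "y) " at the end of a parenthetical such as "(possibly)" when more text follows it. A hedged sketch of one way around that, using a negative lookbehind so the marker character cannot be the tail of a longer word (the third item "c) Oman" and the name `text2` are made up here purely to trigger the case):

```python
import re

# Hypothetical input with a third item, so "(possibly)" is followed by a space.
text2 = "a) Baghdad, Iraq b) United Arab Emirates (possibly) c) Oman"

# Original pattern: also splits on the "y) " inside "(possibly) ".
naive = list(filter(None, map(str.rstrip, re.split(r'\w\)\s', text2))))

# (?<!\w) requires that the marker character is not preceded by a word
# character, so "y)" in "possibly)" is skipped while "a)", "b)", "c)" match.
countries = list(filter(None, map(str.rstrip, re.split(r'(?<!\w)\w\)\s', text2))))

print(naive)
print(countries)
```

The first list loses the end of "(possibly)", while the second keeps all three items intact. This still assumes single-character markers; a parenthesised marker like "(a) " would need a different pattern.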