Given a string:
str = "apple AND orange OR banana"
I want to split it by "AND" or "OR". The expected result is
['apple', 'orange', 'banana']
Is there a simple way to do this in Python?
Thanks!
You can use a regex to split on any run of one or more uppercase letters:
>>> import re
>>> tr = "apple AND orange OR banana"
>>> re.split(r'[A-Z]+',tr)
['apple ', ' orange ', ' banana']
But if you just want to split on AND or OR:
>>> re.split(r'AND|OR',tr)
['apple ', ' orange ', ' banana']
To remove the spaces as well, if you are sure that your string contains only distinct single words, you can do:
>>> re.split(r'[A-Z ]+',tr)
['apple', 'orange', 'banana']
If there is an AND or OR at the beginning or end of your string, split will produce an empty string in the result. To get rid of it, you can loop over the split list and filter out the empty items, but a more elegant way is to use re.findall with r'[^A-Z ]+' as its pattern:
>>> tr = "AND apple AND orangeOR banana"
>>> re.split(r'\s?(?:AND|OR)\s?',tr)
['', 'apple', 'orange', 'banana']
>>> re.split(r'[A-Z ]+',tr)
['', 'apple', 'orange', 'banana']
>>> [i for i in re.split(r'[A-Z ]+',tr) if i]
['apple', 'orange', 'banana']
>>> re.findall(r'[^A-Z ]+',tr)
['apple', 'orange', 'banana']
I can think of two ways to accomplish this:
In [230]: s = "apple AND orange OR banana"
In [231]: delims = ["AND", "OR"]
In [232]: for d in delims:
   .....:     s = s.replace(d, '-')
.....:
In [233]: s.split('-')
Out[233]: ['apple ', ' orange ', ' banana']
Or:
In [234]: s = "apple AND orange OR banana"
In [235]: delims = ["AND", "OR"]
In [236]: for d in delims:
   .....:     s = s.replace(d, ' ')
.....:
In [237]: s.split()
Out[237]: ['apple', 'orange', 'banana']
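Both loops replace the delimiter text wherever it occurs, including inside a longer word, and the '-' variant also splits items that already contain a hyphen. As a hedged alternative, the same delims list can be compiled into one regular expression. The helper name split_on_delims is hypothetical, and the whole-word matching assumes the delimiters are plain alphabetic words:

```python
import re

def split_on_delims(text, delims):
    # re.escape guards against delimiters that contain regex metacharacters.
    alternation = "|".join(re.escape(d) for d in delims)
    # Assumption: delimiters are whole words, so \b is placed on both sides.
    pattern = r'\s*\b(?:' + alternation + r')\b\s*'
    # Strip outer whitespace first, then discard any empty pieces.
    return [part for part in re.split(pattern, text.strip()) if part]

print(split_on_delims("apple AND orange OR banana", ["AND", "OR"]))
# ['apple', 'orange', 'banana']
```

This keeps the list-of-delimiters interface from the loops above while doing the split in a single pass.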