
Python: how to split a string by multiple strings

Given a string:

str = "apple AND orange OR banana"

I want to split it by "AND" or "OR". The expected result is

['apple', 'orange', 'banana']

Is there a simple way to do this in Python?

Thanks!

Amy asked Dec 15 '22 16:12


2 Answers

You can use a regex to split on any run of one or more uppercase letters:

>>> import re
>>> tr = "apple AND orange OR banana"
>>> re.split(r'[A-Z]+',tr)
['apple ', ' orange ', ' banana']

But if you only want to split on AND or OR:

>>> re.split(r'AND|OR',tr)
['apple ', ' orange ', ' banana']

To drop the surrounding spaces as well, add a space to the character class. This only works if you are sure the words themselves contain no uppercase letters or spaces:

>>> re.split(r'[A-Z ]+',tr)
['apple', 'orange', 'banana']
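
The caveat above can be made concrete with a short sketch. The sample string with capitalised, multi-word items is an assumption added for illustration, not part of the question. It compares the character-class pattern with a pattern that matches only the keywords and the whitespace around them:

```python
import re

# Hypothetical input: items contain capital letters and spaces.
text = "Granny Smith AND blood orange OR banana"

# The character class treats every capital letter and every space as a delimiter.
by_class = re.split(r'[A-Z ]+', text)

# Matching the keywords plus their surrounding whitespace leaves the items intact.
by_keyword = re.split(r'\s+(?:AND|OR)\s+', text)

print(by_class)    # ['', 'ranny', 'mith', 'blood', 'orange', 'banana']
print(by_keyword)  # ['Granny Smith', 'blood orange', 'banana']
```

Note that `\s+` requires whitespace on both sides of the keyword, so a keyword at the very start or end of the string is not treated as a delimiter by this pattern.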

If the string starts or ends with AND or OR, re.split will produce an empty string in the result. To get rid of it, you can loop over the split list and keep only the non-empty items, or, more elegantly, use re.findall with r'[^A-Z ]+' as its pattern:

>>> tr = "AND apple AND orangeOR banana"
>>> re.split(r'\s?(?:AND|OR)\s?',tr)
['', 'apple', 'orange', 'banana']
>>> re.split(r'[A-Z ]+',tr)
['', 'apple', 'orange', 'banana']
>>> [i for i in re.split(r'[A-Z ]+',tr) if i]
['apple', 'orange', 'banana']
>>> re.findall(r'[^A-Z ]+',tr)
['apple', 'orange', 'banana']
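
If you need this in more than one place, the split-and-filter idea can be wrapped in a small helper. This is only a sketch: the function name `split_on_keywords` and its `keywords` parameter are made up for illustration, and it assumes the keywords should match only as whole words.

```python
import re

def split_on_keywords(text, keywords=("AND", "OR")):
    """Split text on whole-word keywords and drop empty pieces."""
    # re.escape guards against keywords that contain regex metacharacters.
    alternation = "|".join(re.escape(k) for k in keywords)
    # \b keeps a keyword from matching inside a longer word such as "ORDER".
    pattern = r'\b(?:' + alternation + r')\b'
    # Strip whitespace from each piece, then discard pieces that end up empty.
    pieces = (piece.strip() for piece in re.split(pattern, text))
    return [piece for piece in pieces if piece]

print(split_on_keywords("AND apple AND orange OR banana OR"))
# ['apple', 'orange', 'banana']
```

Because of the `\b` word boundaries, a keyword glued to a word (as in "orangeOR" above) is not treated as a delimiter; drop the boundaries if you want that behaviour.
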
Mazdak answered Dec 26 '22 14:12


I can think of two ways to accomplish this:

In [230]: s = "apple AND orange OR banana"

In [231]: delims = ["AND", "OR"]

In [232]: for d in delims:
   .....:     s = s.replace(d, '-')
   .....:     

In [233]: s.split('-')
Out[233]: ['apple ', ' orange ', ' banana']

Or:

In [234]: s = "apple AND orange OR banana"

In [235]: delims = ["AND", "OR"]

In [236]: for d in delims:
   .....:     s = s.replace(d, ' ')
   .....:     

In [237]: s.split()
Out[237]: ['apple', 'orange', 'banana']
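
Both variants have a catch: splitting on '-' breaks if the text itself contains a hyphen, and splitting on whitespace breaks up multi-word items. A possible way around both, sketched below, is to replace the delimiters with a sentinel character and strip each piece. The function name `split_without_regex` and the choice of "\x00" as sentinel are assumptions for illustration; the sentinel must not occur in the input.

```python
def split_without_regex(text, delims=("AND", "OR"), sentinel="\x00"):
    """Split text on any of the delimiters using only str methods."""
    # Replace every delimiter with a single sentinel assumed absent from the text.
    for d in delims:
        text = text.replace(d, sentinel)
    # Strip whitespace around each piece and drop pieces that are empty.
    return [part.strip() for part in text.split(sentinel) if part.strip()]

print(split_without_regex("green apple AND blood-orange OR banana"))
# ['green apple', 'blood-orange', 'banana']
```

As with the loops above, str.replace matches the delimiters anywhere, including inside a longer uppercase word, so this is only safe when the items cannot contain "AND" or "OR".
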
inspectorG4dget answered Dec 26 '22 15:12