
Split string with multiple delimiters in Python [duplicate]

I found some answers online, but I have no experience with regular expressions, which I believe is what is needed here.

I have a string that needs to be split by either a ';' or ', '. That is, the delimiter has to be either a semicolon or a comma followed by a space. Individual commas without trailing spaces should be left untouched.

Example string:

"b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], mesitylene [000108-67-8]; polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]" 

should be split into a list containing the following:

('b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3]', 'mesitylene [000108-67-8]', 'polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]')
gt565k asked Feb 14 '11 23:02



2 Answers

Luckily, Python has this built-in :)

import re
parts = re.split('; |, ', text)  # text is the string to split; avoid naming it str, which shadows the built-in

Update:
Following your comment:

>>> a = 'Beautiful, is; better*than\nugly'
>>> import re
>>> re.split(r'; |, |\*|\n', a)
['Beautiful', 'is', 'better', 'than', 'ugly']
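As a rough check of the pattern against the example string from the question, here is a minimal sketch. The variable names (text, parts, loose) are made up for illustration. The second pattern is an assumption, not part of the answer: it also splits on a semicolon that has no space after it, in case that is what the question intends.

```python
import re

# The example string from the question.
text = ("b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], "
        "mesitylene [000108-67-8]; "
        "polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]")

# Split on "; " or ", ". A comma with no trailing space
# (as in "1,2-dihydro") does not match and is left alone.
parts = re.split(r'; |, ', text)

# Assumed variant: a semicolon followed by any amount of whitespace
# (including none), or a comma followed by a single space.
loose = re.split(r';\s*|, ', text)

for part in parts:
    print(part)
```

Both patterns give the same three items for this string; they differ only on input such as 'a;b', which the first pattern leaves unsplit.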
Jonathan Livni answered Sep 22 '22 05:09



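You can also do this without regular expressions: first call str.replace('; ', ', ') so that every delimiter becomes ', ', then call str.split(', ') on the result.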
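A minimal sketch of this replace-then-split approach, run on the example string from the question. The names text, normalized and parts are hypothetical, chosen only for this example.

```python
# The example string from the question.
text = ("b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], "
        "mesitylene [000108-67-8]; "
        "polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]")

# Step 1: turn every "; " into ", " so only one delimiter remains.
normalized = text.replace('; ', ', ')

# Step 2: split on that single delimiter.
parts = normalized.split(', ')

for part in parts:
    print(part)
```

This stays readable for two delimiters, but each extra delimiter needs another replace call, so a single re.split pattern may scale better.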

Joe answered Sep 19 '22 05:09