I found some answers online, but I have no experience with regular expressions, which I believe is what is needed here.
I have a string that needs to be split by either a ';' or ', '. That is, the delimiter has to be either a semicolon or a comma followed by a space. Individual commas without trailing spaces should be left untouched.
Example string:
"b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], mesitylene [000108-67-8]; polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]"
should be split into a list containing the following:
('b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3]', 'mesitylene [000108-67-8]', 'polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]')
To split a string with multiple delimiters in Python, use the re.split() function. re.split() splits the string at each occurrence of the pattern.
In JavaScript, use the String.split() method to split a string with multiple separators, e.g. str.split(/[-_]+/). The split method can be passed a regular expression that matches any of several separator characters.
split(): This is the most efficient and commonly used method to split on multiple characters at once. It makes use of regex (regular expressions) in order to do this. The line re.
Use the str.split() method to split by a delimiter. If the argument is omitted, the string is split on whitespace, such as spaces, newlines (\n), and tabs (\t). Consecutive whitespace is treated as a single separator. A list of the words is returned.
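As a quick illustration of the whitespace behaviour described above, here is a minimal sketch. The sample string is made up for this example.

```python
# Hypothetical sample text mixing double spaces, a newline, and a tab.
text = "one  two\nthree\tfour"

# With no argument, str.split() splits on runs of whitespace,
# so consecutive whitespace characters do not produce empty strings.
words = text.split()
print(words)

# With an explicit single-space separator, every single space is a
# delimiter, so the double space yields an empty string.
parts = text.split(" ")
print(parts)
```

Note that str.split() accepts only one literal separator, which is why splitting on several different delimiters needs either re.split() or a replace-then-split step.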
Luckily, Python has this built-in :)
import re
re.split('; |, ', str)
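For reference, here is that pattern applied to the example string from the question, as a minimal sketch. The variable names text and parts are just placeholders chosen for this example.

```python
import re

# The example string from the question.
text = (
    "b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], "
    "mesitylene [000108-67-8]; "
    "polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]"
)

# Split on "; " or ", ". The "|" means "either alternative".
# Commas without a trailing space (as in "1,2-dihydro") do not match.
parts = re.split("; |, ", text)

for part in parts:
    print(part)
```

The pattern assumes the semicolon is also followed by a space, as it is in the example. If a bare ';' can occur, something like r';\s*|, ' might be a better fit.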
Update:
Following your comment:
>>> a = 'Beautiful, is; better*than\nugly'
>>> import re
>>> re.split(r'; |, |\*|\n', a)
['Beautiful', 'is', 'better', 'than', 'ugly']
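If the list of delimiters keeps growing, escaping each one by hand gets error-prone. One possible approach is to build the pattern from a list with re.escape(). The helper name split_on_any below is made up for this sketch and is not part of the re module.

```python
import re

def split_on_any(text, delimiters):
    """Split text on any of the given literal delimiter strings.

    split_on_any is a hypothetical helper, not a standard-library function.
    """
    # Try longer delimiters first so that, for example, ", " is
    # preferred over "," if both are present in the list.
    ordered = sorted(delimiters, key=len, reverse=True)
    # re.escape() neutralises regex metacharacters such as "*".
    pattern = "|".join(re.escape(d) for d in ordered)
    return re.split(pattern, text)

a = 'Beautiful, is; better*than\nugly'
result = split_on_any(a, ['; ', ', ', '*', '\n'])
print(result)
```

This treats every delimiter as plain text, so it will not help if you actually need regex features such as \s+ in a delimiter.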
Do a str.replace('; ', ', ') and then a str.split(', ').
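Spelled out on the string from the question, the replace-then-split idea might look like this. It is a sketch without regular expressions, and the variable names are placeholders.

```python
# The example string from the question.
text = (
    "b-staged divinylsiloxane-bis-benzocyclobutene [124221-30-3], "
    "mesitylene [000108-67-8]; "
    "polymerized 1,2-dihydro-2,2,4- trimethyl quinoline [026780-96-1]"
)

# Step 1: turn every "; " into ", " so only one delimiter remains.
normalised = text.replace("; ", ", ")

# Step 2: split on that single remaining delimiter.
parts = normalised.split(", ")
print(parts)
```

This works here because rewriting "; " as ", " cannot create an unwanted split. With more than two delimiters it needs one replace() call per extra delimiter, at which point re.split() is usually simpler.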