I want to split a string into a list around anything that is not a numeral or a dot. The split method only supports splitting on a positive match. Is a regex the best route to take in this situation?
For example, given the string "10.23, 10.13.21; 10.1 10.5 and 10.23.32"
This should return the list ['10.23', '10.13.21', '10.1', '10.5', '10.23.32']
As such I believe the best regex to use in this situation would be...
[\d\.]+
Is this the best way to handle such a case?
In case you are thinking of re.findall: you can use re.split with an inverted version of your regex:
In [1]: import re
In [2]: s = "10.23, 10.13.21; 10.1 10.5 and 10.23.32"
In [3]: re.split(r'[^\d\.]+', s)
Out[3]: ['10.23', '10.13.21', '10.1', '10.5', '10.23.32']
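For comparison, here is a small sketch of both approaches side by side: re.findall with the positive class from the question, and re.split with the inverted class. The names found, parts, and padded are just illustrative. One thing worth knowing is that re.split returns empty strings when the input starts or ends with a separator, so the last line filters those out.

```python
import re

s = "10.23, 10.13.21; 10.1 10.5 and 10.23.32"

# Positive match: collect every run of digits and dots.
found = re.findall(r'[\d.]+', s)

# Negative match: split on every run of anything that is not a digit or dot.
parts = re.split(r'[^\d.]+', s)

# Hypothetical input that begins and ends with separator characters.
padded = "and " + s + ";"
raw_padded = re.split(r'[^\d.]+', padded)    # has '' at both ends
clean_padded = [p for p in raw_padded if p]  # drop the empty strings
```

On the sample string both calls give the same list, so the choice is mostly about how you want leading and trailing separators handled.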
If you want a solution other than regex, you could use str.translate to turn everything other than '.0123456789' into whitespace and then call split():
In [69]: mystr
Out[69]: '10.23, 10.13.21; 10.1 10.5 and 10.23.32'
In [70]: mystr.translate(' '*46 + '. ' + '0123456789' + ' '*198).split()
Out[70]: ['10.23', '10.13.21', '10.1', '10.5', '10.23.32']
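The table above is the 256-character string form that Python 2's str.translate expects. A rough sketch of the same idea for Python 3, where str.translate takes a mapping of code points, might look like this. The names keep and table are made up for the example, and the table is built only from the characters that actually occur in the input.

```python
import string

mystr = '10.23, 10.13.21; 10.1 10.5 and 10.23.32'

# Characters we want to keep: the ten digits and the dot.
keep = set(string.digits + '.')

# Map every other character that occurs in the input to a space.
table = {ord(ch): ' ' for ch in set(mystr) if ch not in keep}

# Translate, then split on the resulting whitespace.
result = mystr.translate(table).split()
```

Because split() with no arguments collapses runs of whitespace, leading and trailing separators do not produce empty strings here.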
Hope this helps