I am trying to separate non-numbers from numbers in a Python string. Numbers can include floats.
Original String    Desired String
'4x5x6'            '4 x 5 x 6'
'7.2volt'          '7.2 volt'
'60BTU'            '60 BTU'
'20v'              '20 v'
'4*5'              '4 * 5'
'24in'             '24 in'
Here is a very good thread on how to achieve just that in PHP:
Regex: Add space if letter is adjacent to a number
I would like to manipulate the strings above in Python.
The following piece of code works for the first example, but not for the others:
new_element = []
result = [re.split(r'(\d+)', s) for s in (unit)]
for elements in result:
    for element in elements:
        if element != '':
            new_element.append(element)
    new_element = ' '.join(new_element)
    break
Explanation: numbers are separated from the other characters by a space. Method #1: using regex + sub() + lambda. Here an appropriate regex finds the alphabetic runs, sub() performs the replacement, and a lambda adds the spaces in between.
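A minimal sketch of the sub() + lambda idea, applied to the strings from the question. The function name `space_out` and the exact pattern are my own assumptions, not taken from the quoted tutorial, and the sketch assumes the input contains no whitespace to begin with:

```python
import re

def space_out(text):
    # Hypothetical helper: match either a number (with an optional decimal
    # part) or a run of characters that are neither digits nor whitespace,
    # and append a space after every match.
    spaced = re.sub(r"\d+(?:\.\d+)?|[^\d\s]+", lambda m: m.group(0) + " ", text)
    # The last token also gets a trailing space, so strip it off.
    return spaced.strip()

for s in ["4x5x6", "7.2volt", "60BTU", "20v", "4*5", "24in"]:
    print(repr(s), repr(space_out(s)))
```

If the input may already contain spaces, collapsing repeated whitespace afterwards (for example with `' '.join(result.split())`) is probably needed.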
4) Using the regex sub() function with a function as the replacement. Suppose you have a list of strings in which each element contains both a letter and a number, and you want to square the number in each element. For example, A1 stays A1, A2 becomes A4, and A3 becomes A9.
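The squaring example could look roughly like this. The names `square` and `items` are illustrative assumptions, since the original listing is not included here:

```python
import re

def square(match):
    # The replacement function receives a match object and must return
    # the replacement text as a string.
    return str(int(match.group()) ** 2)

items = ["A1", "A2", "A3"]
squared = [re.sub(r"\d+", square, item) for item in items]
print(squared)  # ['A1', 'A4', 'A9']
```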
RegEx can be used to check whether a string contains a specified search pattern. Python has a built-in package called re for working with regular expressions. Once you have imported the re module, you can start using regular expressions. The re module offers a set of functions that let us search a string for a match.
Summary: in this tutorial, you’ll learn about the Python regex sub() function, which returns a string after replacing the matched pattern with a replacement. sub() is a function in the built-in re module, which handles regular expressions. Its syntax is re.sub(pattern, repl, string, count=0, flags=0).
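As a quick illustration of the optional `count` and `flags` arguments of `re.sub`, here is a small sketch. The sample strings and variable names are just assumptions chosen to match the question:

```python
import re

text = "4x5x6"

# By default every match is replaced.
all_replaced = re.sub(r"x", " x ", text)

# count=1 limits the replacement to the first match only.
first_only = re.sub(r"x", " x ", text, count=1)

# flags=re.IGNORECASE makes the pattern match regardless of case.
ignore_case = re.sub(r"btu", " BTU", "60btu", flags=re.IGNORECASE)

print(all_replaced)  # 4 x 5 x 6
print(first_only)    # 4 x 5x6
print(ignore_case)   # 60 BTU
```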
Easy! Just replace each number with itself padded by spaces, using a regex backreference, and don't forget to strip the leading and trailing whitespace. Please try this code:
import re
the_str = "4x5x6"
print(re.sub(r"([0-9]+(\.[0-9]+)?)", r" \1 ", the_str).strip())  # \1 refers to the first capture group in ()
I used split, like you did, but modified it like this:
>>> import re
>>> tcs = ['123', 'abc', '4x5x6', '7.2volt', '60BTU', '20v', '4*5', '24in', 'google.com-1.2', '1.2.3']
>>> pattern = r'(-?[0-9]+\.?[0-9]*)'
>>> for test in tcs: print(repr(test), repr(' '.join(segment for segment in re.split(pattern, test) if segment)))
'123' '123'
'abc' 'abc'
'4x5x6' '4 x 5 x 6'
'7.2volt' '7.2 volt'
'60BTU' '60 BTU'
'20v' '20 v'
'4*5' '4 * 5'
'24in' '24 in'
'google.com-1.2' 'google.com -1.2'
'1.2.3' '1.2 . 3'
Seems to have the desired behavior.
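Note that you have to remove the empty strings from the beginning/end of the list before joining it: when the pattern matches at the very start or end of the string, re.split returns an empty string there. See this question for an explanation.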
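To see where those empty strings come from, here is a small sketch using the same pattern on one of the test strings. The variable names are mine:

```python
import re

pattern = r'(-?[0-9]+\.?[0-9]*)'

# The string starts and ends with a match, so re.split reports an empty
# segment before the first number and after the last one.
parts = re.split(pattern, '4x5x6')
print(parts)  # ['', '4', 'x', '5', 'x', '6', '']

# Joining everything keeps stray spaces at both ends...
unfiltered = ' '.join(parts)

# ...while dropping the empty segments first gives the desired result.
filtered = ' '.join(segment for segment in parts if segment)
print(repr(unfiltered), repr(filtered))
```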