I'm trying to separate the [0-9] and [A-Z] in strings like these:
100M
20M1D80M
20M1I79M
20M10000N80M
I tried using the Python re module, and the following is the code I used:
>>> import re
>>> num_alpha = re.compile('(([0-9]+)([A-Z]))+')
>>> str1 = "100M"
>>> n_a_match = num_alpha.match(str1)
>>> n_a_match.group(2), n_a_match.group(3)
('100', 'M')  # just what I want
>>> str1 = "20M10000N80M"
>>> n_a_match = num_alpha.match(str1)
>>> n_a_match.groups()
('80M', '80', 'M')  # only the last one; how can I get the first two?
# expected result: ('20M', '20', 'M', '10000N', '10000', 'N', '80M', '80', 'M')
This regular expression works well for strings that contain only one number-letter pair, but not for strings that contain several. How can I handle that using regular expressions?
I suggest using re.findall. If you intend to iterate over the results, rather than building a list, you could use re.finditer instead. Here's an example of how that would work:
>>> re.findall("(([0-9]+)([A-Z]))", "20M10000N80M")
[('20M', '20', 'M'), ('10000N', '10000', 'N'), ('80M', '80', 'M')]
If you don't want the combined numbers+letters string, you can remove the outer parentheses from the match and just get the separate parts:
>>> re.findall("([0-9]+)([A-Z])", "20M10000N80M")
[('20', 'M'), ('10000', 'N'), ('80', 'M')]
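For the iteration case mentioned earlier, a minimal sketch with re.finditer might look like the following. The names PAIR and iter_pairs are made up for illustration, and converting the number to int is an assumption about what you want to do with it:

```python
import re

# Hypothetical names; the pattern is the same two-group one used with re.findall.
PAIR = re.compile(r"([0-9]+)([A-Z])")

def iter_pairs(s):
    """Lazily yield (number, letter) pairs found in s."""
    for m in PAIR.finditer(s):
        # group(1) is the run of digits, group(2) is the single letter.
        # Converting to int is an assumption, not part of the original question.
        yield int(m.group(1)), m.group(2)

for number, letter in iter_pairs("20M10000N80M"):
    print(number, letter)
```

Each match object is produced only when the loop asks for it, so no intermediate list is built.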
Or, if you don't want tuples at all (and you don't need to worry about malformed input, such as strings with several letters in a row), you could change the pattern to an alternation, and get the values one by one:
>>> re.findall("([0-9]+|[A-Z])", "20M10000N80M")
['20', 'M', '10000', 'N', '80', 'M']
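If malformed input is a concern, one possible approach is to check the whole string with re.fullmatch before splitting it. This is only a sketch: WHOLE, TOKEN and split_tokens are hypothetical names, and it assumes a valid string is nothing but number+letter pairs:

```python
import re

# Hypothetical names, not from the original answer.
# Assumption: a valid string consists solely of repeated number+letter pairs.
WHOLE = re.compile(r"(?:[0-9]+[A-Z])+")
TOKEN = re.compile(r"[0-9]+|[A-Z]")

def split_tokens(s):
    """Return the flat list of tokens, or raise ValueError on malformed input."""
    if WHOLE.fullmatch(s) is None:
        raise ValueError(f"malformed input: {s!r}")
    return TOKEN.findall(s)

print(split_tokens("20M10000N80M"))
```

With this check, a string such as "20MM" (two letters in a row) is rejected instead of being silently split.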