python regex: get end digits from a string

Tags: python, regex

I am quite new to Python and regex, and I have the following simple string:

s=r"""99-my-name-is-John-Smith-6376827-%^-1-2-767980716""" 

I would like to extract only the last run of digits in the above string, i.e. 767980716, and I was wondering how I could achieve this using Python regex.

I wanted to do something similar along the lines of:

re.compile(r"""-(.*?)""").search(str(s)).group(1) 

meaning that I want to capture whatever (.*?) matches, starting after a "-" and running to the end of the string, but this returns nothing.

Could anyone point me in the right direction? Thanks.

JohnJ asked Nov 22 '12 19:11



2 Answers

You can use re.match to capture only the trailing digits:

>>> import re
>>> s = r"""99-my-name-is-John-Smith-6376827-%^-1-2-767980716"""
>>> re.match('.*?([0-9]+)$', s).group(1)
'767980716'

Alternatively, re.finditer works just as well:

>>> next(re.finditer(r'\d+$', s)).group(0)
'767980716'

Explanation of all regexp components:

  • .*? is a non-greedy match and consumes as little as possible (a greedy .* would consume everything except for the last digit).
  • [0-9] and \d are two different ways of matching digits. Note that, for Unicode patterns, the latter also matches digits in other scripts, like ୪ or ൨.
  • Parentheses (()) make the content of the expression a group, which can be retrieved with group(1) (or 2 for the second group, 0 for the whole match).
  • + means one or more repetitions (at least one digit at the end).
  • $ matches only at the end of the input (in Python, also just before a trailing newline).
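
The points in the list above can be checked directly. The following is a minimal sketch, assuming Python 3 (where str patterns are Unicode by default); the variable names are only illustrative. It compares the lazy and greedy prefixes, shows what the pattern from the question actually captures, and shows how the re.ASCII flag narrows \d to 0-9.

```python
import re

s = "99-my-name-is-John-Smith-6376827-%^-1-2-767980716"

# Lazy prefix: .*? takes as little as it can, so the group gets the whole final run of digits.
lazy = re.match(r'.*?([0-9]+)$', s).group(1)

# Greedy prefix: .* takes as much as it can, leaving a single digit for the group.
greedy = re.match(r'.*([0-9]+)$', s).group(1)

# Pattern from the question: nothing follows the lazy group, so it
# matches the empty string right after the first hyphen.
attempt = re.search(r'-(.*?)', s).group(1)

# \d matches Unicode decimal digits in str patterns; re.ASCII restricts it to 0-9.
unicode_digit = re.fullmatch(r'\d', '୪') is not None
ascii_only = re.fullmatch(r'\d', '୪', flags=re.ASCII) is not None

print(lazy, greedy, repr(attempt), unicode_digit, ascii_only)
# prints: 767980716 6 '' True False
```

The empty string in the third result is why the original attempt appeared to return nothing: the search succeeds, but the captured group is empty.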
phihag answered Sep 28 '22 05:09


Nice and simple with findall:

import re

s = r"""99-my-name-is-John-Smith-6376827-%^-1-2-767980716"""
print(re.findall(r'^.*-([0-9]+)$', s))
# prints ['767980716']

Regex Explanation:

^         # Match the start of the string
.*        # Followed by anything
-         # Up to the last hyphen
([0-9]+)  # Capture the digits after the hyphen
$         # Up to the end of the string
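
The commented layout above can also be used as a real pattern. This is a sketch, assuming Python 3 and the standard re.VERBOSE flag, which ignores whitespace in the pattern and allows # comments; the name pattern is only illustrative.

```python
import re

# Same regex as above, written out with re.VERBOSE so the comments
# live inside the pattern itself.
pattern = re.compile(r"""
    ^          # match the start of the string
    .*         # followed by anything (greedy)
    -          # up to the last hyphen
    ([0-9]+)   # capture the digits after the hyphen
    $          # up to the end of the string
""", re.VERBOSE)

s = "99-my-name-is-John-Smith-6376827-%^-1-2-767980716"
result = pattern.findall(s)
print(result)  # prints ['767980716']
```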

Or, more simply, just match the digits at the end of the string: '([0-9]+)$'
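
For reuse, the shorter pattern could be wrapped in a small helper. This is only a sketch, assuming Python 3; trailing_digits is a hypothetical name, and it uses re.search (rather than findall) so that a string without trailing digits yields None instead of raising an error.

```python
import re

def trailing_digits(text):
    """Return the digits that end `text`, or None if it does not end in digits."""
    # Note: $ also matches just before a single trailing newline.
    match = re.search(r'([0-9]+)$', text)
    return match.group(1) if match else None

print(trailing_digits("99-my-name-is-John-Smith-6376827-%^-1-2-767980716"))  # prints 767980716
print(trailing_digits("no-digits-at-the-end-"))  # prints None
```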

Chris Seymour answered Sep 28 '22 04:09