
Extract numerical percentage from string containing multiple numbers

I want to extract a numerical percentage in a string. Here are some cases:

  • input: "Bank ABC 123% CDE" -> output: 123.00 (as a float)
  • input: "Some random bank IPCA + 12,34%" -> output: 12.34
  • input: "Bank1 2,3%" -> output: 2.3

Commas are used solely as decimal separators and there is only one percentage per string, so the following strings will never occur:

  • invalid input: "Bank ABC, 123%"
  • invalid input: "Bank ABC 123% and 12,34%"

Currently, I'm using the following script in Python:

import re

def extract_percentage(x: str) -> float:
    return float(re.sub(r'[^\d,]', '', x).replace(',', '.'))

It works for the first two examples above, but for the third, the output is 12.3 instead of 2.3.

How should I do it? Preferably, using Python.

Lucas Hattori asked Feb 11 '26 05:02



1 Answer

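Your regex removes everything except digits and commas, including the spaces between words, so the 1 in Bank1 ends up glued to 2,3 and you get 12,3. I think the better way to find something with regex is to search for it directly, using the re library, rather than deleting everything around it.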
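To see the problem, it helps to look at what the substitution leaves behind before float() is called. This is only a small check on the three inputs from the question; the helper name keep_digits_and_commas is made up for illustration:

```python
import re

def keep_digits_and_commas(s: str) -> str:
    # Same substitution as in the question: drop every character
    # that is not a digit or a comma (hypothetical helper name).
    return re.sub(r"[^\d,]", "", s)

samples = [
    "Bank ABC 123% CDE",
    "Some random bank IPCA + 12,34%",
    "Bank1 2,3%",
]

for s in samples:
    print(repr(s), "->", repr(keep_digits_and_commas(s)))
# The last line printed is: 'Bank1 2,3%' -> '12,3'
```

The first two strings happen to contain no digits outside the percentage, which is why they look correct.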

We will start by looking for everything up to a % sign: '.*%'. For Bank ABC 123% CDE this will return Bank ABC 123%, which still contains spaces and non-digits.

To improve on that, let's look for digits with at most one comma or dot, followed by %: \d*[,.]?\d*%. This will return 123% for that input.
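A quick side-by-side of the two patterns on the same input, as a sketch (the variable names are only for illustration):

```python
import re

text = "Bank ABC 123% CDE"

# Greedy "anything up to %": grabs the bank name as well.
greedy = re.search(r".*%", text)

# Digits, at most one comma or dot, more digits, then %.
narrow = re.search(r"\d*[,.]?\d*%", text)

print(greedy.group())  # Bank ABC 123%
print(narrow.group())  # 123%
```

Note that every part of the narrow pattern before the % is optional, so it would also match a bare % with no number in front of it. That does not happen with the inputs in the question, but it is worth keeping in mind.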

To wrap things up, let's replace the comma with a dot:

import re

text = 'Bank1 2,3%'  # avoid naming a variable str, it shadows the built-in
arr = [x.replace(',', '.') for x in re.findall(r'\d*[,.]?\d*%', text)]
print(arr)
>>> ['2.3%']

Note that the result is a list of all matches.

If you want to get the number out, you can now just do:

if len(arr)>0:
  number_without_percent_sign = arr[0][:-1]
  print(float(number_without_percent_sign))
>>> 2.3
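Putting the pieces together, here is one way the function from the question could look. Treat it as a sketch under a few assumptions of mine: it uses re.search with a capture group instead of findall (the question says there is only one percentage per string), it requires at least one digit before the % so a bare % is not matched, and it returns None when nothing is found, which the question did not specify:

```python
import re
from typing import Optional

# Digits, optionally followed by a comma or dot and more digits,
# immediately followed by %. The number itself is capture group 1.
PERCENT_RE = re.compile(r"(\d+(?:[.,]\d+)?)%")

def extract_percentage(text: str) -> Optional[float]:
    match = PERCENT_RE.search(text)
    if match is None:
        # Assumption: signal "no percentage found" with None.
        return None
    # Normalise the decimal comma to a dot before converting.
    return float(match.group(1).replace(",", "."))

print(extract_percentage("Bank ABC 123% CDE"))               # 123.0
print(extract_percentage("Some random bank IPCA + 12,34%"))  # 12.34
print(extract_percentage("Bank1 2,3%"))                      # 2.3
```

Because the digits must sit directly before the %, the 1 in Bank1 is never picked up.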
Shahar Bental answered Feb 13 '26 17:02



