
Remove double quotes enclosing numbers using regex

Tags:

python

regex

I'm working with the following string:

'"name": "Gnosis", \n        "symbol": "GNO", \n        "rank": "99", \n        "price_usd": "175.029", \n        "price_btc": "0.0186887", \n        "24h_volume_usd": "753877.0"'

I need to use re.sub() in Python to remove only the double quotes (") that enclose the numbers, so that I can parse the string as JSON afterwards. I've tried several regular expressions without success. Here is my best attempt:

exp = re.compile(r': (")\D+\.*\D*(")', re.MULTILINE)
response = re.sub(exp, "", string)
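
As a hedged illustration of why this attempt misses the numbers: \D matches any character that is not a digit (quotes, spaces and newlines included), so the pattern targets quoted text rather than quoted numbers, and substituting "" deletes the whole match instead of only the quotes. A minimal sketch using the string from the question (the names attempt, match and result are made up for this example):

```python
import re

string = '"name": "Gnosis", \n        "symbol": "GNO", \n        "rank": "99", \n        "price_usd": "175.029", \n        "price_btc": "0.0186887", \n        "24h_volume_usd": "753877.0"'

# The pattern from the attempt above, unchanged.
attempt = re.compile(r': (")\D+\.*\D*(")', re.MULTILINE)

# \D+ consumes non-digit characters, so the first match starts at a text
# value ("Gnosis") and never at a quoted number.
match = attempt.search(string)
print(repr(match.group(0)))

# Replacing with "" removes everything that matched, not just the quotes,
# while the quoted numbers such as "175.029" are left untouched.
result = attempt.sub("", string)
print(result)
```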

I've searched extensively for a similar problem but haven't found a matching question.

EDIT:

In the end I used this (thanks to S. Kablar):

formatted = re.sub(r'"(-*\d+(?:\.\d+)?)"', r"\1", string)
parsed = json.loads(formatted)

The underlying problem is that this endpoint returns badly formatted JSON: the numbers are encoded as strings.
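
For illustration, here is a minimal sketch of that substitution followed by json.loads(). It assumes the fragment shown in the question gets wrapped in braces to form a complete JSON object (the real endpoint response presumably already includes them). The names raw, unquoted and parsed are made up for this example.

```python
import json
import re

# The fragment from the question; the braces are added below as an assumption.
raw = '"name": "Gnosis", \n        "symbol": "GNO", \n        "rank": "99", \n        "price_usd": "175.029", \n        "price_btc": "0.0186887", \n        "24h_volume_usd": "753877.0"'

# Strip the quotes around values that look like numbers.
unquoted = re.sub(r'"(-*\d+(?:\.\d+)?)"', r"\1", raw)

# Once the quotes are gone, the JSON parser picks int or float by itself.
parsed = json.loads("{" + unquoted + "}")

print(type(parsed["rank"]).__name__, parsed["rank"])
print(type(parsed["price_usd"]).__name__, parsed["price_usd"])
print(type(parsed["symbol"]).__name__, parsed["symbol"])
```

One caveat with this approach: a quoted value with leading zeros, such as "007", would be unquoted into 007, which is not a valid JSON number, so json.loads() would raise an error.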

Other users answered "Parse the string first with json, and later convert numbers to float" using a for loop. I think that is a very inefficient way to do it, and it also forces you to choose between the int and float types for your response. To settle the question, I wrote a gist that compares the different approaches with benchmarks, and for now I'm going to trust regex in this case.

Thanks, everyone, for your help.
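
The gist itself is not reproduced here. As a rough sketch of how such a comparison could be set up with the standard timeit module (the function names via_regex and via_loop, the sample RAW string and the call count are assumptions made for this example, not the contents of the gist):

```python
import json
import re
import timeit

# Sample payload based on the question, wrapped in braces (assumption).
RAW = '{"name": "Gnosis", "symbol": "GNO", "rank": "99", "price_usd": "175.029", "price_btc": "0.0186887", "24h_volume_usd": "753877.0"}'
QUOTED_NUMBER = re.compile(r'"(-*\d+(?:\.\d+)?)"')


def via_regex(text):
    # Unquote the numbers first, then let the JSON parser choose the types.
    return json.loads(QUOTED_NUMBER.sub(r"\1", text))


def via_loop(text):
    # Parse first, then convert each value in a Python loop.
    result = {}
    for key, value in json.loads(text).items():
        try:
            value = int(value) if value.strip().isdigit() else float(value)
        except ValueError:
            pass  # not a number, keep the string
        result[key] = value
    return result


if __name__ == "__main__":
    for func in (via_regex, via_loop):
        seconds = timeit.timeit(lambda: func(RAW), number=10_000)
        print(f"{func.__name__}: {seconds:.3f} s for 10,000 calls")
```

Timings depend on the machine and the payload, so it is worth measuring with real responses before drawing conclusions.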

Álvaro Mondéjar asked Jan 03 '23 20:01


2 Answers

Regex: "(-?\d+(?:[\.,]\d+)?)" Substitution: \1

Details:

  • () Capturing group
  • (?:) Non capturing group
  • \d Matches a digit (equal to [0-9])
  • + Matches between one and unlimited times
  • ? Matches between zero and one times
  • \1 Group 1.
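
The pieces listed above can be seen together in a commented pattern. This is only a sketch using re.VERBOSE; the name QUOTED_NUMBER is made up for this example.

```python
import re

QUOTED_NUMBER = re.compile(r'''
    "               # opening double quote
    (               # start of capturing group 1
        -?          # optional minus sign
        \d+         # one or more digits
        (?:         # non-capturing group for the fractional part
            [.,]    # a dot or a comma
            \d+     # one or more digits
        )?          # the fractional part is optional
    )               # end of capturing group 1
    "               # closing double quote
''', re.VERBOSE)

# \1 puts back only what group 1 captured, so the quotes are dropped.
print(QUOTED_NUMBER.sub(r'\1', '"rank": "99"'))
print(QUOTED_NUMBER.sub(r'\1', '"symbol": "GNO"'))
print(QUOTED_NUMBER.sub(r'\1', '"price": "1,5"'))
```

Note that because the character class also accepts a comma, a value such as "1,5" is unquoted to 1,5, which is not a valid JSON number. If the result is going to json.loads(), matching only the dot may be the safer choice.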

Python code:

import re

def remove_quotes(text):
    # Keep only the captured number (group 1), dropping the surrounding quotes.
    return re.sub(r"\"(-?\d+(?:[\.,]\d+)?)\"", r'\1', text)

print(remove_quotes('"percent_change_7d": "-23.43"'))  # "percent_change_7d": -23.43

Srdjan M. answered Jan 05 '23 10:01


Parse the string first with json, and later convert numbers to floats:

import json

string = '{"name": "Gnosis", \n        "symbol": "GNO", \n        "rank": "99", \n        "price_usd": "175.029", \n        "price_btc": "0.0186887", \n        "24h_volume_usd": "753877.0"}'

data = json.loads(string)
response = {}
for key, value in data.items():
    try:
        # Strings made only of digits become int; anything else is tried as float.
        value = int(value) if value.strip().isdigit() else float(value)
    except ValueError:
        # Not a number (e.g. "Gnosis"): keep the original string.
        pass
    response[key] = value
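
A variation on the same idea, sketched here as an alternative rather than as part of the answer above: json.loads() accepts an object_hook callable that is invoked for every decoded JSON object, so the conversion can happen during parsing and also reaches nested objects. The function name numbers_from_strings and the shortened sample string are made up for this example.

```python
import json


def numbers_from_strings(obj):
    """object_hook: receives each decoded JSON object as a dict."""
    converted = {}
    for key, value in obj.items():
        if isinstance(value, str):
            try:
                value = int(value) if value.strip().isdigit() else float(value)
            except ValueError:
                pass  # not a number, keep the string
        converted[key] = value
    return converted


string = '{"name": "Gnosis", "rank": "99", "price_usd": "175.029"}'
response = json.loads(string, object_hook=numbers_from_strings)
print(response)
```

Keep in mind that float() also accepts strings such as "nan", "inf" or "1e5", so a text field holding one of those would be converted too.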

Daniel answered Jan 05 '23 10:01