
Regular expression sub

Tags:

python

regex

I have a question about re.sub in Python. I have some lines of code in which I want to replace every floating-point literal such as 2.0f or -1.0f with its double form, 2.0 or -1.0. I came up with the regular expression '[-+]?[0-9]*\.?[0-9]+f', which finds what I need, but I am not sure how to do the replacement.

Here's what I have:

# check if floating point value exists
if re.findall('[-+]?[0-9]*\.?[0-9]+f', line):
    line = re.sub('[-+]?[0-9]*\.?[0-9]+f', ????? ,line)

I am not sure what to put in place of ????? so that each match of '[-+]?[0-9]*\.?[0-9]+f' is replaced by the same text without the trailing f.

There may also be more than one floating-point value per line, which is why I used re.findall.

Any help would be great. Thanks

overloading asked Jan 30 '26 07:01



2 Answers

Put the part of the text you want to keep in a capturing group and refer to it with the \1 backreference in the replacement string:

line = re.sub(r'([-+]?[0-9]*\.?[0-9]+)f', r'\1', line)

Note that findall (or any kind of searching) is unnecessary since re.sub will look for the pattern itself and return the string unchanged if there are no matches.
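As a quick check of the point above, here is a minimal sketch that applies the same substitution to a line with several matches and to a line with none. The helper name strip_float_suffix and the sample strings are made up for illustration.

```python
import re

# Same pattern as above: the number is captured, the trailing "f" is left outside the group.
FLOAT_SUFFIX = re.compile(r'([-+]?[0-9]*\.?[0-9]+)f')

def strip_float_suffix(line):
    """Return line with the trailing f removed from every matched literal."""
    return FLOAT_SUFFIX.sub(r'\1', line)

sample_with_floats = "vec3 v(2.0f, -1.0f, .5f);"   # hypothetical input line
sample_without = "int count = 3;"                   # no match, so it comes back unchanged

print(strip_float_suffix(sample_with_floats))   # vec3 v(2.0, -1.0, .5);
print(strip_float_suffix(sample_without))       # int count = 3;
```

re.sub replaces every non-overlapping match in one call, so no loop or prior findall is needed.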

Now, for several regular expression writing tips:

  • Always use raw strings (r'...') for regular expressions and substitution strings, otherwise you will need to double your backslashes to escape them from Python's string parser. It is only by accident that you didn't need to do this for \.: it is not a recognized escape sequence in Python string literals, so the backslash is left in place. Recent Python versions emit a warning for such unrecognized escapes.

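  • Use \d instead of [0-9] to match a digit. For ASCII input they are equivalent, but \d is easier to recognize as "digit", while [0-9] needs to be visually verified. (In Python 3, \d in a str pattern also matches other Unicode decimal digits unless the re.ASCII flag is set.)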

  • Your regular expression will not recognize 10.f, which is likely a valid decimal number in your input. Matching floating-point numbers in various formats is trickier than it seems at first, but a quick search will turn up many reasonably complete solutions.

  • The re.X flag will allow you to add arbitrary whitespace and even comments to your regexp. With small regexps that can seem downright silly, but for large expressions the added clarity is a life-saver. (Your regular expression is close to the threshold.)
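The tips above can be checked with a short sketch. The sample strings are invented for illustration. The first part uses \b rather than \. because \b is an escape sequence that Python's string parser does interpret, the second shows a case where \d and [0-9] differ on non-ASCII input, and the third confirms that the question's pattern skips 10.f.

```python
import re

# Tip 1: without the r prefix, '\b' is a backspace character, not a word boundary.
non_raw = re.sub('\bcat\b', 'dog', 'a cat')    # pattern contains two backspaces
raw = re.sub(r'\bcat\b', 'dog', 'a cat')       # pattern contains word boundaries

# Tip 2: on str patterns, \d also accepts non-ASCII decimal digits.
arabic_digits = '\u0661\u0662\u0663'           # ARABIC-INDIC DIGITS ONE, TWO, THREE
unicode_match = re.fullmatch(r'\d+', arabic_digits) is not None
ascii_match = re.fullmatch(r'\d+', arabic_digits, flags=re.ASCII) is not None
class_match = re.fullmatch(r'[0-9]+', arabic_digits) is not None

# Tip 3: the question's pattern needs a digit right before the f, so "10.f" is skipped.
original = r'([-+]?[0-9]*\.?[0-9]+)f'
skipped = re.sub(original, r'\1', 'x = 10.f;')

print(non_raw, raw, unicode_match, ascii_match, class_match, skipped, sep=' | ')
# a cat | a dog | True | False | False | x = 10.f;
```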

Here is an example of an extended regular expression that implements the above style tips:

line = re.sub(r'''
    ( [-+]?
      (?: \d+ (?: \.\d* )?    # 12 or 12. or 12.34
          |
          \.\d+               # .12
      )
    ) f''',
    r'\1', line, flags=re.X)

(The (?:...) construct is a non-capturing group, used here only for grouping.)
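To see which forms the extended pattern accepts, here is a small usage sketch. The name FLOAT_LITERAL and the list of samples are illustrative assumptions. The pattern is compiled once so the re.X flag is attached to it.

```python
import re

# The extended pattern from above, compiled once with re.X (verbose mode).
FLOAT_LITERAL = re.compile(r'''
    ( [-+]?
      (?: \d+ (?: \.\d* )?    # 12 or 12. or 12.34
          |
          \.\d+               # .12
      )
    ) f''', flags=re.X)

samples = ["2.0f", "-1.0f", "10.f", ".5f", "12f"]   # hypothetical inputs
converted = [FLOAT_LITERAL.sub(r'\1', s) for s in samples]

for before, after in zip(samples, converted):
    print(before, '->', after)
```

Note that 10.f is now handled, and that the pattern also accepts a plain integer followed by f (12f). If that form should be left alone in your input, the pattern would need to require a decimal point.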

user4815162342 answered Feb 01 '26 19:02



This is my go-to reference for all things regex.

http://www.regular-expressions.info/named.html

The result should be something like:

line = re.sub(r'(?P<first>[-+]?[0-9]*\.?[0-9]+)f', r'\g<first>', line)
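
For reference, Python's re module spells a named group as (?P<name>...) and refers to it in the replacement string as \g<name>. A minimal sketch of the named-group variant follows. The group name number and the sample line are made up for illustration.

```python
import re

# Named group "number" captures the literal; the trailing f stays outside the group.
NAMED = re.compile(r'(?P<number>[-+]?[0-9]*\.?[0-9]+)f')

sample = "a = 2.0f + -1.0f;"                 # hypothetical input line
result = NAMED.sub(r'\g<number>', sample)

print(result)   # a = 2.0 + -1.0;
```

Named groups behave like numbered ones here. They mainly pay off when a pattern has several groups and \1, \2 become hard to track.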
jTC answered Feb 01 '26 19:02
