I want to replace mm units with cm units in my code. Since there are a lot of such replacements, I use a regexp.
I came up with this expression:
(?!a-zA-Z)mm(?!a-zA-Z)
But it still matches words like summa, gamma and dummy.
How do I write the regexp correctly?
Without square brackets, (?!a-zA-Z) is a lookahead for the literal text a-zA-Z, not for a letter. Use character classes and change the first (?!...) lookahead into a lookbehind:
(?<![a-zA-Z])mm(?![a-zA-Z])
^^^^^^^^^^^^^  ^^^^^^^^^^^^
The pattern matches:
(?<![a-zA-Z]) - a negative lookbehind that fails the match if there is an ASCII letter immediately to the left of the current location
mm - a literal substring
(?![a-zA-Z]) - a negative lookahead that fails the match if there is an ASCII letter immediately to the right of the current location
NOTE: If you need to make your pattern Unicode-aware, replace [a-zA-Z] with [^\W\d_] (and use the re.U flag if you are using Python 2.x).
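As a rough illustration of the lookaround pattern in use, here is a minimal Python sketch. The sample string and the variable names (text, pattern, result) are made up for the example and do not come from the question. Note that the underscore is not a letter, so an identifier such as mm_per_inch would be rewritten too.

```python
import re

# Hypothetical sample text, invented for illustration only.
text = "width = 10mm; gap: 5 mm; summa, gamma, dummy; mm_per_inch"

# Match "mm" only when no ASCII letter sits directly before or after it.
pattern = re.compile(r"(?<![a-zA-Z])mm(?![a-zA-Z])")

result = pattern.sub("cm", text)
print(result)
# "summa", "gamma" and "dummy" are left alone, but "mm_per_inch" is changed,
# because "_" is not an ASCII letter.
```

If identifiers like that should stay untouched, adding the underscore to both character classes is one possible adjustment.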
There's no need to use lookaheads and lookbehinds. If you wish to simplify your pattern, you can try something like this:
\d+\s?(mm)\b
This assumes that the millimetre symbol always follows a number, with an optional space in between, which I think is a reasonable assumption in this case.
The \b checks for a word boundary to make sure the mm is not part of a word such as dummy.
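For an actual replacement, the number in front of the unit has to survive the substitution. One way to sketch that in Python is a slight variation of the pattern above that captures the digits and optional space instead of mm; this regrouping, the sample string, and the variable names are assumptions made for the example, not part of the original answer.

```python
import re

# Hypothetical sample text, invented for illustration only.
text = "length 25mm, offset 3 mm, dummy 12mmx"

# Group 1 holds the number and the optional whitespace so it can be
# written back unchanged; only the unit label after it is swapped.
pattern = re.compile(r"(\d+\s?)mm\b")

converted = pattern.sub(r"\1cm", text)
print(converted)
# "12mmx" is skipped because there is no word boundary after "mm".
```

This only relabels the unit; the numeric values themselves are left as they are.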