Python metacharacter negation.
After scouring the net and writing a few different syntaxes I'm out of ideas.
Trying to rename some files. They have a year in the title e.g. [2002]. Some don't have the brackets, which I want to rectify.
So I'm trying to find a regex (that I can compile preferably) that in my mind looks something like (^[\d4^]) because I want the set of 4 numbers that don't have square brackets around them. I'm using the brackets in the hope of binding this so that I can then rename using something like [\1].
If you want to check for things around a pattern you can use lookahead and lookbehind assertions. These don't form part of the match but say what you expect to find (or not find) around it.
As we don't want brackets, we'll need to use a negative lookbehind and a negative lookahead.
A negative lookahead looks like (?!...) and matches if ... does not come next. Similarly, a negative lookbehind looks like (?<!...) and matches if ... does not come immediately before.
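To see the two assertions in isolation, here is a minimal sketch. The strings "foo", "bar" and "baz" and the variable names are placeholders chosen for illustration, not anything from your filenames.

```python
import re

# Negative lookahead: "foo" only when it is NOT followed by "bar".
not_before_bar = re.compile(r"foo(?!bar)")
# Negative lookbehind: "bar" only when it is NOT preceded by "foo".
not_after_foo = re.compile(r"(?<!foo)bar")

print(not_before_bar.search("foobar"))          # None
print(not_before_bar.search("foobaz").group())  # foo
print(not_after_foo.search("foobar"))           # None
print(not_after_foo.search("bazbar").group())   # bar
```

In both cases the text inside the assertion is only checked, never consumed, so it is not part of what group() returns.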
Our example is made slightly more complicated because we're using [ and ], which themselves have meaning in regular expressions, so we have to escape them with \.
So we can build up a pattern as follows:
- no [ before: (?<!\[)
- four digits: \d{4}
- no ] after: (?!\])
This gives us the following Python code:
>>> import re
>>> r = re.compile(r"(?<!\[)\d{4}(?!\])")
>>> r.match(" 2011 ")
>>> r.search(" 2011 ")
<_sre.SRE_Match object at 0x10884de00>
>>> r.search("[2011]")
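The first call in that session prints nothing because match only tries the very start of the string, which is a space in " 2011 ", whereas search scans the whole string. The sketch below repeats that check and also tries a few half-bracketed names; the sample strings and the names year, samples and results are my own, added only to show how the pattern behaves.

```python
import re

year = re.compile(r"(?<!\[)\d{4}(?!\])")

# match() only tries position 0, which is a space here, so it finds nothing.
print(year.match(" 2011 "))            # None
# search() scans the whole string.
print(year.search(" 2011 ").group())   # 2011

# Which sample names contain a year the pattern accepts?
samples = ["[2011]", "[2011", "2011]", "2011 - This Year"]
results = {s: bool(year.search(s)) for s in samples}
print(results)
```

Only the last sample matches. A year with just one bracket, such as "[2011" or "2011]", is skipped as well, because each assertion rejects its own bracket independently.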
To rename, you can use the re.sub function or the sub method on your compiled pattern. To make it work you'll need to add an extra set of parentheses around the year to mark it as a group.
Also, when specifying your replacement you refer to the group as \1, so you have to escape the \ or use a raw string.
>>> r = re.compile(r"(?<!\[)(\d{4})(?!\])")
>>> name = "2011 - This Year"
>>> r.sub(r"[\1]", name)
'[2011] - This Year'
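If you want to apply this to actual files, something along these lines might work. It is only a sketch: the helper names bracket_year and rename_in, the dry_run flag, and the choice to bracket only the first four-digit run (count=1) are my assumptions, not part of your setup.

```python
import re
from pathlib import Path

YEAR = re.compile(r"(?<!\[)(\d{4})(?!\])")

def bracket_year(name):
    """Wrap the first unbracketed run of four digits in square brackets."""
    # count=1 is an assumption: only the first candidate is treated as the year.
    return YEAR.sub(r"[\1]", name, count=1)

def rename_in(directory, dry_run=True):
    """Return (old, new) name pairs; rename the files only if dry_run is False."""
    changes = []
    # Snapshot the listing first so renaming does not disturb the iteration.
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        new_name = bracket_year(path.name)
        if new_name != path.name:
            changes.append((path.name, new_name))
            if not dry_run:
                path.rename(path.with_name(new_name))
    return changes
```

Calling rename_in on a folder with the default dry_run=True only reports what would change, which is worth checking before renaming anything. Note that the pattern cannot tell a year from any other four digits, so a name like "[2011] - 1080p" would have 1080 bracketed.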