After some searching, this seems more difficult than I thought: I am trying to write a regular expression in Python to find a word which is not surrounded by other letters or dashes.
In the following examples, I am trying to match ios:
The matches should be as follows:
ios is after d.
ios is not surrounded by letters.
ios is not surrounded by letters.
ios is not surrounded by letters.
ios is followed by -.
How to do it with regex?
A metacharacter is a character that has a special meaning during pattern processing. You use metacharacters in regular expressions to define the search criteria and any text manipulations.
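You can use \b to match the empty string at the start or end of a word. However, \b treats - as a non-word character, so on its own it would still match the ios in carped-ios.
To also disallow -, we have to use a character class containing everything that must not sit next to the word, then invert it. Note that \b itself cannot go inside a character class: it is a zero-width assertion, not a character (in Python, \b inside [] means backspace, and in grep's bracket expressions the backslash is literal). So we list the word characters explicitly instead. That would look something like this:
[^[:alnum:]_-]
Let's pick that apart. [] is the character class itself. ^ at the start says to invert the match, so only characters not in the character class match. [:alnum:] is the POSIX name for letters and digits, and together with _ it covers the same word characters that \b looks at. Note that - has to come last (or first) in a character class, otherwise it will be mistaken for a range. (This allows you to say [0-9a-fA-F] as a shorthand for all hexadecimal digits.)
A negated character class still has to consume one character, so on its own it would miss ios at the very start or end of a line. Pairing it with the anchors ^ and $ covers those cases.
Let's try it! Here's your test file:
$ cat t.txt
It seems carpedios
I like "ios" because they have blue products
I like carpedios and ios
I like carpedios and ios.
i like carped-ios
Let's put together our pattern using the character class above:
$ grep -E '(^|[^[:alnum:]_-])ios([^[:alnum:]_-]|$)' t.txt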
I like "ios" because they have blue products
I like carpedios and ios
I like carpedios and ios.
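Since the question asks about Python, here is a minimal sketch of the same idea with the re module. It assumes "word characters" means whatever \w matches (letters, digits, underscore), and it uses negative lookarounds rather than a consuming character class, so ios at the very start or end of a string is still found. The names PATTERN and has_standalone_ios are made up for illustration.

```python
import re

# Hypothetical names, for illustration only.
# (?<![\w-]) : no word character or dash immediately before
# (?![\w-])  : no word character or dash immediately after
PATTERN = re.compile(r"(?<![\w-])ios(?![\w-])")


def has_standalone_ios(line: str) -> bool:
    """Return True if line contains ios not touching a word character or dash."""
    return PATTERN.search(line) is not None


lines = [
    "It seems carpedios",
    'I like "ios" because they have blue products',
    "I like carpedios and ios",
    "I like carpedios and ios.",
    "i like carped-ios",
]

# Keep only the lines where ios stands on its own.
matching = [line for line in lines if has_standalone_ios(line)]
for line in matching:
    print(line)
```

On the test lines this prints the same three lines as the grep output above. Lookarounds check the neighbouring characters without consuming them, which is why no special handling for the start or end of the string is needed.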
Hope this helps!
Update: I notice there's a good alternative answer, but I hope this adds some extra explanation.