After some searching, this seems more difficult than I thought: I am trying to write a regular expression in Python to find a word that is not surrounded by other letters or dashes.
In the following examples, I am trying to match ios. The matches should be as follows:
ios is after d.
ios is not surrounded by letters.
ios is not surrounded by letters.
ios is not surrounded by letters.
ios is followed by -.
How to do it with regex?
The following one should suit your needs:
(?<!-)\bios\b(?!-)
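A quick way to check this pattern is to run it in Python over the five sample lines from the test file quoted further down in this thread. This is only a sketch: the names PATTERN, SAMPLES and matching_lines are made up for illustration and are not part of the original answer.

```python
import re

# Lookbehind/lookahead reject a dash on either side;
# \b rejects an adjacent word character (letter, digit or underscore).
PATTERN = re.compile(r"(?<!-)\bios\b(?!-)")

# Sample lines copied from the test file in this thread.
SAMPLES = [
    "It seems carpedios",
    'I like "ios" because they have blue products',
    "I like carpedios and ios",
    "I like carpedios and ios.",
    "i like carped-ios",
]


def matching_lines(lines):
    """Return only the lines in which the pattern finds a standalone ios."""
    return [line for line in lines if PATTERN.search(line)]


for line in matching_lines(SAMPLES):
    print(line)
```

Because \b works on word characters, this pattern also refuses ios when it touches a digit or an underscore (for example ios7 or my_ios), which is slightly stricter than "letters or dashes".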
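You can use \b to match the empty string at the start or end of a word. However, to also disallow -, the idea is to use a character class containing both, then invert it. That would look something like this:
[^\b-]
Let's pick that apart. [] is the character class itself. ^ at the start says to invert the match, so only characters not in the character class match. Note that - has to come last (or first) in a character class, otherwise it will be mistaken for a range. (This allows you to say [0-9a-fA-F] as a shorthand for all hexadecimal digits.)
One caveat: inside a character class, \b does not mean "word boundary". In Python's re module it stands for a backspace character there, and in grep's POSIX bracket expressions the backslash and the b are taken literally. The class above therefore excludes those literal characters and -, not word boundaries, and because it consumes a character, something must be present on both sides of ios.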
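The character-class rules described here (ranges, a leading ^ for inversion, a literal - at the end) can be tried out in Python's re module. The following is a minimal sketch with illustrative names; it also checks how Python reads \b when it appears inside a class, where it is a backspace character rather than a word boundary.

```python
import re

# A dash between two characters forms a range: any hexadecimal digit.
HEX_DIGIT = re.compile(r"[0-9a-fA-F]")

# A leading ^ inverts the class; the trailing - is a literal dash.
NOT_LETTER_OR_DASH = re.compile(r"[^A-Za-z-]")

# Inside a class, Python reads \b as the backspace character (\x08),
# not as a word boundary.
BACKSPACE_CLASS = re.compile(r"[\b]")

print(HEX_DIGIT.findall("x7Gf-"))               # hex digits only
print(NOT_LETTER_OR_DASH.findall('a-b "c".'))   # everything except letters and dashes
print(BACKSPACE_CLASS.fullmatch("\x08") is not None)
print(BACKSPACE_CLASS.search("a word boundary") is None)
```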
Let's try it! Here's your test file:
$ cat t.txt
It seems carpedios
I like "ios" because they have blue products
I like carpedios and ios
I like carpedios and ios.
i like carped-ios
Let's put together our pattern using the character classes above:
$ grep '[^\b-]ios[^\b-]' t.txt
I like "ios" because they have blue products
I like carpedios and ios
I like carpedios and ios.
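To see which part of each line is actually matched, the same test can be repeated in Python. This is a sketch under a few assumptions: Python's re treats \b inside a class as a backspace (grep treats the backslash and b literally, so the two tools can differ on other inputs), and the LOOKAROUND pattern is an alternative added here for comparison, not something from the answer itself. The names CONSUMING, LOOKAROUND and matched_text are illustrative.

```python
import re

# Sample lines copied from the test file above.
SAMPLES = [
    "It seems carpedios",
    'I like "ios" because they have blue products',
    "I like carpedios and ios",
    "I like carpedios and ios.",
    "i like carped-ios",
]

# Pattern from the grep command: each class consumes one character, so ios
# needs a neighbour on both sides (in Python: anything but backspace or dash).
CONSUMING = re.compile(r"[^\b-]ios[^\b-]")

# Assumed alternative: zero-width lookarounds with a negated letter/dash class,
# which also work at the very start or end of a line.
LOOKAROUND = re.compile(r"(?<![A-Za-z-])ios(?![A-Za-z-])")


def matched_text(pattern, line):
    """Return the matched substring, or None when the line does not match."""
    m = pattern.search(line)
    return m.group(0) if m else None


for line in SAMPLES:
    print(repr(matched_text(CONSUMING, line)),
          repr(matched_text(LOOKAROUND, line)),
          line)
```

On these five lines both patterns select the same three lines, but for the third and fourth lines the consuming pattern reports dios followed by a space, so the hit comes from carpedios rather than from the standalone ios. The lookaround version reports just ios in every matching line.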
Hope this helps!
Update: I notice there's a good alternative answer, but I hope this adds some extra explanation.