I am a regex supernoob (just reading my first articles about them), and at the same time working towards stronger use of vim. I would like to use a regex to search for all instances of a colon :
that are not followed by a space and insert one space between those colons and any character after them.
If I start with:
foo:bar
I would like to end with
foo: bar
I got as far as %s/:[a-z]
but now I don't know what to do for the next part of the %s
statement.
Also, how do I change the :[a-z]
statement to make sure it catches anything that is not a space?
:%s/:\(\S\)/: \1/g
\S
matches any character that is not whitespace, but you need to remember what that non-whitespace character is. This is what the \(\)
does. You can then refer to it using \1
in the replacement.
So you match a :
, some non-whitespace character and then replace it with a :
, a space, and the captured character.
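If you want to try this without touching a buffer, here is a minimal sketch using Vim's built-in substitute() function, which accepts the same kind of pattern and replacement as :s. The sample strings are made up for illustration.

```vim
" Same pattern and replacement as :%s/:\(\S\)/: \1/g, applied to plain strings.
" Single-quoted strings keep the backslashes literal.
echo substitute('foo:bar', ':\(\S\)', ': \1', 'g')
" prints: foo: bar

" A colon that already has a space after it is left alone.
echo substitute('key: value', ':\(\S\)', ': \1', 'g')
" prints: key: value

" With the g flag, every colon on the line is handled.
echo substitute('a:b:c', ':\(\S\)', ': \1', 'g')
" prints: a: b: c
```

You can type each echo line at the : prompt, or save them to a file and :source it. The comments are on their own lines because :echo does not accept a trailing comment.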
Changing this to only modify the text when there's only one :
is fairly straightforward. As others have suggested, zero-width assertions are useful here.
:%s/:\@<!:[^:[:space:]]\@=/: /g
:\@<!
is a negative lookbehind: it succeeds wherever the preceding character is not a :
, including at the start of the line. This is an important characteristic of the negative lookahead/lookbehind assertions. They don't require that there actually be a character, just that there isn't a :
.
:
matches the required colon.
[^:[:space:]]
introduces a couple more regex concepts.
The outer []
is a collection. A collection is used to match any of the characters listed inside. However, a leading ^
negates that match. So, [abc123]
will match a
, b
, c
, 1
, 2
, or 3
, but [^abc123]
matches anything but those characters.
[:space:]
is a character class. Character classes can only be used inside a collection. [:space:]
means, unsurprisingly, any whitespace. In most implementations, it relates directly to the result of the C library's isspace
function.
Tying that all together, the collection means "match any character that is not a :
or whitespace".
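A quick way to convince yourself of how collections behave is to test single characters with the match operator. This is only a sketch, assuming legacy Vim script (where a comparison echoes 1 or 0); the test characters are arbitrary.

```vim
" =~# is the case-sensitive "matches" operator.
echo 'b' =~# '[abc123]'
" prints: 1
echo 'z' =~# '[abc123]'
" prints: 0
echo 'z' =~# '[^abc123]'
" prints: 1

" The negated collection rejects both a colon and whitespace.
echo ':' =~# '[^:[:space:]]'
" prints: 0
echo ' ' =~# '[^:[:space:]]'
" prints: 0
echo 'x' =~# '[^:[:space:]]'
" prints: 1
```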
\@=
is the positive lookahead assertion. It applies to the previous atom (in this case the collection) and means that the collection is required for the pattern to be a successful match, but will not be part of the text that is replaced.
So, whenever the pattern matches, we just replace the :
with itself and a space.
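To see how the two assertions work together, here is a small sketch that runs the pattern through substitute() on a few made-up strings. It uses Vim's \@<! (negative lookbehind) and \@= (positive lookahead); the variable name pat is just for illustration.

```vim
" A lone colon followed by a non-space, non-colon character.
let pat = ':\@<!:[^:[:space:]]\@='

echo substitute('foo:bar', pat, ': ', 'g')
" prints: foo: bar

" The lookbehind also succeeds at the start of the string.
echo substitute(':bar', pat, ': ', 'g')
" prints: : bar

" Doubled colons are left alone: the first fails the lookahead,
" the second fails the lookbehind.
echo substitute('std::vector', pat, ': ', 'g')
" prints: std::vector

" A colon already followed by a space is not touched.
echo substitute('key: value', pat, ': ', 'g')
" prints: key: value
```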
You want to use a zero-width negative lookahead assertion, which is a fancy way of saying: match a colon only when the next character is not a space, without including that character in the match:
:%s/: \@!/: /g
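The \@!
is the negative lookahead. It applies to the preceding atom, here the space, so the colon matches only when no space follows it. That includes a colon at the very end of a line.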
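As a rough check of this version, the same pattern can be tried with substitute() on a few invented strings. The brackets in the last example are only there to make the trailing space visible.

```vim
echo substitute('foo:bar', ': \@!', ': ', 'g')
" prints: foo: bar

" A colon already followed by a space does not match.
echo substitute('key: value', ': \@!', ': ', 'g')
" prints: key: value

" A colon at the very end has no space after it either, so it matches
" and picks up a trailing space.
echo '[' . substitute('end:', ': \@!', ': ', 'g') . ']'
" prints: [end: ]
```

If trailing spaces after a final colon are unwanted, one of the patterns that requires an actual non-whitespace character after the colon may be a better fit.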
An interesting feature of Vim regex is the presence of \zs
and \ze
. Other engines might have them too, but they're not very common.
The purpose of \zs
is to mark the start of the match, and \ze
the end of it. For example:
ab\zsc
matches c
, but only if it is preceded by ab
. Similarly:
a\zebc
matches a
only if you have bc
after it. You can mix both:
a\zsb\zec
matches b
only if it sits between a
and c
. You can also create zero-width matches, which are ideal for what you're trying to do:
:%s/:\zs\ze\S/ /g
The match has no width, only a position, and you then substitute a single space at that position. By the way, \S
means any character that is not whitespace.
:\zs\ze\S
matches the position between a colon and something not a space.
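Here is a short sketch of \zs and \ze in action, using the built-in matchstr() and substitute() functions on made-up strings. matchstr() returns the text the pattern reports as matched, which makes the effect of the two markers easy to see.

```vim
" \zs sets where the reported match starts.
echo matchstr('abc', 'ab\zsc')
" prints: c

" \ze sets where the reported match ends.
echo matchstr('abc', 'a\zebc')
" prints: a

" Both together: only the middle is reported.
echo matchstr('abc', 'a\zsb\zec')
" prints: b

" The zero-width match between a colon and a non-space character
" is replaced by a single space.
echo substitute('foo:bar', ':\zs\ze\S', ' ', 'g')
" prints: foo: bar
echo substitute('a:b:c', ':\zs\ze\S', ' ', 'g')
" prints: a: b: c
```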