Regex to insert space in vim

Tags: regex, vim

I am a regex supernoob (just reading my first articles about them), and at the same time working towards stronger use of vim. I would like to use a regex to search for all instances of a colon : that are not followed by a space and insert one space between those colons and any character after them.

If I start with:

foo:bar

I would like to end with

foo: bar

I got as far as %s/:[a-z] but now I don't know what to do for the next part of the %s statement.

Also, how do I change the :[a-z] statement to make sure it catches anything that is not a space?

Lee Quarella asked Dec 06 '11 19:12


3 Answers

:%s/:\(\S\)/: \1/g

\S matches any character that is not whitespace, but you need to remember what that non-whitespace character is. This is what the \(\) does. You can then refer to it using \1 in the replacement.

So you match a :, some non-whitespace character and then replace it with a :, a space, and the captured character.
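To sanity-check the capture-group idea outside Vim, here is a minimal sketch using Python's re module. It is a translation rather than Vim syntax: Python writes the group as (\S) with no backslashes on the parentheses, and the function name add_space_after_colon is made up for this illustration.

```python
import re

def add_space_after_colon(text):
    # Match ':' followed by one captured non-whitespace character,
    # then put ': ' in front of the captured character (\1).
    return re.sub(r":(\S)", r": \1", text)

print(add_space_after_colon("foo:bar"))   # foo: bar
print(add_space_after_colon("foo: bar"))  # foo: bar (already spaced, left alone)
```

Because the character after the colon is part of the match, it has to be captured and written back, which is exactly the job \1 does in the Vim command.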


Changing this to modify the text only where there is a single : (not a run such as ::) is fairly straightforward. As others have suggested, zero-width assertions are useful here.

:%s/:\@<!:[^:[:space:]]\@=/: /g

  • :\@<! is a negative lookbehind. It succeeds wherever the preceding character is not a :, including at the start of the line. This is an important characteristic of the negative lookahead/lookbehind assertions: they don't require that there actually be a character, just that there isn't a :.

  • : matches the required colon.

  • [^:[:space:]] introduces a couple more regex concepts.

    • The outer [] is a collection. A collection is used to match any of the characters listed inside. However, a leading ^ negates that match. So, [abc123] will match a, b, c, 1, 2, or 3, but [^abc123] matches anything but those characters.

    • [:space:] is a character class. Character classes can only be used inside a collection. [:space:] means, unsurprisingly, any whitespace. In most implementations, it relates directly to the result of the C library's isspace function.

    Tying that all together, the collection means "match any character that is not a : or whitespace".

  • \@= is the positive lookahead assertion. It applies to the previous atom (in this case the collection) and means that the collection is required for the pattern to be a successful match, but will not be part of the text that is replaced.

So, whenever the pattern matches, we just replace the : with itself and a space.
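The same combination of a negative lookbehind and a positive lookahead can be tried out in Python's re module. This is only a rough counterpart, not Vim syntax: \s stands in for [:space:] (Python's \s also covers Unicode whitespace), and the names SINGLE_COLON and space_single_colons are invented for the example.

```python
import re

# (?<!:)      negative lookbehind: the previous character is not ':'
# :           the colon itself, the only thing actually consumed
# (?=[^:\s])  positive lookahead: next character is neither ':' nor whitespace
SINGLE_COLON = re.compile(r"(?<!:):(?=[^:\s])")

def space_single_colons(text):
    # Only the colon is part of the match, so the replacement is just ': '.
    return SINGLE_COLON.sub(": ", text)

print(space_single_colons("foo:bar"))    # foo: bar
print(space_single_colons("std::cout"))  # std::cout (double colon left alone)
print(space_single_colons(":x"))         # : x (start of line still matches)
```

The last call shows the point made about the lookbehind: it succeeds at the start of the text because there is simply no colon before it.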

jamessan answered Oct 21 '22 01:10


You want to use a zero-width negative lookahead assertion, which is a fancy way of saying look for a character that's not a space, but don't include it in the match:

:%s/: \@!/: /g

The \@! is the negative lookahead.
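For comparison, a small Python sketch of the same negative-lookahead idea (Python spells it (?! ) instead of \@!, and the function name is hypothetical). It also shows two edge cases worth knowing about: a negative lookahead succeeds when nothing follows at all, and it treats a second colon like any other non-space character.

```python
import re

def space_unless_followed_by_space(text):
    # (?! ) is a negative lookahead: the colon must not be followed by a space.
    # Nothing after the colon is consumed, so the replacement is just ': '.
    return re.sub(r":(?! )", ": ", text)

print(space_unless_followed_by_space("foo:bar"))     # foo: bar
print(space_unless_followed_by_space("a::b"))        # a: : b
print(repr(space_unless_followed_by_space("foo:")))  # 'foo: ' (trailing space added)
```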

Karl Bielefeldt answered Oct 21 '22 01:10


An interesting feature of Vim regex is the presence of \zs and \ze. Other engines might have them too, but they're not very common.

The purpose of \zs is to mark the start of the match, and \ze the end of it. For example:

ab\zsc

matches c, but only if it is preceded by ab. Similarly:

a\zebc

matches a only if you have bc after it. You can mix both:

a\zsb\zec

matches b only if in between a and c. You can also create zero-width matches, which are ideal for what you're trying to do:

:%s/:\zs\ze\S/ /g

Your search has no size, only a position. And then you substitute that position with " ". By the way, \S means any character except whitespace.

:\zs\ze\S matches the position between a colon and something not a space.
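Python's re module has no \zs or \ze, but a zero-width match at the same position can be approximated with a lookbehind and a lookahead. The sketch below is that approximation, with made-up names, and is meant only to illustrate the "position, not text" idea.

```python
import re

# (?<=:) asserts a colon just before the position;
# (?=\S) asserts a non-whitespace character just after it.
# Together they match an empty string sitting between the two.
GAP_AFTER_COLON = re.compile(r"(?<=:)(?=\S)")

def insert_space_at_gap(text):
    # The match is empty, so a space is inserted and nothing is removed.
    return GAP_AFTER_COLON.sub(" ", text)

print(insert_space_at_gap("foo:bar"))   # foo: bar
print(insert_space_at_gap("a:b c: d"))  # a: b c: d
```

Note that re.sub replaces every occurrence by default, which corresponds to using the g flag on the Vim substitute command.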

sidyll answered Oct 21 '22 00:10