vim regex replace multiple consecutive spaces with only one space

I often work with text files that use a variable number of spaces as word separators. (Word processors like Word do this to distribute whitespace evenly, compensating for the different letter widths in certain fonts, and they keep this annoying variable spacing even when saving as plain text.)

I would like to automate replacing these variable-length runs of spaces with single spaces. I suspect a regex could do it, but there is also whitespace at the beginning of paragraphs (usually four spaces, but not always), which I want to leave unchanged. So the regex must not touch leading whitespace, which adds to the complexity.

I'm using vim, so a regex in the vim regex dialect would be very useful to me, if this is doable.

My current progress looks like this:

:%s/ \+/ /g

but it doesn't work correctly.

I'm also considering writing a Vim script that would go through the text line by line, process each line character by character, and skip any spaces after the first one, but I have a feeling this would be overkill.

asked Oct 05 '10 02:10

jedi_coder

2 Answers

This will replace each run of two or more spaces with a single space:

s/ \{2,}/ /g

Or you could add an extra space before the \+ in your version:

s/  \+/ /g
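
As written, both commands act only on the current line; prefixing a range such as % applies them to the whole buffer. A minimal sketch of that form follows. The sample line in the comments is made up for illustration, and it shows that this pattern collapses leading indentation as well:

```vim
" Collapse every run of two or more spaces, on every line of the buffer.
:%s/ \{2,}/ /g

" Hypothetical sample line (4 leading spaces, 3 spaces between the words):
"   before: '    foo   bar'
"   after:  ' foo bar'
" The leading run of spaces is collapsed too.
```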

answered Sep 18 '22 12:09

mikerobi

This will do the trick:

%s![^ ]\zs  \+! !g

Many substitutions are easier to write in Vim than in other regex dialects thanks to the \zs and \ze meta-sequences. They exclude part of the match from the final result: either the part before the sequence (\zs, “s” for “start here”) or the part after it (\ze, “e” for “end here”). In this case, the pattern must first match one non-space character ([^ ]), but the \zs that follows says that the final match (which is what gets replaced) starts after that character.

Since there is no way to have a non-space character in front of line-leading whitespace, the pattern will not match it, so the substitution will not replace it. Simple.
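
For comparison, here is a hedged sketch of two equivalent spellings of the same idea. The first only swaps the delimiter and uses \{2,} for "two or more spaces"; the second replaces \zs with Vim's zero-width look-behind \@<=. The sample line in the comments is made up for illustration:

```vim
" Same pattern with '/' as the delimiter; \{2,} means 'two or more spaces'.
:%s/[^ ]\zs \{2,}/ /g

" Same idea with a zero-width look-behind: the run of spaces must be
" preceded by a non-space character, which is not part of the match.
:%s/[^ ]\@<= \{2,}/ /g

" Hypothetical sample line (4 leading spaces):
"   before: '    foo   bar  baz'
"   after:  '    foo bar baz'
" The leading spaces have no non-space character before them, so they stay.
```

Only one of the two commands is needed; running the second after the first finds nothing left to replace.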

answered Sep 19 '22 12:09

Aristotle Pagaltzis
