Using vim to replace not matching strings that occur a variable number of times

Tags: regex, replace, vim

I'm looking to use vim to extract only the square brackets and the number inside from a file containing the following example text:

13_[4]_3_[4]_[1]_5_[1]_29_[3]_4_[2]_9_[1]_6_[2]_4
14_[4]_28_[3]_4_[2]_12_[1]_8_[2]_2
[1]_[4]_15_[1]_16_[3]_4_[2]_11_[1]_16_[2]_2
9_[4]_3_[4]_3_[4]_9_[4]_4_[4]_7_[1]_12_[3]_4_[2]_9_[1]_[2]_2
14_[4]_30_[3]_4_[2]_5_[1]_19_[1]_3_[1]_8_[2]_10_[1]_4_[1]_3_[1]_2

So for the first example line I would like an output line that looks like: [4][4][1][1][3][2][1][2].

I can easily delete the bracketed digits with:

:%s/\[\d\]//g

but I am having real trouble trying to delete all text that doesn't match \[\d\]. Most vim commands that work with negation (e.g. :v) appear to operate only on whole lines rather than on individual strings, and using %s with group matching:

:%s/\v(.*)([\d])(.*)/\2

also matches and deletes the square brackets.

Would someone have a suggestion to solve my problem?

Steffen asked Jul 21 '15 16:07

1 Answer

You were close. You need to escape the square brackets and use something far less greedy than .*.

:%s/\v[^[]*(\[\d\])[^[]*/\1/g

Overview

Match leading text + [ + digit + ] + trailing text, capturing the [ + digit + ]. Replace the whole match with the capture group, leaving only the brackets and digits.

Glory of details

  • Using \v for very magic. See :h magic
  • [...] is a bracketed character class, which matches any one of the characters inside. e.g. fooba[rs] matches foobar and foobas, but not foobaz. See :h /\[. (Note: Vim calls this a collection.)
  • [^...] is a negated bracketed character class, so it matches any single character not listed inside the brackets. e.g. fooba[^rz] matches foobas, but not foobaz or foobar.
  • [^[] - match any non-[ character. (This looks funny)
  • [^[]* - match any non-[ character zero or more times. This will match the leading text we want to remove.
  • (...) - capture group
  • \[ & \] represent literal [ / ]. We must escape them so that they do not start a character class.
  • \d match 1 digit.
  • [^[]* - match trailing text to be removed
  • \1 the replacement will be our capture group aka bracketed digits.
  • Use the g flag to substitute every match on a line, not just the first one.
  • Use a range of % to run the substitution, :s, over the entire file (the same as 1,$).
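
As a quick check, the pattern can be tried on the first sample line without touching a buffer. This is only a minimal sketch: it uses Vim's substitute() function rather than :s, and the variable names sample and result are illustrative.

```vim
" First sample line from the question.
let sample = '13_[4]_3_[4]_[1]_5_[1]_29_[3]_4_[2]_9_[1]_6_[2]_4'

" Same pattern and replacement as the :s command; 'g' replaces every match.
let result = substitute(sample, '\v[^[]*(\[\d\])[^[]*', '\1', 'g')

echo result
" [4][4][1][1][3][2][1][2]
```

Save the lines to a file and :source it, or enter them one at a time at the : prompt. The echoed string should equal the output line asked for in the question.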

So why does :%s/\v(.*)([\d])(.*)/\2 fail?

tl;dr: Your pattern doesn't match. Try /[\d].

Long version:

  • The first .* will capture too much leaving only the last portion. e.g. [2]....
  • [\d] creates a bracketed character class that matches one of the following characters: d or \
  • The second .* suffers from the same problem as the first when using the g flag.
  • Why not 3 capture groups? You can certainly have more capture groups, but in this case they are unnecessary, so remove them.
  • Missing g flag. This means the command will only do 1 substitution per line which will leave plenty of text.
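
The points in this list can be checked with a few expressions. This is a minimal sketch on a shortened, made-up sample line (not one of the lines from the question); =~ yields 1 when the pattern matches somewhere in the string and 0 when it does not.

```vim
" Hypothetical short sample, in the same shape as the question's data.
let sample = '13_[4]_3_[2]_2'

" [\d] is a collection (backslash or d), neither of which is in the sample: 0.
echo sample =~ '\v[\d]'

" With the brackets escaped, a literal [digit] is found: 1.
echo sample =~ '\v\[\d\]'

" Even with escaped brackets, the greedy first .* runs up to the LAST
" bracketed digit, so a single substitution keeps only that one.
echo substitute(sample, '\v(.*)(\[\d\])(.*)', '\2', '')
" [2]
```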

General regex and substitution advice

When working with a tricky regex pattern it is often best to start with a search, /, instead of a substitution. This lets you see where the matches are beforehand. You can tweak your search by typing / and pressing <up> or <c-p>. Even better, use q/ to open the command-line-window so you can edit your pattern like any other text. You can also press <c-f> while on the command line (including /) to bring up the command-line-window.

Once you have your pattern, start your substitution. Vim provides a shortcut for reusing the current search: leave the pattern empty, e.g. :%s//\1/g.

This technique, especially when combined with set incsearch and set hlsearch, means you can see your matches interactively before you do your substitutions. It is shown in the following Vimcast episode: Refining search patterns with the command-line window.
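
Put together, the workflow might look like the following sketch, typed in order in a Vim session. The pattern is the one from this answer; lines starting with a double quote are comments and are not typed.

```vim
" Highlight matches while typing and keep them highlighted afterwards.
:set incsearch hlsearch

" Search first, and refine until only the intended text is highlighted.
/\v[^[]*(\[\d\])[^[]*

" An empty pattern reuses the last search, so \1 is that search's group.
:%s//\1/g
```

If the result is not what you expected, u undoes the substitution and the search can be refined again.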

Need to learn more regex syntax? See :h pattern. It is a very long and dense read, but will greatly aid you in the future. I also find reading Perl's regex documentation via perldoc perlre to be a good place to look as well. Note: Perl's regexes are different from Vim's regexes (See :h perl-patterns), but Perl Compatible Regular Expressions (PCRE) are very common.

Thoughts

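You may also consider grep -o, e.g. :%!grep -o '\[[0-9]\]'. Two caveats: grep -o prints each match on its own line, so the matches are no longer grouped per input line, and grep does not understand \d by default (GNU grep needs -P for that), hence [0-9].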

More help

:h :s
:h range
:h magic
:h /\[
:h /\(
:h s/\1
:h /\d
:h :s_flags
:h 'hlsearch'
:h 'incsearch'
:h q/
:h command-line-window
:h :range!

Peter Rincker answered Oct 11 '22 16:10