I have a big file that contains many lines in the following format:
<SomeString1>Key1</SomeString>
<SomeString2>Key2</SomeString>
<SomeString3>Key3</SomeString>
...
I want to remove the tags so that the output looks like this:
Key1
Key2
Key3
...
Algorithmically, I should write something like:
For all lines:
Remove everything up to and including the first `>`
Remove everything from `</` to the end of the line
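The steps above can be sketched outside Vim as a short Python function. This only illustrates the algorithm and is not a Vim solution. The name `strip_tags_line` is made up for this example, and the sketch assumes each line holds one opening tag followed by one closing tag.

```python
def strip_tags_line(line: str) -> str:
    """Keep only the text between the first '>' and the following '</'."""
    # Remove everything up to and including the first '>'.
    _, sep, rest = line.partition(">")
    if not sep:
        return line  # no '>' found: leave the line untouched
    # Remove everything from the next '</' to the end of the line.
    key, _, _ = rest.partition("</")
    return key


lines = [
    "<SomeString1>Key1</SomeString>",
    "<SomeString2>Key2</SomeString>",
    "<SomeString3>Key3</SomeString>",
]
for line in lines:
    print(strip_tags_line(line))
```

Lines without a `>` are returned unchanged, so running it over a mixed file should not destroy untagged lines.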
To delete a word, position the cursor at the beginning of the word and type dw. The word and the space it occupied are removed. To delete part of a word, position the cursor on the first character to be removed, just to the right of the part to be saved, and type dw to delete the rest of the word.
The command dw deletes from the current cursor position to the beginning of the next word. The command d$ (note, that's a dollar sign, not an 'S') deletes from the current cursor position to the end of the current line. D (uppercase D) is a synonym for d$ (lowercase d + dollar sign).
Simply use a replace regex:
:%s/<[^>]*>//g
This will apply the `s` (substitution) command for each line (`%`) and remove all `<...>` sequences for the entire line (`g`).
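To see what the `%` range and the `g` flag each contribute, here is a rough Python analogue of that substitution. It is a sketch, not what Vim does internally: `strip_tags` and its `every_match` parameter are hypothetical names, and the pattern `<[^>]*>` happens to be written the same way in Python's `re` syntax.

```python
import re

# '<', then any run of characters other than '>', then '>'
TAG = re.compile(r"<[^>]*>")


def strip_tags(text: str, every_match: bool = True) -> str:
    # count=0 replaces every match on a line (like the g flag);
    # count=1 replaces only the first match (like omitting g).
    count = 0 if every_match else 1
    # Processing line by line mirrors the % range in Vim.
    return "\n".join(TAG.sub("", line, count=count) for line in text.split("\n"))


sample = "<SomeString1>Key1</SomeString>\n<SomeString2>Key2</SomeString>"
print(strip_tags(sample))
print(strip_tags(sample, every_match=False))
```

With `every_match=False` the closing tags are left behind, which is the same effect as dropping `g` from the Vim command.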
There are many situations in which substitution commands like this come in handy, especially combined with regular expressions. You can find more information in Vim's built-in help under :help :substitute.
These two commands should do the trick:
:%s/<\w*>//
:%s/<\/\w*>//
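The first removes the opening tag on each line and the second removes the closing tag. `<\w*>` matches any number of word characters (letters, digits, and underscores) between `<` and `>`, and `<\/\w*>` matches any number of word characters between `</` and `>`. Without the `g` flag, each command replaces only the first match on a line, which is enough when every line holds a single pair of tags.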
Edit: a simpler way:
:%s/<.\{-}>//g
Note that this:
:%s/<.*>//g
won't work because `*` is "greedy": `<.*>` matches from the first `<` to the last `>` on the line, which here is the whole line. `\{-}` is the non-greedy equivalent. Read more about greediness here: http://vimregex.com/#Non-Greedy
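The difference between greedy and non-greedy matching can be checked quickly outside Vim. The snippet below is a small Python illustration, assuming one of the sample lines from the question. Note that Python writes the non-greedy quantifier as `.*?`, where Vim uses `.\{-}`.

```python
import re

line = "<SomeString1>Key1</SomeString>"

# Greedy: '.*' grabs as much as it can, so the match runs from the
# first '<' to the last '>' and swallows the key as well.
greedy = re.sub(r"<.*>", "", line)

# Non-greedy: '.*?' stops at the nearest '>', so each tag is matched
# separately and the key survives.
non_greedy = re.sub(r"<.*?>", "", line)

print(repr(greedy))      # ''
print(repr(non_greedy))  # 'Key1'
```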