
Remove Duplicate Line in Vim?

Tags: regex, vim, xml

I'm trying to use Vim to remove a duplicate line in an XML file I created. (I can't recreate the file because the ID numbers will change.)

The file looks something like this:

    <tag k="natural" v="water"/>
    <tag k="nhd:fcode" v="39004"/>
    <tag k="natural" v="water"/>

I'm trying to remove one of the duplicate k="natural" v="water" lines. When I use the \_ modifier to include newlines in my regex substitutions, Vim doesn't seem to find anything.

Any tips on what regex or tool to use?

asked Dec 13 '09 15:12 by magneticMonster




1 Answer

First of all, you can use awk to remove all duplicate lines, keeping their order.

:%!awk '\!_[$0]++'
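To see what that filter does, here is a minimal sketch of the same awk program run from a plain shell instead of Vim. The function name dedupe_lines and the array name seen are made up for illustration (the command above uses _ as the array name). Outside Vim the ! needs no backslash; inside a Vim :! command a bare ! is replaced by the previous shell command, which is why it is escaped above.

```shell
# Hypothetical helper: order-preserving de-duplication of stdin.
dedupe_lines() {
    # seen[$0]++ evaluates to 0 (false) the first time a line appears,
    # so !seen[$0]++ is true only for first occurrences; awk's default
    # action then prints those lines.
    awk '!seen[$0]++'
}

# Sample input taken from the question.
printf '%s\n' \
    '<tag k="natural" v="water"/>' \
    '<tag k="nhd:fcode" v="39004"/>' \
    '<tag k="natural" v="water"/>' | dedupe_lines
```

This should print the first two lines only, in their original order.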

If you are not sure whether there are other duplicate lines that you don't want to remove, just add conditions.

:%!awk '\!(_[$0]++ && /tag/ && /natural/ && /water/)'
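As a rough illustration of how the extra conditions behave, the sketch below runs the same program from a plain shell on made-up input. The function name dedupe_matching, the array name seen, and the node lines are assumptions for the example, not part of the question's file.

```shell
# Hypothetical helper: drop a repeated line only if it also matches
# all three patterns; every other line passes through untouched.
dedupe_matching() {
    # seen[$0]++ is evaluated first for every line, so the count is
    # always updated; the pattern tests run only for repeated lines.
    awk '!(seen[$0]++ && /tag/ && /natural/ && /water/)'
}

printf '%s\n' \
    '<tag k="natural" v="water"/>' \
    '<node id="1"/>' \
    '<tag k="natural" v="water"/>' \
    '<node id="1"/>' | dedupe_matching
```

Here the second water tag is dropped, while the repeated node line is kept because it does not match the patterns.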

However, parsing a nested structure like XML with regexes is a bad idea, IMHO: you have to take care all the time not to break it. xmllint can list the specific elements in the file:

:!echo "cat //tag[@k='natural' and @v='water']" | xmllint --shell %

You can then delete the duplicate lines one at a time.

answered Sep 19 '22 03:09 by ernix