
How to delete non-contiguous duplicate lines in vi without sorting?

Tags: shell, vi, unix

I know how to remove contiguous duplicates in vi, with either

:%!uniq

or

:g/^\(.*\)$\n\1$/d

But I have a file whose lines are in random order, with duplicate lines scattered all over it. How do I remove these duplicates without disturbing the order of the lines? The first occurrence of each line should be kept, and every later duplicate should be removed.

E.g. cat file1

Here's looking at you, Kid.
Casablanca 
Here's looking at you, Kid.
Go ahead, make my day. 
Dirty Harry
sleep 5
Go ahead, make my day. 
Yippee-ki-yay

Output should be:

Here's looking at you, Kid.
Casablanca 
Go ahead, make my day. 
Dirty Harry
sleep 5
Yippee-ki-yay

Ajim Bagwan asked Mar 21 '23 09:03


1 Answer

There is an awk one-liner that is very handy for this:

$ awk '!a[$0]++' file
Here's looking at you, Kid.
Casablanca 
Go ahead, make my day. 
Dirty Harry
sleep 5
Yippee-ki-yay

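It keeps track of the lines seen so far in the array a[], keyed by the full line ($0). The first time a line appears, a[$0] is still 0, so !a[$0] is true and awk prints the line; the post-increment then raises the count to 1. Whenever the same line comes up again, the count is already positive, so the condition is false and the line is not printed.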
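The same logic can be wrapped in a small shell function, as a minimal sketch. The function name dedupe_keep_first and the array name seen are illustrative choices, not part of the original answer; the awk program is otherwise the one shown above.

```shell
# Hypothetical helper: print each distinct line once, keeping the first
# occurrence and the original order. Reads the named files, or stdin if none.
dedupe_keep_first() {
    # seen[$0] starts at 0, so !seen[$0]++ is true only on the first
    # occurrence of a line; awk's default action then prints it.
    awk '!seen[$0]++' "$@"
}

# Usage: scattered duplicates are dropped without sorting.
printf '%s\n' a b a c b | dedupe_keep_first
```

For the input a, b, a, c, b this prints a, b, c, one per line. Note that the comparison is on the whole line, so lines differing only in trailing whitespace count as different.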

If you want to run it in vim, do:

:%!awk '\!a[$0]++'
        ^^
       the ! must be escaped, because vim otherwise replaces a bare ! in a filter command with the previous external command

fedorqui 'SO stop harming' answered Apr 06 '23 10:04