Delete consecutive duplicate lines using unix utilities

This sounds simple on its face, but it is somewhat more complex. I would like to use a Unix utility to delete consecutive duplicate lines, keeping the first occurrence of each run. However, I also want to preserve duplicates that do not occur immediately after the original. For example, given the lines:

O B 
O B 
C D 
T V
O B

I want the output to be:

O B 
C D
T V
O B 

Although the first and last lines are the same, they are not consecutive and therefore I want to keep them as unique entries.

z.rubi asked Apr 06 '18 19:04




1 Answer

You can do:

cat file1 | uniq > file2

or more succinctly:

uniq file1 file2

assuming file1 contains

O B
O B
C D
T V
O B

For more details, see man uniq. In particular, note that the uniq command accepts up to two optional file arguments, an input file and an output file, with the following syntax: uniq [OPTION]... [INPUT [OUTPUT]].
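
As a quick check, here is a minimal sketch that recreates the sample input and runs uniq on it. The scratch directory created with mktemp -d is only an assumption made to keep the example self-contained; file1 and file2 are the names used above.

```shell
# Work in a throwaway directory so no existing files are overwritten
# (assumption for this sketch; any writable directory works).
cd "$(mktemp -d)"

# Recreate the sample input from the question.
printf 'O B\nO B\nC D\nT V\nO B\n' > file1

# Collapse each run of identical adjacent lines into a single line.
# The last "O B" is not adjacent to the first run, so it is kept.
uniq file1 file2

cat file2
# O B
# C D
# T V
# O B
```

Note that uniq compares whole lines, so two lines that differ only in trailing whitespace are not treated as duplicates. If that is a concern, the input can be normalized first, for instance with sed 's/[[:space:]]*$//' file1 | uniq.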

Finally, if you want to remove all duplicates (sorting the file along the way), you could do:

sort -u file1 > file2
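
For contrast, a small sketch of what sort -u produces on the same sample input. As before, the scratch directory is an assumption made only to keep the example self-contained.

```shell
# Work in a throwaway directory (assumption for this sketch).
cd "$(mktemp -d)"

# Same sample input as in the question.
printf 'O B\nO B\nC D\nT V\nO B\n' > file1

# Sort the lines and keep one copy of each distinct line.
sort -u file1 > file2

cat file2
# C D
# O B
# T V
```

Here the trailing O B is gone and the original order is lost, which is why plain uniq is the better fit for the question as asked.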
ErikMD answered Oct 31 '22 08:10