
Remove duplicate entries in a Bash script [duplicate]

Tags:

bash

shell

People also ask

How do I remove duplicate lines in Linux?

The uniq command removes duplicate lines from a text file in Linux. By default it discards all but the first of each run of adjacent repeated lines, so duplicates that are not next to each other remain unless the input is sorted first.

How do you sort and remove duplicates in bash?

uniq reads its input (stdin, or a file named as an argument) and filters out repeated lines. It can also report the duplicated lines instead of removing them. Because uniq only compares adjacent lines, sort the input first, or use sort -u to sort and deduplicate in one step.
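To illustrate the "find the duplicate lines" use mentioned above, here is a small sketch. The sample input is made up for illustration; -d and -u are standard uniq options.

```shell
# Sample input is made up for illustration.
input=$(printf 'pear\napple\npear\nbanana\napple\n')

# -d prints one copy of each line that occurs more than once.
dupes=$(printf '%s\n' "$input" | sort | uniq -d)

# -u prints only the lines that occur exactly once.
singles=$(printf '%s\n' "$input" | sort | uniq -u)

printf 'repeated:\n%s\nunrepeated:\n%s\n' "$dupes" "$singles"
```

Both pipelines sort first because uniq only compares neighbouring lines.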

Which command is used to remove the duplicate records in file?

The uniq command removes or detects duplicate entries in a file.


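You can sort and remove duplicates in one step (equivalent to sort input.txt | uniq; the output is sorted, so the original line order is lost):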

$ sort -u input.txt
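As a quick check, the sketch below builds a throwaway file (the name sample.txt and its contents are assumptions for illustration) and compares sort -u with the two-step sort | uniq pipeline.

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf 'pear\napple\npear\nbanana\napple\n' > "$tmpdir/sample.txt"

# Sort and drop duplicates in one step.
sorted_unique=$(sort -u "$tmpdir/sample.txt")

# Two-step pipeline that should give the same lines.
piped_unique=$(sort "$tmpdir/sample.txt" | uniq)

printf '%s\n' "$sorted_unique"
rm -rf "$tmpdir"
```

For this input both forms print apple, banana, pear, one per line.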

Or use awk, which keeps the first occurrence of each line and preserves the original order:

$ awk '!a[$0]++' input.txt
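The awk idiom is terse, so here is a longer spelling of the same logic as a sketch. The array name seen and the sample input are illustrative assumptions. a[$0]++ yields the count before incrementing, so !a[$0]++ is true only the first time a line appears.

```shell
# Sample input is made up for illustration.
input=$(printf 'pear\napple\npear\nbanana\napple\n')

# Short form: print a line only the first time it is seen.
short=$(printf '%s\n' "$input" | awk '!a[$0]++')

# Expanded form of the same logic; "seen" is just a clearer array name.
long=$(printf '%s\n' "$input" | awk '{
    if (seen[$0] == 0) print   # first occurrence, so print the line
    seen[$0]++                 # count every occurrence
}')

printf '%s\n' "$short"
```

For this input both forms print pear, apple, banana, in the order the lines first appeared.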

This sed one-liner deletes consecutive duplicate lines from its input (it emulates "uniq").
The first line in each run of duplicates is kept; the rest are deleted.

sed '$!N; /^\(.*\)\n\1$/!P; D'
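A small sketch of what "consecutive" means here, using made-up input fed on stdin. A later, non-adjacent repeat survives, just as it would with uniq.

```shell
# Sample input (made up): "a" repeats both adjacently and again later on.
input=$(printf 'a\na\nb\na\n')

# The sed one-liner reads stdin here; only the adjacent pair of "a" collapses.
sed_out=$(printf '%s\n' "$input" | sed '$!N; /^\(.*\)\n\1$/!P; D')

# uniq behaves the same way on unsorted input.
uniq_out=$(printf '%s\n' "$input" | uniq)

printf '%s\n' "$sed_out"
```

To drop the final a as well, sort the input first or use the awk approach.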

Perl one-liner similar to @kev's awk solution:

perl -ne 'print if ! $a{$_}++' input

This variation removes trailing whitespace before comparing:

perl -lne 's/\s*$//; print if ! $a{$_}++' input

This variation edits the file in-place:

perl -i -ne 'print if ! $a{$_}++' input

This variation edits the file in-place and keeps a backup of the original as input.bak:

perl -i.bak -ne 'print if ! $a{$_}++' input