 

Removing duplicates in grep output

Tags:

grep

bash

I have a results file with entries in the following pattern:

path:pattern found

for example

./user/home/file1:this is a game

In other words, when I searched for a string, I got the file and the line where it was found.

The problem is that sometimes there are multiple matches in the same file, so I would like to remove the duplicate file entries (the matched lines themselves differ, so deduplicating whole lines is not possible).

Any help or ideas are appreciated :)

The end result is to turn this:

/user/home/desktop/file1:this is a game
/user/home/desktop/file1:what kind of game
/user/home/desktop/file1:fast action game

into just the first result found, without losing the rest of the data in the file.

Update1:

So the actual file looks like this:

/user/home/desktop/file1:this is a game
/user/home/desktop/file1:what kind of game
/user/home/desktop/file1:fast action game
/user/home/desktop/file2:a game
/user/home/desktop/file3:of game
/user/home/desktop/file4:fast game

I'm looking to get rid of the multiple occurrences from the same file, so it should look like this:

/user/home/desktop/file1:this is a game
/user/home/desktop/file2:a game
/user/home/desktop/file3:of game
/user/home/desktop/file4:fast game
asked Mar 16 '18 by john mas



1 Answer

You could use sort -u:

grep pattern files | sort -t: -u -k1,1
  • -t: - use : as the delimiter
  • -k1,1 - sort based on the first field only
  • -u - remove duplicates (based on the first field)

This will retain just one line per file, removing the duplicates.

For your example, this is the output you get:

/user/home/desktop/file1:this is a game
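
If the grep output has already been saved to a results file, the same command can be run on that file directly (results.txt below is just a placeholder name for your saved output):

sort -t: -u -k1,1 results.txt

Note that sort reorders the lines by path, so the output is grouped by file name rather than kept in the original grep order.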

In case you want to keep multiple distinct matches within a file, then:

grep pattern files | sort -u
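
If you'd rather keep only the first match per file while preserving the original order of the grep output (sort rearranges lines by path), an awk filter along these lines should also work (seen is just an arbitrary array name):

grep pattern files | awk -F: '!seen[$1]++'

Here -F: splits each line on :, and !seen[$1]++ prints a line only the first time its first field (the path) appears, so later matches from the same file are dropped.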
answered Nov 11 '22 by codeforester