
counting duplicates in a sorted sequence using command line tools

I have a command (cmd1) that greps through a log file to extract a set of numbers. The numbers come out in random order, so I use sort -gr to get them sorted in descending order. The sorted list may contain duplicates. I need the count of each unique number in that list.

For example, if the output of cmd1 is:

    100
    100
    100
    99
    99
    26
    25
    24
    24

I need another command that I can pipe the above output to, so that I get:

    100     3
    99      2
    26      1
    25      1
    24      2
letronje asked Jul 07 '09 13:07



2 Answers

How about:

    $ echo "100 100 100 99 99 26 25 24 24" \
        | tr " " "\n" \
        | sort \
        | uniq -c \
        | sort -k2nr \
        | awk '{printf("%s\t%s\n",$2,$1)}'

The result is:

    100 3
    99  2
    26  1
    25  1
    24  2
Stephen Paul Lesniewski answered Oct 11 '22 11:10


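uniq -c works (with GNU uniq 8.23 at least) and does exactly what you want, assuming sorted input. Note that it prints the count before the value, so the columns come out in the opposite order from the one in the question.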
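
A minimal sketch of that approach, assuming the numbers arrive one per line and are already sorted. Here printf stands in for the output of cmd1 piped through sort -gr, and count_sorted is a hypothetical helper name, not something from the question. Because uniq -c puts the count first, a small awk step swaps the two columns to match the requested layout.

```shell
# count_sorted: hypothetical helper name (an assumption for illustration).
# Reads already-sorted values, one per line, from stdin and prints
# "value<TAB>count" for each distinct value.
count_sorted() {
    # uniq -c emits "count value"; awk swaps the fields.
    uniq -c | awk '{ printf "%s\t%s\n", $2, $1 }'
}

# printf stands in for `cmd1 | sort -gr`.
printf '%s\n' 100 100 100 99 99 26 25 24 24 | count_sorted
```

Since the input is already grouped by the earlier sort -gr, no second sort is needed: uniq only has to collapse adjacent duplicates, and the descending order is preserved.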

Ibrahim answered Oct 11 '22 11:10