I have a command (cmd1) that greps through a log file to pull out a set of numbers. The numbers are in random order, so I use sort -gr to get a reverse-sorted list. There may be duplicates within this sorted list. I need to find the count for each unique number in that list.
For example, if the output of cmd1 is:
100
100
100
99
99
26
25
24
24
I need another command that I can pipe the above output to, so that I get:
100 3
99 2
26 1
25 1
24 2
The uniq command has a convenient -c option to count the number of occurrences of each line in its input. This is precisely what we're looking for. However, one thing we must keep in mind is that uniq -c only counts duplicated lines when they are adjacent.
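Since the question's list is already sorted by sort -gr before counting, piping it straight into uniq -c is enough. A minimal sketch, with printf standing in for the real cmd1 output:

$ printf '%s\n' 100 100 100 99 99 26 25 24 24 | uniq -c
      3 100
      2 99
      1 26
      1 25
      2 24

Note that uniq -c prints the count before the value; the awk step in the answer below swaps the columns to match the desired output.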
If you'd rather do this in Python: use the list.count() method to count occurrences of a single element, or collections.Counter to find all duplicated elements in a list and count them in one pass.
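A minimal sketch of the Counter approach, called from the shell (assuming python3 is on your PATH; the echo stands in for cmd1's output):

$ echo "100 100 100 99 99 26 25 24 24" | python3 -c '
import sys, collections
for value, count in collections.Counter(sys.stdin.read().split()).items():
    print(value, count)'
100 3
99 2
26 1
25 1
24 2

Counter preserves first-seen order (Python 3.7+), so input that is already reverse-sorted comes out reverse-sorted.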
The uniq command in Linux is used to filter out repeated lines in a text file. This command can be helpful if you want to remove duplicate words or strings from a file. Since uniq matches only adjacent lines when looking for redundant copies, it works correctly only on sorted input.
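To see the adjacency requirement in action, here is a small illustration (not from the original answers). Without sort, the two 100s are counted separately because they are not next to each other:

$ printf '%s\n' 100 99 100 | uniq -c
      1 100
      1 99
      1 100
$ printf '%s\n' 100 99 100 | sort | uniq -c
      2 100
      1 99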
How about:
$ echo "100 100 100 99 99 26 25 24 24" \ | tr " " "\n" \ | sort \ | uniq -c \ | sort -k2nr \ | awk '{printf("%s\t%s\n",$2,$1)}END{print}'
The result is:

100	3
99	2
26	1
25	1
24	2
uniq -c
works with GNU uniq (coreutils 8.23) at least, and does exactly what you want (assuming sorted input).
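Putting it together with the question's own pipeline: since cmd1's output already goes through sort -gr, appending uniq -c and swapping the columns with awk is all that's left (cmd1 stands for the asker's grep command; a sketch, not tested against their log file):

$ cmd1 | sort -gr | uniq -c | awk '{print $2, $1}'
100 3
99 2
26 1
25 1
24 2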