I have been trying to get the unique values in each column of a tab-delimited file in bash, so I used the following command:
cut -f <column_number> <filename> | sort | uniq -c
It works fine, and I get the unique values in a column along with their counts, like this:
105 Linux
55 MacOS
500 Windows
Instead of sorting by the column values (which in this example are OS names), I want to sort by count, and if possible have the count in the second column. So the output would have to look like:
Windows 500
Linux 105
MacOS 55
How do I do this?
Use:
cut -f <col_num> <filename> |
sort |
uniq -c |
sort -r -k1 -n |
awk '{print $2" "$1}'
The sort -r -k1 -n sorts in reverse order, using the first field as a numeric value. The awk simply reverses the order of the columns. You can test the added pipeline commands thus (with nicer formatting):
pax> echo '105 Linux
55 MacOS
500 Windows' | sort -r -k1 -n | awk '{printf "%-10s %5d\n",$2,$1}'
Windows      500
Linux        105
MacOS         55
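One caveat: awk splits on whitespace, so print $2" "$1 keeps only the first word of a value such as "Windows 10". If your column can contain spaces, something like the following may work better. This is only a sketch, not part of the original pipeline: count_column is a made-up helper name, and the tab between value and count is my own choice of output format.

```shell
#!/usr/bin/env bash
# count_column is a hypothetical helper (the name is an assumption).
# Usage: count_column <col_num> <filename>
count_column() {
    cut -f "$1" "$2" |
    sort |
    uniq -c |
    # -k1,1 limits the sort key to the count field; n = numeric, r = descending.
    sort -k1,1nr |
    # Save the count, strip the leading padding and count that uniq -c added,
    # then print the whole remaining value followed by a tab and the count.
    awk '{ count = $1; sub(/^[ \t]*[0-9]+[ \t]/, ""); printf "%s\t%d\n", $0, count }'
}
```

Calling count_column 2 data.tsv (with data.tsv standing in for your file) should print each distinct value of the second column followed by its count, highest count first. Values with the same count come out in whatever order sort's fallback comparison gives them.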