I have a text file with 10 columns, say f.txt, which looks like this:
aab abb  263-455
aab abb  263-455
aab abb  263-455
bbb abb  26-455
bbb abb  26-455
bbb aka  264-266
bga bga  230-232
bga bga  230-232
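I want to count, for each string in the first and second columns, how many distinct third-column values it occurs with. The two columns are counted separately, so a string that appears in both columns (such as bga) is counted once per column.
Expected output: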
aab - 1
abb - 2
bbb - 2
aka - 1
bga - 2
Total no - 8
awk '
    # Count each (column, string, third-column value) combination only once.
    !s[1":"$1":"$3]++ { sU[$1]++; tot++ }
    !s[2":"$2":"$3]++ { sU[$2]++; tot++ }
    END {
        for (x in sU) print x, sU[x]
        print "Total No -", tot
    }' input
Output (awk does not guarantee the iteration order of for-in, so the lines may appear in a different order):
bga 2
aab 1
bbb 2
aka 1
abb 2
Total No - 8
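The command above prints "name count" in whatever order awk happens to iterate, while the question asks for "name - count". A minimal sketch of one way to get that format with a stable order is below. It uses the same counting rule, wrapped in a function named count_by_column, which is a hypothetical helper and not part of the original answer. It assumes a sort command is available, pipes the per-string lines through it from inside awk, and closes the pipe so the total is printed last.

```shell
# Hypothetical helper (not from the original answer): same counting rule,
# printed as "name - count" and sorted by name.
count_by_column() {
    awk '
        # Count each (column, string, third-column value) combination once.
        !seen[1 ":" $1 ":" $3]++ { count[$1]++; total++ }
        !seen[2 ":" $2 ":" $3]++ { count[$2]++; total++ }
        END {
            for (name in count) print name " - " count[name] | "sort"
            close("sort")   # wait for sort to finish before printing the total
            print "Total no - " total
        }' "$@"
}

# Demo on the sample data from the question (read from standard input).
count_by_column <<'EOF'
aab abb  263-455
aab abb  263-455
aab abb  263-455
bbb abb  26-455
bbb abb  26-455
bbb aka  264-266
bga bga  230-232
bga bga  230-232
EOF
```

To run it against a file instead, pass the file name as an argument, for example: count_by_column f.txt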
This will do the trick, provided duplicate rows are identical across the whole line:
$ awk '!a[$0]++{c[$1]++;c[$2]++}
       END{for(k in c){print k" - "c[k];s+=c[k]}print "\nTotal No -",s}' file
aka - 1
bga - 2
aab - 1
abb - 2
bbb - 2

Total No - 8
In a more readable script form:
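# Process each distinct input line only once.
!lines[$0]++ {
    count[$1]++
    count[$2]++
}
END {
    # Print the per-string counts and accumulate the total.
    for (key in count) {
        print key " - " count[key]
        sum += count[key]
    }
    print "\nTotal No -", sum
}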
To run it in this form, save it to a file named script.awk and run:
$ awk -f script.awk file
aka - 1
bga - 2
aab - 1
abb - 2
bbb - 2

Total No - 8
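Because this script deduplicates on the whole line ($0), two rows that agree in the first three columns but differ elsewhere would both be counted. The question mentions 10 columns, so that may matter. Below is a minimal sketch of a variant that deduplicates on columns 1 to 3 only, assuming the remaining columns should be ignored. The function name count_by_triple is hypothetical. Note that keying on the whole triple is not the same rule as keying on each column separately, so the two approaches can give different counts on other data.

```shell
# Hypothetical variant (an assumption, not the answer's code): deduplicate on
# columns 1-3 only, so differences in later columns do not inflate the counts.
count_by_triple() {
    awk '
        # seen[$1, $2, $3] joins the three fields with SUBSEP as one array key.
        !seen[$1, $2, $3]++ { count[$1]++; count[$2]++ }
        END {
            for (name in count) {
                print name " - " count[name]
                sum += count[name]
            }
            print "Total No - " sum
        }' "$@"
}

# Two rows that differ only in a fourth column are counted once.
printf 'aab abb 263-455 x\naab abb 263-455 y\n' | count_by_triple
```

For this two-row input the sketch prints aab - 1 and abb - 1 (in an unspecified order), followed by Total No - 2.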