Count number of values of one column group by value of another column

Question

I have a text file like this:

asn|prefix|ip|domain
25008|85.192.184.0/21|85.192.184.59|solusi-it.com
25008|85.192.184.0/21|85.192.184.59|samtimes.ru
131755|103.31.224.0/24|103.31.224.58|karosel-ind.com
131755|103.31.224.0/24|103.31.224.58|solusi-it.com
9318|1.232.0.0/13|1.234.91.168|solusi-it.com
9318|1.232.0.0/13|1.234.91.168|es350.co.kr

Is there a way to count the number of unique IPs per domain with a Linux Bash command and get a result like this?

domain|count_ip
solusi-it.com|3
samtimes.ru|1
karosel-ind.com|1
es350.co.kr|1

Gilles Quenot · Accepted Answer

With perl:

perl -F'\|' -lane '
    $. > 1 and $domains->{$F[3]}->{$F[2]}++;
    END{
        print "domain|count_ip";
        print $_, "|", scalar keys %{ $domains->{$_} } for keys %$domains;
    }
' file | tee new_file

The idea behind this is to use a hash of hashes:

$domains->{$F[3]}->{$F[2]}++

Here $F[3] is the domain and $F[2] is the IP. Uniqueness is guaranteed because a hash key is always unique, so counting the inner keys gives the number of distinct IPs per domain.

OUTPUT:

domain|count_ip
es350.co.kr|1
karosel-ind.com|1
samtimes.ru|1
solusi-it.com|3
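
For comparison, the same idea can be sketched in awk with a composite "domain|ip" key instead of a nested hash. This is only an illustration, not part of the perl answer: the temporary file, the variable names (seen, count, result) and the extra duplicated row at the end of the sample data are assumptions added to show that repeated pairs are counted once. The final sort is there only to make the output order predictable.

```shell
#!/bin/sh
# Sample data from the question, written to a temporary file.
# The last row repeats the first data row on purpose (added for illustration).
f=$(mktemp)
cat > "$f" <<'EOF'
asn|prefix|ip|domain
25008|85.192.184.0/21|85.192.184.59|solusi-it.com
25008|85.192.184.0/21|85.192.184.59|samtimes.ru
131755|103.31.224.0/24|103.31.224.58|karosel-ind.com
131755|103.31.224.0/24|103.31.224.58|solusi-it.com
9318|1.232.0.0/13|1.234.91.168|solusi-it.com
9318|1.232.0.0/13|1.234.91.168|es350.co.kr
25008|85.192.184.0/21|85.192.184.59|solusi-it.com
EOF

# seen[] is keyed on "domain|ip", so each pair is counted only the first time.
# count[] then holds the number of distinct IPs per domain.
result=$(awk -F'|' '
    NR > 1 && !seen[$4 FS $3]++ { count[$4]++ }
    END { for (d in count) print d FS count[d] }
' "$f" | LC_ALL=C sort)

printf 'domain|count_ip\n%s\n' "$result"
rm -f "$f"
```

Because the header is skipped with NR > 1 before anything is sorted, this variant reads the file once and needs no separate deduplication step.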

fredtantini · Answer

Using awk:

~$ awk -F'|' 'NR>1{a[$NF]++}END{print "domain|count_ip";for (i in a){print i FS a[i]}}' f
domain|count_ip
karosel-ind.com|1
solusi-it.com|3
samtimes.ru|1
es350.co.kr|1

The -F'|' option sets the field separator so that fields are split on |.
Note that this counts lines per domain; it does not check whether the IP is already in the array a.

To handle that, you could use sort -u to keep only unique combinations of the 3rd and 4th fields:

~$ cat f f >f2
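~$ sort -t'|' -k3,4 -u f2 | awk -F'|' '$NF!="domain"{a[$NF]++}END{print "domain|count_ip";for (i in a){print i FS a[i]}}'
domain|count_ip
karosel-ind.com|1
solusi-it.com|3
samtimes.ru|1
es350.co.kr|1

The header is skipped by content ($NF!="domain") rather than with NR>1, because after sorting the header line is no longer first: NR>1 would drop a data row and count the header as a domain. The order of the domain lines depends on the awk implementation.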
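
If you would rather avoid awk arrays altogether, the sort -u idea can be sketched with coreutils alone: drop the header, keep the ip and domain columns, deduplicate the pairs, then count the lines per domain. This is a rough sketch under assumptions, not a drop-in replacement: the temporary file and the result variable are made up for the example, the input repeats some rows on purpose, and it assumes domains contain no whitespace (the final awk only swaps the two columns that uniq -c prints).

```shell
#!/bin/sh
# Sample data from the question, with two rows repeated on purpose
# (added for illustration) so that deduplication has something to do.
f=$(mktemp)
cat > "$f" <<'EOF'
asn|prefix|ip|domain
25008|85.192.184.0/21|85.192.184.59|solusi-it.com
25008|85.192.184.0/21|85.192.184.59|samtimes.ru
131755|103.31.224.0/24|103.31.224.58|karosel-ind.com
131755|103.31.224.0/24|103.31.224.58|solusi-it.com
9318|1.232.0.0/13|1.234.91.168|solusi-it.com
9318|1.232.0.0/13|1.234.91.168|es350.co.kr
9318|1.232.0.0/13|1.234.91.168|es350.co.kr
25008|85.192.184.0/21|85.192.184.59|solusi-it.com
EOF

result=$(
    tail -n +2 "$f" |            # skip the header before sorting
    cut -d'|' -f3,4 |            # keep only ip|domain
    LC_ALL=C sort -u |           # one line per unique (ip, domain) pair
    cut -d'|' -f2 |              # keep only the domain
    LC_ALL=C sort |              # group identical domains together
    uniq -c |                    # count lines per domain
    awk '{ print $2 "|" $1 }'    # "count domain" -> "domain|count"
)

printf 'domain|count_ip\n%s\n' "$result"
rm -f "$f"
```

Removing the header with tail before sorting sidesteps the problem of the header line ending up somewhere in the middle of the sorted data.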

Tags: linux, bash, count

UserYmY
