I have this input file (1 = active, 0 = inactive):
a 1
a 0
b 1
b 1
b 0
c 0
c 0
c 0
c 0
.
.
.
And I want output like this:
X  repeats  active count  inactive count
a  2 times  1             1
b  3 times  2             1
c  4 times  0             4
I tried:
awk -F "," '{if ($2==1) a[$1]++; } END { for (i in a); print i, a[i] }'file name
But that did not work.
How can I get the output?
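Two things in the attempt above appear to stop it from working. First, -F "," splits on commas, but the sample input is space-separated, so $2 is always empty. Second, the semicolon right after for (i in a) gives the loop an empty body, so print runs only once, after the loop has finished. A minimal corrected sketch follows, with the sample data inlined; the variable name fixed is only illustrative.

```shell
# Sample rows from the question, fed on stdin instead of from a file.
# Default whitespace splitting is used (no -F), and the stray ";" after
# "for (i in a)" is removed so that print is the loop body.
fixed=$(printf '%s\n' 'a 1' 'a 0' 'b 1' 'b 1' 'b 0' 'c 0' 'c 0' 'c 0' 'c 0' |
  awk '{ if ($2 == 1) a[$1]++ } END { for (i in a) print i, a[i] }' | sort)
echo "$fixed"
```

Even when corrected, this only counts active rows, so a key such as c that is never active does not appear at all. The inactive and total counts need their own arrays.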
Just to give you an idea, this awk command should work:
awk '$2{a[$1]++; next} {b[$1]++; if (!($1 in a)) a[$1]=0} END{for (i in a) print i, a[i], b[i], (a[i]+b[i])}' file
a 1 1 2
b 2 1 3
c 0 4 4
The columns are the key, the active count, the inactive count, and the total. You can format the output the way you want.
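As a sketch of that formatting step, the same counting logic can be combined with printf to produce the layout asked for in the question. The temporary file and the variable names input and out are assumptions for illustration, and the pipe to sort is there because awk visits array keys in an unspecified order.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
printf '%s\n' 'a 1' 'a 0' 'b 1' 'b 1' 'b 0' 'c 0' 'c 0' 'c 0' 'c 0' > "$input"

# a[] counts active rows and b[] counts inactive rows, as in the command above.
# %d prints 0 for an element that was never set.
out=$(awk '$2 { a[$1]++; next }
           { b[$1]++; if (!($1 in a)) a[$1] = 0 }
           END { for (i in a)
                   printf "%s %d times %d %d\n", i, a[i] + b[i], a[i], b[i] }' "$input" | sort)

echo "X repeats active count inactive count"
echo "$out"
rm -f "$input"
```

With the sample data this prints the rows a 2 times 1 1, b 3 times 2 1 and c 4 times 0 4 under the header line.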
You can try
awk -f r.awk input.txt
where input.txt is your data file and r.awk is:
{
    X[$1]++
    if ($2) a[$1]++
    else ia[$1]++
}
END {
    printf "X\tRepeat\tActive\tInactive\n"
    for (i in X) {
        printf "%s\t%d\t%d\t%d\n", i, X[i], a[i], ia[i]
    }
}
awk '{a[$1]++; if ($2!=0) {b[$1]++;c[$1]+=0} else {c[$1]++;b[$1]+=0}}END {for (i in a) print i, a[i], b[i], c[i]}' file
Here is another simple way to do it with awk:
awk '{a[$1]++;b[$1]+=$2} END { for (i in a) print i,a[i],b[i],a[i]-b[i]}' file
a 2 1 1
b 3 2 1
c 4 0 4
No test is needed: because column 2 is always 1 or 0, summing $2 gives the active count, and the inactive count is the total minus that sum.
awk '
  { repeats[$1]++; counts[$1,$2]++ }
  END {
    for (key in repeats)
      print key, repeats[key], counts[key,1]+0, counts[key,0]+0
  }
' file
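The +0 in the command above matters because an awk array element that was never assigned evaluates to an empty string, so print would show a blank instead of 0 for a key with no active (or no inactive) rows. A small sketch of the difference, using a single hypothetical input row and illustrative variable names:

```shell
# Only the row "c 0" is read, so counts["c",1] is never assigned.
# Without +0 the element prints as an empty string.
without=$(printf 'c 0\n' | awk '{ counts[$1,$2]++ } END { print "[" counts["c",1] "]" }')
# Adding 0 forces a numeric value, so the element prints as 0.
with=$(printf 'c 0\n' | awk '{ counts[$1,$2]++ } END { print "[" (counts["c",1] + 0) "]" }')
echo "$without $with"
```

The first result is [] and the second is [0].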