Let's say you have the following:
192.168.0.100
192.168.0.100
192.168.0.100
192.168.0.102
192.168.0.102
192.168.0.100
That's considered 3 unique hits: a run of consecutive identical IPs counts as one hit. How would you loop through the file and count accordingly?
If your uniq is like mine and collapses only identical lines that are adjacent, just don't sort before running uniq.
File foo.txt:
192.168.0.100
192.168.0.100
192.168.0.100
192.168.0.102
192.168.0.102
192.168.0.100
And:
$ cat foo.txt | uniq -c
edit: can I give myself a useless use of cat award?
$ uniq -c foo.txt
/edit
Output:
3 192.168.0.100
2 192.168.0.102
1 192.168.0.100
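If all you need is the total number of hits (the 3 from the question) rather than per-run tallies, collapse the runs with plain uniq and count the remaining lines:

```shell
# Recreate foo.txt from the question
printf '%s\n' 192.168.0.100 192.168.0.100 192.168.0.100 \
              192.168.0.102 192.168.0.102 192.168.0.100 > foo.txt
# uniq collapses each run to a single line; wc -l counts the runs
uniq foo.txt | wc -l    # prints 3
```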
I would avoid doing this in pure bash. Use a language better suited to it, like Python, awk, or even Perl.
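For the awk route mentioned above, the run counting fits in a one-liner. Here prev holds the previously seen line (a sketch; it assumes input lines are never empty, since prev starts out as the empty string):

```shell
# Bump the counter whenever the current line differs from the previous one
printf '%s\n' 192.168.0.100 192.168.0.100 192.168.0.100 \
              192.168.0.102 192.168.0.102 192.168.0.100 |
awk '$0 != prev { count++; prev = $0 } END { print count }'
# prints 3
```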
Python
#!/usr/bin/env python
from __future__ import print_function
import fileinput

def combine(source):
    # Seed with the first line; empty input yields nothing.
    try:
        prev = next(source)
    except StopIteration:
        return
    count = 1
    for line in source:
        if line == prev:
            count += 1
        else:
            yield count, prev
            count, prev = 1, line
    yield count, prev

for count, text in combine(fileinput.input()):
    # Lines keep their trailing newline, so suppress print's own.
    print(count, text, end='')
Simple, and extremely fast compared to a bash loop.
Since this reads from stdin and writes to stdout, you can use it as a simple command in a pipeline.
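End to end, assuming the script is saved as combine.py (the filename is illustrative), it slots into a pipeline much like uniq -c:

```shell
# Write the generator script to a file (combine.py is an arbitrary name)
cat > combine.py <<'EOF'
import fileinput

def combine(source):
    try:
        prev = next(source)
    except StopIteration:
        return
    count = 1
    for line in source:
        if line == prev:
            count += 1
        else:
            yield count, prev
            count, prev = 1, line
    yield count, prev

for count, text in combine(fileinput.input()):
    print(count, text, end='')
EOF

# Feed it some sample lines on stdin
printf '%s\n' 192.168.0.100 192.168.0.100 192.168.0.102 |
python3 combine.py
# prints:
# 2 192.168.0.100
# 1 192.168.0.102
```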