How would you group a number of lines?

Tags:

bash

Let's say you have the following.

192.168.0.100
192.168.0.100
192.168.0.100
192.168.0.102
192.168.0.102
192.168.0.100

That should count as 3 hits: each run of consecutive identical IPs counts as a single hit, so the final 192.168.0.100 is a new hit even though that address appeared earlier. How would you loop through the file and count accordingly?

sdot257 asked Aug 03 '10 22:08


2 Answers

If your uniq is like mine and only collapses identical lines that are adjacent, just don't sort before running uniq:

File foo.txt:

192.168.0.100
192.168.0.100
192.168.0.100
192.168.0.102
192.168.0.102
192.168.0.100

And:

$ cat foo.txt | uniq -c

Edit: can I give myself a useless-use-of-cat award? uniq can read the file directly:

$ uniq -c foo.txt

Output:

  3 192.168.0.100
  2 192.168.0.102
  1 192.168.0.100
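
The per-run counts above do not directly give the total of 3 hits the question asks for; that total is just the number of runs. A minimal Python sketch of that count, assuming one address per line (the name `count_runs` is hypothetical and not part of the answer):

```python
def count_runs(lines):
    """Return the number of runs of consecutive identical lines."""
    runs = 0
    prev = None
    for line in lines:
        line = line.rstrip("\n")  # ignore the trailing newline, if any
        # A new run starts on the first line or whenever the value changes.
        if runs == 0 or line != prev:
            runs += 1
            prev = line
    return runs


# Sample data copied from the question.
sample = [
    "192.168.0.100",
    "192.168.0.100",
    "192.168.0.100",
    "192.168.0.102",
    "192.168.0.102",
    "192.168.0.100",
]
print(count_runs(sample))  # prints 3
```

In the shell, the same figure should come from counting the lines that uniq emits, for example by piping `uniq foo.txt` into `wc -l`.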
Wrikken answered Oct 09 '22 17:10


I would avoid using bash for this. Use a real language like Python, awk or even Perl.

Python

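#!/usr/bin/env python
from __future__ import print_function
import fileinput

def combine(source):
    """Yield (count, line) for each run of consecutive identical lines."""
    source = iter(source)
    prev = next(source, None)
    if prev is None:
        return  # empty input: nothing to yield
    count = 1
    for line in source:
        if line == prev:
            count += 1
        else:
            yield count, prev
            count, prev = 1, line
    yield count, prev

for count, text in combine(fileinput.input()):
    # Lines keep their trailing newline, so strip it before printing.
    print(count, text.rstrip("\n"))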

Simple, and usually much faster than a line-by-line loop written in bash.

Since this reads from stdin and writes to stdout, you can use it as a simple command in a pipeline.
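
The same grouping can probably be expressed more compactly with `itertools.groupby` from the standard library, which groups consecutive equal items. This is a sketch of an alternative, not the code from this answer; it uses an in-memory sample instead of stdin so that it runs on its own, and it also reports the number of runs, which is the hit count the question asks for.

```python
from itertools import groupby


def combine(lines):
    """Yield (count, text) for each run of consecutive identical lines."""
    stripped = (line.rstrip("\n") for line in lines)
    # groupby only merges adjacent equal items, which is the behavior wanted here.
    for text, run in groupby(stripped):
        yield sum(1 for _ in run), text


# Sample data copied from the question.
sample = [
    "192.168.0.100",
    "192.168.0.100",
    "192.168.0.100",
    "192.168.0.102",
    "192.168.0.102",
    "192.168.0.100",
]

result = list(combine(sample))
for count, text in result:
    print(count, text)
print(len(result), "hits")  # prints "3 hits"
```

To use it in a pipeline, `sample` could be replaced with `sys.stdin` or `fileinput.input()`.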

S.Lott answered Oct 09 '22 19:10