
How do I print out the count of unique matches with grep?

Let's say I have millions of packets to look through, and I want to see how many times a packet was sent to a certain port number.

Here are some of the packets:

10:27:46.227407 IP 85.130.236.26.54156 > 139.91.133.120.60679: tcp 0
10:27:46.337038 IP 211.142.173.14.80 > 139.91.138.125.56163: tcp 0
10:27:46.511241 IP 211.49.224.217.3389 > 139.91.131.47.6973: tcp 0

I want to look through the 2nd port number here so:

60679, 56163, 6973, etc.

So I can use:

grep -c '\.80:' output.txt

to count all the times port 80 was used. But is there a way to display all the ports that were used and how many times each one was found in this file? Something like this, preferably sorted too, so I can see which ports were used most often:

.80: - 54513
.110: - 12334
.445: - 412
Dragonfly asked Apr 24 '12 15:04



1 Answer

See uniq -c. You'll want to pull out the bit you want, sort the result, pipe it through uniq -c, and then sort the counts. Something like this, maybe:

grep -E '\.[0-9]+:' output.txt | sort | uniq -c | sort -nr

Clarification: I've used grep here because it's not clear what your output.txt format looks like, but you'll want to actually cut out the port number bit, perhaps via cut or awk.
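
One way to pull out only the port bit without leaving grep is the -o option, which prints just the matched text instead of the whole line. The sketch below assumes a grep that supports -o (GNU and BSD grep do; it is not in POSIX) and that only the destination port is followed by a colon, as in the sample lines. The function name count_ports is made up for illustration.

```shell
# Hypothetical helper: count occurrences of each ".PORT:" token in a file.
# Usage: count_ports output.txt
count_ports() {
    # -o emits only the matched ".PORT:" text, one match per line,
    # so uniq -c counts ports rather than whole packet lines.
    grep -oE '\.[0-9]+:' "$1" | sort | uniq -c | sort -nr
}
```

Each output line is a count followed by the matched token, with the most frequent port first.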

Edit: To get the port, you can cut once on a period and then again on a colon:

cut -d. -f10 < output.txt | cut -d: -f1

(Or any one of a dozen other ways to accomplish the same thing.) That will give you an unsorted list of ports. Then:

cut -d. -f10 < output.txt | cut -d: -f1 | sort | uniq -c | sort -nr
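
If you want the output in the ".80: - 54513" shape from the question, awk can do the extraction and counting in one pass. This is only a sketch: it assumes every line looks like the samples, so that the fifth whitespace-separated field is the destination address with the port after the last period. The function name port_report is made up for illustration.

```shell
# Hypothetical helper: print ".PORT: - COUNT" lines, most frequent port first.
# Usage: port_report output.txt
port_report() {
    awk '
        {
            # $5 is the destination, e.g. 139.91.133.120.60679:
            n = split($5, parts, ".")
            port = parts[n]          # text after the last period
            sub(/:$/, "", port)      # drop the trailing colon
            count[port]++
        }
        END {
            for (p in count) printf ".%s: - %d\n", p, count[p]
        }
    ' "$1" | sort -k3,3nr            # third field is the count
}
```

The final sort is needed because awk does not guarantee any order when looping over an array.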
Alex Howansky answered Oct 03 '22 19:10