This is my first question on Stack Overflow, and I am really looking forward to being part of this community. I am new to programming, and Python was the first language most people recommended.
I have a log file which looks like this:
"No.","Time","Source","Destination","Protocol","Info"
"1","0.000000","120.107.103.180","172.16.112.50","TELNET","Telnet Data ..."
"2","0.000426","172.16.112.50","172.16.113.168","TELNET","Telnet Data ..."
"3","0.019849","172.16.113.168","172.16.112.50","TCP","21582 > telnet [ACK]"
"4","0.530125","172.16.113.168","172.16.112.50","TELNET","Telnet Data ..."
"5","0.530634","172.16.112.50","172.16.113.168","TELNET","Telnet Data ..."
I want to parse the log file with Python and produce output like this:
From IP 135.13.216.191 Protocol Count: (IMF 1) (SMTP 38) (TCP 24) (Total: 63)
I would really like some help on which approach to take. Should I use lists and loop through them, or dictionaries/tuples?
Thanks in advance for your help!
You can parse the file using the csv module:
import csv
with open('logfile.txt') as logfile:
    reader = csv.reader(logfile)
    next(reader)  # skip the header row
    for row in reader:
        no, time, source, dest, protocol, info = row
        # do stuff with these
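As a rough sketch of what "do stuff with these" could look like, assuming you only care about the source and protocol columns: the snippet below counts each (source, protocol) pair with a `Counter`. The sample rows are the ones from the question, `io.StringIO` stands in for the real file, and `count_pairs` is just a name made up for this illustration.

```python
import csv
import io
from collections import Counter

# Sample rows copied from the question; StringIO stands in for open('logfile.txt').
SAMPLE = '''\
"No.","Time","Source","Destination","Protocol","Info"
"1","0.000000","120.107.103.180","172.16.112.50","TELNET","Telnet Data ..."
"2","0.000426","172.16.112.50","172.16.113.168","TELNET","Telnet Data ..."
"3","0.019849","172.16.113.168","172.16.112.50","TCP","21582 > telnet [ACK]"
"4","0.530125","172.16.113.168","172.16.112.50","TELNET","Telnet Data ..."
"5","0.530634","172.16.112.50","172.16.113.168","TELNET","Telnet Data ..."
'''

def count_pairs(logfile):
    """Count how often each (source, protocol) pair occurs (hypothetical helper)."""
    reader = csv.reader(logfile)
    next(reader)  # skip the header row
    pairs = Counter()
    for no, time, source, dest, protocol, info in reader:
        pairs[(source, protocol)] += 1
    return pairs

pairs = count_pairs(io.StringIO(SAMPLE))
for (source, protocol), n in sorted(pairs.items()):
    print(source, protocol, n)
```

A flat `Counter` keyed by tuples is enough if you only need the raw counts; grouping per source IP is easier with a nested dictionary.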
I can't quite tell what you're asking, but I think you want:
import csv
from collections import defaultdict
# A dictionary whose values are by default (a
# dictionary whose values are by default 0)
bySource = defaultdict(lambda: defaultdict(lambda: 0))
with open('logfile.txt', newline='') as logfile:
    for row in csv.DictReader(logfile):
        bySource[row["Source"]][row["Protocol"]] += 1
for source, protocols in bySource.items():
    # 'Total' is added last, so it is printed after the protocol counts
    protocols['Total'] = sum(protocols.values())
    print("From IP %s Protocol Count: %s" % (
        source,
        ' '.join("(%s: %d)" % item for item in protocols.items())
    ))
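If you want the output to match the format in the question exactly, something like the following might work. It is only a sketch: it assumes Python 3, assumes protocols should be listed alphabetically (as they appear to be in the example line), and `protocol_report` is a made-up name. The sample rows from the question are fed in through `io.StringIO` instead of a real file.

```python
import csv
import io
from collections import Counter, defaultdict

# Sample rows copied from the question; StringIO stands in for the log file.
SAMPLE = '''\
"No.","Time","Source","Destination","Protocol","Info"
"1","0.000000","120.107.103.180","172.16.112.50","TELNET","Telnet Data ..."
"2","0.000426","172.16.112.50","172.16.113.168","TELNET","Telnet Data ..."
"3","0.019849","172.16.113.168","172.16.112.50","TCP","21582 > telnet [ACK]"
"4","0.530125","172.16.113.168","172.16.112.50","TELNET","Telnet Data ..."
"5","0.530634","172.16.112.50","172.16.113.168","TELNET","Telnet Data ..."
'''

def protocol_report(logfile):
    """Return one summary line per source IP (hypothetical helper)."""
    by_source = defaultdict(Counter)
    for row in csv.DictReader(logfile):
        by_source[row["Source"]][row["Protocol"]] += 1

    lines = []
    for source in sorted(by_source):
        protocols = by_source[source]
        # Protocol order is an assumption: alphabetical by name.
        counts = " ".join(
            "(%s %d)" % (name, protocols[name]) for name in sorted(protocols)
        )
        total = sum(protocols.values())
        lines.append(
            "From IP %s Protocol Count: %s (Total: %d)" % (source, counts, total)
        )
    return lines

report = protocol_report(io.StringIO(SAMPLE))
for line in report:
    print(line)
```

Keeping the total out of the dictionary avoids mixing a synthetic 'Total' key in with real protocol names.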
I would begin by first reading the file into a list:
contents = []
with open("file_path") as f:
    contents = f.readlines()
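Then you can split each line into a list of its own:
# strip the trailing newline before removing the outer quotes;
# contents[1:] skips the header row
ips = [l.strip()[1:-1].split('","') for l in contents[1:]]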
We can then map these into a dict:
sourceIps = {}
for ip in ips:
    try:
        sourceIps[ip[2]].append(ip)
    except KeyError:
        # first row seen for this source IP
        sourceIps[ip[2]] = [ip]
And finally print out the result:
for ip, stuff in sourceIps.items():
    print("From {0} ... ".format(ip, ...))
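To fill in the elided part of that last step, one option is to count the protocol column of each group. This is a sketch under a few assumptions: Python 3, the protocol is always field index 4, no field contains the sequence `","` (the naive split would break on it), and `group_by_source` and `summarize` are names invented here. The sample lines from the question replace the `readlines()` result.

```python
from collections import Counter

# Sample rows copied from the question, split the way readlines() would return them.
SAMPLE = '''\
"No.","Time","Source","Destination","Protocol","Info"
"1","0.000000","120.107.103.180","172.16.112.50","TELNET","Telnet Data ..."
"2","0.000426","172.16.112.50","172.16.113.168","TELNET","Telnet Data ..."
"3","0.019849","172.16.113.168","172.16.112.50","TCP","21582 > telnet [ACK]"
"4","0.530125","172.16.113.168","172.16.112.50","TELNET","Telnet Data ..."
"5","0.530634","172.16.112.50","172.16.113.168","TELNET","Telnet Data ..."
'''
LINES = SAMPLE.splitlines(keepends=True)

def group_by_source(lines):
    """Group the split rows by source IP, skipping the header line."""
    groups = {}
    for line in lines[1:]:
        fields = line.strip()[1:-1].split('","')
        groups.setdefault(fields[2], []).append(fields)
    return groups

def summarize(rows):
    """Build the '(PROTO n) ... (Total: n)' part for one source IP."""
    counts = Counter(row[4] for row in rows)
    parts = ["(%s %d)" % (proto, counts[proto]) for proto in sorted(counts)]
    parts.append("(Total: %d)" % len(rows))
    return " ".join(parts)

groups = group_by_source(LINES)
for ip, rows in sorted(groups.items()):
    print("From IP {0} Protocol Count: {1}".format(ip, summarize(rows)))
```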