I am using "import re" and "import sys".
In the terminal, when I type "1.py a.txt", I want the script to read "a.txt", which has the following content:
17:18:42.525964 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 1:1449, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 1448
17:18:42.526623 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 1449:2897, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 1448
17:18:42.526900 IP 192.168.0.15.34436 > 66.185.85.146.80: Flags [.], ack 2897, win 1444, options [nop,nop,TS val 43002448 ecr 3551057710], length 0
17:18:42.527694 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 2897:14481, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 11584
17:18:42.527716 IP 192.168.0.15.34436 > 66.185.85.146.80: Flags [.], ack 14481, win 1444, options [nop,nop,TS val 43002448 ecr 3551057710], length 0
17:18:42.528794 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 14481:23169, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 8688
17:18:42.528813 IP 192.168.0.15.34436 > 66.185.85.146.80: Flags [.], ack 23169, win 1444, options [nop,nop,TS val 43002448 ecr 3551057710], length 0
17:18:42.545191 IP 192.168.0.15.60030 > 52.2.63.29.80: Flags [.], seq 4113773418:4113774866, ack 850072640, win 270, options [nop,nop,TS val 43002452 ecr 9849626], length 1448
Then I want to use a regex to remove everything except the IP addresses and the length (total), and print the result as:
source: 66.185.85.146 dest: 192.168.0.15 total:1448
source: 66.185.85.146 dest: 192.168.0.15 total:1448
source: 192.168.0.15 dest: 66.185.85.146 total:0
If there are duplicates, their totals should be added together, so the output reads as follows:
source: 66.185.85.146 dest: 192.168.0.15 total:2896
source: 192.168.0.15 dest: 66.185.85.146 total:0
Furthermore, if I type "-s" in the terminal like so:
"1.py -s a.txt"
or
"1.py a.txt -s 192.168.0.15"
it should sort: with a bare -s it should sort and print the content, and with -s followed by an IP it should sort the IPs.
This is what I currently have for each item. I want to know how to use them all together.
#!/usr/bin/python3
import re
import sys
file = sys.argv[1]
a = open(file, "r")
for line in a:
    line = line.rstrip()
    c = re.findall(r'^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$',line) #Yes I know its not the best regex for this, but I am testing it out for now
    d = re.findall(r'\b(\d+)$\b',line)
    if len(c) > 0 and len(d) > 0:
        print("source:", c[0],"\t","dest:",c[1],"\t", "total:",d[0])
That is what I have so far. I do not know how to handle the "-s" option, how to sort, or how to remove the duplicates and add up their totals.
What you need is ArgumentParser for your -s parameter, so something like:
import argparse
...
def main():
    parser = argparse.ArgumentParser()
    # Positional file name, so that "1.py a.txt -s ..." is accepted.
    parser.add_argument('file', help='file to read')
    parser.add_argument('-s', '--sort', action='append',
                        help='sort specific IP')
    parser.add_argument('-s2', '--sortall', action='store_true',
                        help='sort all the IPs')
    args = parser.parse_args()
    if args.sortall:
        pass  # store all IPs
    for ip in args.sort or []:  # args.sort is None when -s is not given
        pass  # store by ip
if __name__ == '__main__':
    main()
Now you can use the script like:
1.py a.txt -s 192.168.0.15
or
1.py a.txt -s2
Apart from that, how to put it all together looks like homework, so you should read more about Python to figure it out.
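The question asks for a single -s that works both bare ("1.py -s a.txt") and with an IP ("1.py a.txt -s 192.168.0.15"). One possible way to approximate that with argparse is nargs='?' plus a small fix-up, sketched below. The helper name parse_cli and the fix-up step are assumptions made for illustration, not something from the answer above; the fix-up is there because argparse hands the word after a bare -s to the option itself.

```python
import argparse


def parse_cli(argv):
    """Hypothetical helper: return (file, sort) for the two call styles.

    sort is False (no -s), True (bare -s) or a string (the IP after -s).
    """
    parser = argparse.ArgumentParser()
    # file is optional for argparse so the fix-up below can fill it in.
    parser.add_argument('file', nargs='?', help='file to read')
    parser.add_argument('-s', '--sort', nargs='?', const=True, default=False,
                        help='sort the output, optionally by an IP')
    args = parser.parse_args(argv)
    file, sort = args.file, args.sort
    # "-s a.txt": argparse treats a.txt as the value of -s, so move it back.
    if file is None and isinstance(sort, str):
        file, sort = sort, True
    if file is None:
        parser.error('a file name is required')
    return file, sort


if __name__ == '__main__':
    print(parse_cli(['-s', 'a.txt']))
    print(parse_cli(['a.txt', '-s', '192.168.0.15']))
```

The trade-off is that "1.py -s 192.168.0.15" with no file is read as a file named 192.168.0.15, so two separate flags, as in the answer above, remain the less ambiguous design.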
To read the -s option you probably want a library to parse the arguments, like the standard argparse. It allows you to specify which arguments your script requires, along with their descriptions, and it parses them and checks their format.
To sort a list there's the sorted(my_list) function.
Finally, to ensure there are no duplicates you can use a set. This loses the list ordering, but since you are sorting it later it shouldn't be a problem.
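As a small illustration of set plus sorted, here is a sketch using made-up (source, dest) pairs. The sample data and the ip_key helper are assumptions for the example. Note that sorting IP strings directly compares them character by character, so the sketch uses the standard ipaddress module as the sort key to get numeric order.

```python
import ipaddress

# Hypothetical sample data: (source, dest) pairs, one of them repeated.
pairs = [
    ("66.185.85.146", "192.168.0.15"),
    ("192.168.0.15", "66.185.85.146"),
    ("66.185.85.146", "192.168.0.15"),  # duplicate of the first pair
    ("9.9.9.9", "192.168.0.15"),
]

# A set keeps one copy of each pair but forgets the original order.
unique_pairs = set(pairs)


def ip_key(pair):
    """Sort key that compares the addresses numerically, not as text."""
    return tuple(ipaddress.ip_address(ip) for ip in pair)


ordered = sorted(unique_pairs, key=ip_key)
for source, dest in ordered:
    print("source:", source, "dest:", dest)
```

A set only drops the repeated pairs; it does not add up their lengths.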
Alternatively, there's the Counter collection, made specifically to add up grouped values and sort them.
import re
import sys
from collections import Counter

results = Counter()
with open(sys.argv[1]) as a:
    for line in a:
        line = line.rstrip()
        # The ^...$ anchored pattern only matches a line that is nothing but an IP,
        # so it is unanchored here. The trailing \.[0-9]+ skips the port that
        # tcpdump appends to each address.
        c = re.findall(r'\b((?:[0-9]{1,3}\.){3}[0-9]{1,3})\.[0-9]+\b', line)
        d = re.findall(r'length (\d+)$', line)
        if len(c) >= 2 and d:
            source, destination, length = c[0], c[1], d[0]
            results[(source, destination)] += int(length)
# Print the items sorted by total, largest first.
for (source, destination), length in results.most_common():
    print("source:", source, "\t", "dest:", destination, "\t", "total:", length)
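To show how the pieces could fit together, here is a minimal sketch that parses the sample lines from the question, adds up the lengths per (source, dest) pair with Counter, and optionally sorts by total. The names LINE_RE, total_lengths and report are hypothetical, and the pattern assumes every line looks like the tcpdump output in the question. What "-s" followed by an IP should do is not fully specified in the question, so that case is left out.

```python
import re
from collections import Counter

# Assumed line shape: "... IP <src>.<port> > <dst>.<port>: ... length <n>"
LINE_RE = re.compile(
    r'IP (?P<src>(?:\d{1,3}\.){3}\d{1,3})\.\d+ > '
    r'(?P<dst>(?:\d{1,3}\.){3}\d{1,3})\.\d+:.*length (?P<length>\d+)$'
)


def total_lengths(lines):
    """Add up the length field for every (source, dest) pair."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.search(line.rstrip())
        if match:
            totals[(match['src'], match['dst'])] += int(match['length'])
    return totals


def report(totals, sort=False):
    """Format one line per pair; sort=True orders by total, largest first."""
    items = totals.most_common() if sort else list(totals.items())
    return [f"source: {src} dest: {dst} total:{n}" for (src, dst), n in items]


# The first three lines of a.txt from the question.
sample = [
    "17:18:42.525964 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 1:1449, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 1448",
    "17:18:42.526623 IP 66.185.85.146.80 > 192.168.0.15.34436: Flags [.], seq 1449:2897, ack 2555, win 1320, options [nop,nop,TS val 3551057710 ecr 43002332], length 1448",
    "17:18:42.526900 IP 192.168.0.15.34436 > 66.185.85.146.80: Flags [.], ack 2897, win 1444, options [nop,nop,TS val 43002448 ecr 3551057710], length 0",
]

if __name__ == '__main__':
    for row in report(total_lengths(sample), sort=True):
        print(row)
```

In a real script, total_lengths would be given the open file instead of the sample list, and the sort flag would come from the parsed command-line arguments.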