Using the "re" module, I compile a pattern for the handshake data like this:
piece_request_handshake = re.compile('13426974546f7272656e742070726f746f636f6c(?P<reserved>\w{16})(?P<info_hash>\w{40})(?P<peer_id>\w{40})')
handshake = piece_request_handshake.findall(hex_data)
Then I print it.
I can't attach an image because I'm new, so here is the output:
root@debian:/home/florian/Téléchargements# python script.py
[('0000000000100005', '606d4759c464c8fd0d4a5d8fc7a223ed70d31d7b', '2d5452323532302d746d6e6a657a307a6d687932')]
My question is: how can I extract only the second element of this data, that is, the "info_hash" (the "606d47...")?
I already tried using the named group with the following line:
print handshake.group('info_hash')
But the result is an error (sorry, again I can't post a screenshot):
root@debian:/home/florian/Téléchargements# python script.py
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "script.py", line 122, in run
    self.p.dispatch(0, PieceRequestSniffer.cb)
  File "script.py", line 82, in cb
    print handshake.group('info_hash')
AttributeError: 'list' object has no attribute 'group'
This is the start of my full code for the curious:
import pcapy
import dpkt
from threading import Thread
import re
import binascii
import socket
import time
liste=[]
prefix = '13426974546f7272656e742070726f746f636f6c'
hash_code = re.compile('%s(?P<reserved>\w{16})(?P<info_hash>\w{40})(?P<peer_id>\w{40})' % prefix)
match = hash_code.match()
piece_request_handshake = re.compile('13426974546f7272656e742070726f746f636f6c(?P<aaa>\w{16})(?P<bbb>\w{40})(?P<ccc>\w{40})')
piece_request_tcpclose = re.compile('(?P<start>\w{12})5011')
#-----------------------------------------------------------------INIT------------------------------------------------------------
class PieceRequestSniffer(Thread):
    def __init__(self, dev='eth0'):
        Thread.__init__(self)
        self.expr = 'udp or tcp'
        self.maxlen = 65535 # max size of packet to capture
        self.promiscuous = 1 # promiscuous mode?
        self.read_timeout = 100 # in milliseconds
        self.max_pkts = -1 # number of packets to capture; -1 => no limit
        self.active = True
        self.p = pcapy.open_live(dev, self.maxlen, self.promiscuous, self.read_timeout)
        self.p.setfilter(self.expr)

    @staticmethod
    def cb(hdr, data):
        eth = dpkt.ethernet.Ethernet(str(data))
        ip = eth.data
        #------------------------------------------------------IPV4 AND TCP PACKETS ONLY---------------------------------------------------
        #Select Ipv4 packets because of problem with the .p in Ipv6
        if eth.type == dpkt.ethernet.ETH_TYPE_IP6:
            return
        else:
            #Select only TCP protocols
            if ip.p == dpkt.ip.IP_PROTO_TCP:
                tcp = ip.data
                src_ip = socket.inet_ntoa(ip.src)
                dst_ip = socket.inet_ntoa(ip.dst)
                fin_flag = ( tcp.flags & dpkt.tcp.TH_FIN ) != 0
                #if fin_flag:
                    #print "TH_FIN src:%s dst:%s" % (src_ip,dst_ip)
                try:
                    #Return hexadecimal representation
                    hex_data = binascii.hexlify(tcp.data)
                except:
                    return
                #-----------------------------------------------------------HANDSHAKE-------------------------------------------------------------
                handshake = piece_request_handshake.findall(hex_data)
                if handshake and (src_ip+" "+dst_ip) not in liste and (dst_ip+" "+src_ip) not in liste and handshake != '':
                    liste.append(src_ip+" "+dst_ip)
                    print match.group('info_hash')
re.findall() returns a list, not a match object, which is why calling .group() on its result raises the AttributeError. When the pattern contains several groups, each list item is a tuple of the strings matched by those groups, in order. This example (using a simplified pattern) shows that you can access the required item by indexing:
import re
prefix = 'prefix'
pattern = re.compile('%s(?P<reserved>\w{4})(?P<info_hash>\w{10})(?P<peer_id>\w{10})' % prefix)
handshake = 'prefix12341234567890ABCDEF1234' # sniffed data
match = pattern.findall(handshake)
>>> print match
[('1234', '1234567890', 'ABCDEF1234')]
>>> info_hash = match[0][1]
>>> print info_hash
1234567890
But the point of named groups is to let you access the matched values by name. You can use re.match() instead, which returns a match object:
import re
prefix = 'prefix'
pattern = re.compile('%s(?P<reserved>\w{4})(?P<info_hash>\w{10})(?P<peer_id>\w{10})' % prefix)
handshake = 'prefix12341234567890ABCDEF1234' # sniffed data
match = pattern.match(handshake)
>>> print match
<_sre.SRE_Match object at 0x7fc201efe918>
>>> print match.group('reserved')
1234
>>> print match.group('info_hash')
1234567890
>>> print match.group('peer_id')
ABCDEF1234
The values are also available using dictionary access:
>>> d = match.groupdict()
>>> d
{'peer_id': 'ABCDEF1234', 'reserved': '1234', 'info_hash': '1234567890'}
>>> d['info_hash']
'1234567890'
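One caveat worth noting: match() only matches at the very start of the string and returns None otherwise, whereas search() scans the whole string for the first occurrence. The sketch below illustrates the difference using the same simplified pattern; the input string with leading junk is made up for illustration, and the None check is there because calling .group() on None would raise an AttributeError of its own.

```python
import re

pattern = re.compile(
    r'prefix(?P<reserved>\w{4})(?P<info_hash>\w{10})(?P<peer_id>\w{10})')

# Hypothetical input where the handshake is preceded by other bytes.
data = 'blahprefix12341234567890ABCDEF1234'

anchored = pattern.match(data)   # None: the string does not start with 'prefix'
found = pattern.search(data)     # match object: search() scans the whole string

# Guard against None before calling .group().
info_hash = found.group('info_hash') if found else None
print(anchored)
print(info_hash)
```

If the handshake is always at the start of the TCP payload, match() is enough; search() is the safer choice when that is not guaranteed.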
Finally, if there are multiple handshake sequences in the input data, you can use re.finditer():
import re
prefix = 'prefix'
pattern = re.compile('%s(?P<reserved>\w{4})(?P<info_hash>\w{10})(?P<peer_id>\w{10})' % prefix)
handshake = 'blahprefix12341234567890ABCDEF1234|randomjunkprefix12349876543210ABCDEF1234,more random junkprefix1234hellothereABCDEF1234...' # sniffed data
for match in pattern.finditer(handshake):
print match.group('info_hash')
Output:
1234567890
9876543210
hellothere
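Applied to the real handshake pattern from the question, this could look roughly like the following. It is only a sketch: the helper name extract_info_hash is made up for illustration, and the sample string is rebuilt from the values shown in the question's output rather than taken from a real capture.

```python
import re

# Hex encoding of the byte 0x13 followed by 'BitTorrent protocol'.
PREFIX = '13426974546f7272656e742070726f746f636f6c'

handshake_re = re.compile(
    PREFIX + r'(?P<reserved>\w{16})(?P<info_hash>\w{40})(?P<peer_id>\w{40})')


def extract_info_hash(hex_data):
    """Return the info_hash of the first handshake in hex_data, or None.

    Hypothetical helper, not part of the original script.
    """
    m = handshake_re.search(hex_data)
    return m.group('info_hash') if m else None


# Sample rebuilt from the tuple printed in the question.
sample = (PREFIX
          + '0000000000100005'
          + '606d4759c464c8fd0d4a5d8fc7a223ed70d31d7b'
          + '2d5452323532302d746d6e6a657a307a6d687932')

print(extract_info_hash(sample))      # the 40-character info_hash
print(extract_info_hash('deadbeef'))  # None: no handshake in this data
```

In the sniffer callback, the result of a call like this on hex_data would replace the findall() result, and the packet can simply be skipped when it is None.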