Does anyone know how to write a live data sniffer in Python which extracts the originating IP address and the full URL that was being accessed? I have looked at pulling data from urlsnarf; however, IPv6 is not supported (and the connections will be to IPv6 hosts).
While I can pull data from tcpdump and grep for GET/POST, that would leave me with just the path on the web server, and I would not obtain the associated FQDN. Unfortunately, using Squid with IPv6 TPROXY is not an option due to the configuration of the environment.
Does anyone have any ideas on how to do this with Python bindings for libpcap? Your help would be most appreciated :)
Thanks :)
Unfortunately, with IPv6 you are stuck doing your own TCP reassembly. The good news is that you are only concerned with URL data, which should (generally) be in one or two packets.
You should be able to get away with using pylibpcap to do this. You'll want to use setfilter on your pcap object to make sure you are only looking at TCP traffic. As you move forward in your pcap loop, you'll apply some HTTP regular expressions to the payload. If you have what looks like HTTP traffic, go ahead and try to parse the header to get at the URL data: the request line gives you the path, and the Host header gives you the FQDN. Hopefully, you'll get the full URL with a line break before the end of the packet. If not, you are going to have to do some lightweight TCP reassembly.
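To make the header parsing step concrete, here is a rough sketch of what it could look like once you have the TCP payload as bytes. The function name extract_url and the two regular expressions are my own illustration, not part of pylibpcap, and it assumes plain HTTP/1.x on the wire (no TLS) and Python 3. It returns None when the request line or a complete Host line is not in the segment, which is the case where you would fall back to reassembly.

```python
import re

# Request line such as b"GET /index.html HTTP/1.1\r\n" at the start of the payload.
REQUEST_LINE = re.compile(
    rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS|PATCH) (\S+) HTTP/1\.[01]\r?\n"
)
# A complete Host header line; the trailing line break is required so that a
# Host value cut off at the end of the packet is not mistaken for the full name.
HOST_HEADER = re.compile(rb"^Host:[ \t]*(\S+)[ \t]*\r?\n", re.IGNORECASE | re.MULTILINE)


def extract_url(payload):
    """Return (method, url) for an HTTP request payload, or None.

    None means either this is not an HTTP request, or the Host header is
    not fully contained in this segment.
    """
    m = REQUEST_LINE.match(payload)
    if m is None:
        return None
    method = m.group(1).decode("ascii")
    target = m.group(2).decode("latin-1")

    # Restrict the Host search to the header block, not the request body.
    end = payload.find(b"\r\n\r\n")
    head = payload if end == -1 else payload[:end + 2]

    host = HOST_HEADER.search(head)
    if host is None:
        return None
    # Proxy-style requests already carry an absolute URL in the request line.
    if target.startswith("http://"):
        return method, target
    return method, "http://" + host.group(1).decode("latin-1") + target
```

You would call this from your pcap callback on each TCP payload and log the result together with the source address whenever it is not None.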
Oh, and you'll want to use socket.inet_ntop and socket.getaddrinfo to print out info about the IPv6 host.
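For the address part, something along these lines should work. It is only a sketch: ipv6_endpoints is a made-up helper name, and it assumes Python 3 and that you have already stripped the link-layer header (14 bytes for plain Ethernet) so the buffer starts at the fixed 40-byte IPv6 header. It does not walk extension headers.

```python
import socket

IPV6_HEADER_LEN = 40  # size of the fixed IPv6 header
TCP_PROTO = 6         # Next Header value for TCP


def ipv6_endpoints(packet):
    """Return (src, dst, next_header) from a raw IPv6 packet, or None.

    `packet` must start at the IPv6 header (link-layer header removed).
    """
    # The top four bits of the first byte hold the IP version.
    if len(packet) < IPV6_HEADER_LEN or packet[0] >> 4 != 6:
        return None
    next_header = packet[6]
    # Source address is bytes 8..23, destination is bytes 24..39.
    src = socket.inet_ntop(socket.AF_INET6, packet[8:24])
    dst = socket.inet_ntop(socket.AF_INET6, packet[24:40])
    return src, dst, next_header
```

If next_header equals TCP_PROTO, the TCP header starts right after the 40 bytes; otherwise there are extension headers in between that you would need to skip first.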