I wrote a script that logs MAC addresses from pcapy into MySQL through SQLAlchemy. I initially used straight sqlite3, but soon realized that something better was required, so this past weekend I rewrote all the database code to use SQLAlchemy. All works fine; data goes in and comes out again. I thought sessionmaker() would be very useful to manage all the sessions to the DB for me.
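For reference, a minimal sketch of the kind of sessionmaker() setup I mean (the connection string, model, and column names here are placeholders, not my exact code):

from sqlalchemy import create_engine, Column, Integer, String, Float
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Client(Base):
    # hypothetical model; the real table stores a MAC, a timestamp and an SSID
    __tablename__ = 'clients'
    id = Column(Integer, primary_key=True)
    mac_addr = Column(String(17))
    timestamp = Column(Float)
    ssid = Column(String(64))

engine = create_engine('mysql://user:password@localhost/sniffer')  # placeholder DSN
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)  # one Session factory for the whole script
session = Session()                  # sessions are created from it as needed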
I see a strange occurrence with regard to memory consumption. I start the script... it collects and writes everything to the DB... but every 2-4 seconds memory consumption increases by about a megabyte. At the moment I'm talking about very few records, sub-100 rows.
Script Sequence:
1. Sniff a packet and extract its MAC address.
2. Check whether the MAC is already in maclist[] (populated from the DB).
3. If true? Only write the timestamp to the timestamp column where mac = newmac. Back to step 2.
If false? Write the new MAC to the DB, clear maclist[], and call step 2 again.
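A rough sketch of that sequence, reusing the hypothetical Client model and session from the first sketch (function and column names are illustrative, not my exact code):

maclist = []  # step 2's cache of MACs already seen, loaded from the DB

def handle_packet_mac(newmac, stamp, session):
    if newmac in maclist:
        # true branch: only update the timestamp for the existing row
        session.query(Client).filter_by(mac_addr=newmac).update({'timestamp': stamp})
    else:
        # false branch: insert the new MAC, then clear and repopulate maclist from the DB
        session.add(Client(mac_addr=newmac, timestamp=stamp))
        session.flush()
        maclist[:] = [row.mac_addr for row in session.query(Client.mac_addr)]
    session.commit()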
After 1h30m I have a memory footprint of 1027 MB (RES) and 1198 MB (VIRT), with 124 rows in the single-table MySQL database.
Q: Could this be attributed to maclist[] being cleared and repopulated from the DB every time?
Q: What's going to happen when it reaches the system's maximum memory?
Any ideas or advice would be great, thanks.
memory_profiler output for the segment in question, where the list gets populated from the database mac_addr column:
Line #    Mem usage    Increment   Line Contents
================================================
   123  1025.434 MiB    0.000 MiB   @profile
   124                              def sniffmgmt(p):
   125                                  global __mac_reel
   126                                  global _blacklist
   127  1025.434 MiB    0.000 MiB       stamgmtstypes = (0, 2, 4)
   128  1025.434 MiB    0.000 MiB       tmplist = []
   129  1025.434 MiB    0.000 MiB       matching = []
   130  1025.434 MiB    0.000 MiB       observedclients = []
   131  1025.434 MiB    0.000 MiB       tmplist = populate_observed_list()
   132  1025.477 MiB    0.043 MiB       for i in tmplist:
   133  1025.477 MiB    0.000 MiB           observedclients.append(i[0])
   134  1025.477 MiB    0.000 MiB       _mac_address = str(p.addr2)
   135  1025.477 MiB    0.000 MiB       if p.haslayer(Dot11):
   136  1025.477 MiB    0.000 MiB           if p.type == 0 and p.subtype in stamgmtstypes:
   137  1024.309 MiB   -1.168 MiB               _timestamp = atimer()
   138  1024.309 MiB    0.000 MiB               if p.info == "":
   139  1021.520 MiB   -2.789 MiB                   _SSID = "hidden"
   140                                          else:
   141  1024.309 MiB    2.789 MiB                   _SSID = p.info
   142
   143  1024.309 MiB    0.000 MiB               if p.addr2 not in observedclients:
   144  1018.184 MiB   -6.125 MiB                   db_add(_mac_address, _timestamp, _SSID)
   145  1018.184 MiB    0.000 MiB                   greetings()
   146                                          else:
   147  1024.309 MiB    6.125 MiB                   add_time(_mac_address, _timestamp)
   148  1024.309 MiB    0.000 MiB                   observedclients = []  # clear the list
   149  1024.309 MiB    0.000 MiB                   observedclients = populate_observed_list()  # repopulate the list
   150  1024.309 MiB    0.000 MiB                   greetings()
You will see observedclients is the list in question.
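(For anyone wanting to reproduce this kind of trace: it comes from the memory_profiler package. A minimal invocation, assuming the package is installed, looks like this; the table above is printed each time the decorated function returns.)

from memory_profiler import profile

@profile
def sniffmgmt(p):
    pass  # the packet handler shown in the trace above

if __name__ == '__main__':
    sniffmgmt(None)  # run the script as usual; the per-line table is printed on return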
I managed to find the actual cause of the memory consumption: it was scapy itself. By default, scapy is set to store all packets it captures, but you can disable that.
Disable:
sniff(iface=interface, prn=sniffmgmt, store=0)
Enable:
sniff(iface=interface, prn=sniffmgmt, store=1)
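Put together, a minimal capture loop with storing disabled would look something like this (the interface name is a placeholder, and the handler is reduced to a stub):

from scapy.all import sniff, Dot11

def sniffmgmt(p):
    # per-packet processing; with store=0, scapy discards the packet afterwards
    if p.haslayer(Dot11):
        print(p.addr2)

sniff(iface='wlan0mon', prn=sniffmgmt, store=0)  # store=0 stops scapy retaining every packet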
Thanks to the BitBucket ticket.
As you can see, the profiler output suggests you use less memory by the end, so this is not representative of your situation.
Some directions to dig deeper:
1) add_time: why is it increasing memory usage?
2) db_add: why is it decreasing memory usage? Caching? Closing/opening the DB connection? What happens in case of failure?
3) populate_observed_list: is the return value safe for garbage collection? Maybe there are some packets for which a certain exception occurs?
Also, what happens if you sniff more packets than your code is able to process due to performance?
I would profile these 3 functions and analyze possible exceptions/failures.
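For instance, you could put memory_profiler's @profile on those three functions and compare the per-call tables (the signatures below are inferred from the calls in the trace; the bodies are stubs, not your real code):

from memory_profiler import profile

@profile
def db_add(mac_address, timestamp, ssid):
    pass  # existing insert logic goes here

@profile
def add_time(mac_address, timestamp):
    pass  # existing timestamp-update logic goes here

@profile
def populate_observed_list():
    return []  # existing SELECT of known MACs goes here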