I'm trying to switch the threading in my code to multiprocessing to measure its performance, and hopefully achieve better brute-forcing potential, since my program is meant to brute-force password-protected .zip files. But whenever I try to run the program I get this:
BruteZIP2.py -z "Generic ZIP.zip" -f Worm.txt
Traceback (most recent call last):
  File "C:\Users\User\Documents\Jetbrains\PyCharm\BruteZIP\BruteZIP2.py", line 40, in <module>
    main(args.zip, args.file)
  File "C:\Users\User\Documents\Jetbrains\PyCharm\BruteZIP\BruteZIP2.py", line 34, in main
    p.start()
  File "C:\Users\User\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\User\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\User\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\User\AppData\Local\Programs\Python\Python37\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\User\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot serialize '_io.BufferedReader' object
I did find threads that had the same issue, but they were both unanswered/unsolved. I also tried inserting Pool above p.start(), as I believe this is caused by the fact that I am on a Windows-based machine, but it was no help. My code is as follows:
import argparse
from multiprocessing import Process
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack using either a word list, password list or a dictionary.", usage="BruteZIP.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()


def extract_zip(zip_file, password):
    try:
        zip_file.extractall(pwd=password)
        print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
    except:
        # If a password fails, it moves to the next password without notifying the user.
        # If all passwords fail, it will print nothing in the command prompt.
        print(f"Incorrect password: {password.decode('utf-8')}")
        # pass


def main(zip, file):
    if (zip == None) | (file == None):
        # If the args are not used, it displays how to use them to the user.
        print(parser.usage)
        exit(0)
    zip_file = zipfile.ZipFile(zip)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    for line in txt_file:
        password = line.strip()
        p = Process(target=extract_zip, args=(zip_file, password))
        p.start()
        p.join()


if __name__ == '__main__':
    # BruteZIP.py -z zip.zip -f file.txt.
    main(args.zip, args.file)
As I said before, I believe this is happening mainly because I am on a Windows-based machine. I shared my code with a few others who were on Linux-based machines and they had no problem running the code above.
My main goal here is to get 8 processes/pools started to maximize the number of attempts made compared to threading, but since I cannot find a fix for the TypeError: cannot serialize '_io.BufferedReader' object message, I am unsure what to do here and how to fix it. Any assistance would be appreciated.
File handles don't serialize very well... On Windows, multiprocessing spawns each child process and pickles the arguments to send them over, whereas on Linux the child is forked and inherits the parent's handles, which is why the same code ran fine for the others. You could send the name of the zip file instead of the zip file handle (a string pickles fine between processes). Also avoid zip as a variable name, since it shadows the built-in; I've chosen zip_filename:
p = Process(target=extract_zip, args=(zip_filename, password))
then:
def extract_zip(zip_filename, password):
    try:
        # Each worker opens its own handle, so nothing unpicklable crosses the process boundary.
        zip_file = zipfile.ZipFile(zip_filename)
        zip_file.extractall(pwd=password)
        print(f"[+] Password for the .zip: {password.decode('utf-8')}")
    except Exception:
        print(f"Incorrect password: {password.decode('utf-8')}")
The other problem is that your code won't run in parallel, because of this:

p.start()
p.join()

p.join() waits for the process to finish... so each password is tried in its own process, but one after the other, which is hardly useful. You have to store the process objects and join them all at the end.
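For example, a minimal sketch of that pattern, assuming zip_filename and txt_file from above (launch everything first, wait afterwards):

processes = []
for line in txt_file:
    password = line.strip()
    p = Process(target=extract_zip, args=(zip_filename, password))
    p.start()              # launch without waiting
    processes.append(p)    # keep the Process object around

for p in processes:
    p.join()               # now wait for all of them to finish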
This may cause other problems: creating too many processes in parallel may be an issue for your machine and won't help much after some point. Consider a multiprocessing.Pool
instead, to limit the number of workers.
A trivial example is:

def f(x): return x * x

with multiprocessing.Pool(5) as p:
    print(p.map(f, [1, 2, 3, 4, 5, 6, 7]))
Adapted to your example:
with multiprocessing.Pool(5) as p:
p.starmap(extract_zip, [(zip_filename,line.strip()) for line in txt_file])
(starmap expands each tuple into two separate arguments to fit your extract_zip function, as explained in Python multiprocessing pool.map for multiple arguments.)
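Putting the pieces together, here is a rough sketch of the reworked main. It is a sketch only: the pool size of 8 matches your stated goal, extract_zip is the filename-based version above, and args comes from your existing argparse setup.

import multiprocessing


def main(zip_filename, wordlist_filename):
    # Read every candidate password up front so the pool can chew through them.
    with open(wordlist_filename, "rb") as txt_file:
        candidates = [(zip_filename, line.strip()) for line in txt_file]
    # 8 workers, per your goal; starmap unpacks each (zip_filename, password) tuple.
    with multiprocessing.Pool(8) as p:
        p.starmap(extract_zip, candidates)


if __name__ == '__main__':
    main(args.zip, args.file)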