
Python 3: need to write to a file, check whether a line exists, then write to the file again

I recently recovered a ton of pictures from a friend's dead hard drive, and I decided to write a program in Python to:

Go through all the files

Check their MD5Sum

Check to see if the MD5Sum exists in a text file

If it does, let me know with "DUPLICATE HAS BEEN FOUND"

If it doesn't, add the MD5Sum to the text file.

The ultimate goal is to delete all duplicates. However, when I run this code, I get the following:

Traceback (most recent call last):
  File "C:\Users\godofgrunts\Documents\hasher.py", line 16, in <module>
    for line in myfile:
io.UnsupportedOperation: not readable

Am I doing this completely wrong or am I just misunderstanding something?

import hashlib
import os
import re

rootDir = 'H:\\recovered'
hasher = hashlib.md5()


with open('md5sums.txt', 'w') as myfile:
        for dirName, subdirList, fileList in os.walk(rootDir):            
                for fname in fileList:
                        with open((os.path.join(dirName, fname)), 'rb') as pic:
                                buf = pic.read()
                                hasher.update(buf)
                        md5 = str(hasher.hexdigest())
                        for line in myfile:
                                if re.search("\b{0}\b".format(md5),line):
                                        print("DUPLICATE HAS BEEN FOUND")
                                else:
                                        myfile.write(md5 +'\n')
godofgrunts asked Sep 26 '13 01:09



2 Answers

You have opened your file in write-only mode ('w') in your with statement, so it cannot be read. To open it for both writing and reading, do:

with open('md5sums.txt', 'w+') as myfile:
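
A minimal sketch of the difference, using a throwaway temporary file rather than the real md5sums.txt (the name demo.txt and the sample hash line are made up for illustration). Keep in mind that 'w+' empties the file when it is opened, and that after writing you have to seek(0) before you can read back what you wrote:

```python
import io
import os
import tempfile

# Hypothetical scratch file; it only stands in for md5sums.txt.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Mode 'w' is write-only: iterating over the file raises UnsupportedOperation.
with open(path, "w") as f:
    try:
        for line in f:
            pass
        write_only_readable = True
    except io.UnsupportedOperation:
        write_only_readable = False

# Mode 'w+' allows reading as well, but truncates the file on open.
with open(path, "w+") as f:
    f.write("d41d8cd98f00b204e9800998ecf8427e\n")
    f.seek(0)  # rewind; the position sits at the end after writing
    lines_read = f.readlines()

print(write_only_readable)
print(lines_read)
```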
TerryA answered Oct 11 '22 17:10



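The correct mode is "r+", not "w+". "w+" truncates the file when it is opened, so any sums already stored are lost; "r+" opens an existing file for reading and writing without truncating it.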

http://docs.python.org/3.3/tutorial/inputoutput.html#reading-and-writing-files
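
For what it's worth, here is a rough sketch of how the whole task might look with "r+". It is not the asker's code: the helper names file_md5 and find_duplicates and the chunk_size parameter are my own, it keeps the known sums in an in-memory set instead of rescanning the text file with a regex, and it creates a new hashlib.md5() for every file, because a single shared hasher keeps accumulating data across update() calls. It also assumes the sums file lives outside the directory being scanned.

```python
import hashlib
import os


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of one file, read in chunks."""
    hasher = hashlib.md5()  # new hasher per file, so digests do not accumulate
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root_dir, sums_path):
    """Append unseen digests to sums_path; return paths whose digest was already seen."""
    # 'r+' needs an existing file, so create it first if it is missing
    # (mode 'a' never truncates existing content).
    open(sums_path, "a").close()
    duplicates = []
    with open(sums_path, "r+") as sums:
        # Load every digest recorded so far into a set for fast lookups.
        seen = {line.strip() for line in sums if line.strip()}
        sums.seek(0, os.SEEK_END)  # make sure new digests land at the end
        for dir_name, _subdirs, file_list in os.walk(root_dir):
            for fname in file_list:
                path = os.path.join(dir_name, fname)
                digest = file_md5(path)
                if digest in seen:
                    print("DUPLICATE HAS BEEN FOUND:", path)
                    duplicates.append(path)
                else:
                    seen.add(digest)
                    sums.write(digest + "\n")
    return duplicates
```

Something like find_duplicates('H:\\recovered', 'md5sums.txt') would then return the list of duplicate paths, which you can review before deleting anything.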

aenda answered Oct 11 '22 16:10
