I have a C++/Obj-C background and I am just discovering Python (been writing it for about an hour). I am writing a script to recursively read the contents of text files in a folder structure.
The problem I have is that the code I have written only works one folder deep. I can see why in the code (see the # hardcoded path comment); I just don't know how to move forward with Python, since I am brand new to it.
Python Code:

    import os
    import sys

    rootdir = sys.argv[1]

    for root, subFolders, files in os.walk(rootdir):

        for folder in subFolders:
            outfileName = rootdir + "/" + folder + "/py-outfile.txt" # hardcoded path
            folderOut = open( outfileName, 'w' )
            print "outfileName is " + outfileName

            for file in files:
                filePath = rootdir + '/' + file
                f = open( filePath, 'r' )
                toWrite = f.read()
                print "Writing '" + toWrite + "' to" + filePath
                folderOut.write( toWrite )
                f.close()

            folderOut.close()

Use os.walk. It recursively walks into a directory and its subdirectories, and already gives you separate variables for files and directories.
To traverse a directory tree in Python, use the os.walk() function. It accepts up to four arguments (top, topdown, onerror, followlinks) and yields a 3-tuple of dirpath, dirnames, and filenames for each directory it visits.
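To make the 3-tuple concrete, here is a minimal Python 3 sketch. The helper name list_tree and the throwaway directory layout are made up for illustration and are not part of the question's code.

```python
import os
import tempfile

def list_tree(top):
    """Return sorted (relative dirpath, dirnames, filenames) tuples from os.walk."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(top):
        rel = os.path.relpath(dirpath, top)  # '.' for the top directory itself
        entries.append((rel, sorted(dirnames), sorted(filenames)))
    return sorted(entries)

# Build a small throwaway tree: top/a.txt and top/sub/b.txt
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, 'sub'))
    for rel in ('a.txt', os.path.join('sub', 'b.txt')):
        with open(os.path.join(top, rel), 'w') as f:
            f.write('hello')
    for entry in list_tree(top):
        print(entry)
    # ('.', ['sub'], ['a.txt'])
    # ('sub', [], ['b.txt'])
```

Each directory shows up exactly once as dirpath, with its own subdirectories and files listed separately, so no manual recursion is needed.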
Make sure you understand the three return values of os.walk:

    for root, subdirs, files in os.walk(rootdir):

has the following meaning:
- root: Current path which is "walked through"
- subdirs: Files in root of type directory
- files: Files in root (not in subdirs) of type other than directory

And please use os.path.join instead of concatenating with a slash! Your problem is filePath = rootdir + '/' + file: you must concatenate the currently "walked" folder instead of the topmost folder. So that must be filePath = os.path.join(root, file). BTW, "file" is a builtin name in Python 2, so you don't normally use it as a variable name.
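As a rough illustration of why joining with root matters, the following Python 3 sketch builds a nested folder and compares the two ways of forming the path. The function name all_file_paths and the folder names a/b/deep.txt are hypothetical.

```python
import os
import tempfile

def all_file_paths(top):
    """Collect the full path of every file under top, joining with the walked folder."""
    paths = []
    for root, subdirs, files in os.walk(top):
        for filename in files:
            paths.append(os.path.join(root, filename))  # root, not top
    return sorted(paths)

with tempfile.TemporaryDirectory() as top:
    nested = os.path.join(top, 'a', 'b')
    os.makedirs(nested)
    with open(os.path.join(nested, 'deep.txt'), 'w') as f:
        f.write('x')

    wrong = top + '/' + 'deep.txt'   # what joining with the topmost folder would give
    right = all_file_paths(top)[0]   # joined with the folder currently being walked

    print(os.path.exists(wrong))  # False
    print(os.path.exists(right))  # True
```

The path built from the topmost folder only happens to be correct for files that sit directly in it, which matches the "one folder deep" symptom.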
Another problem is your loops, which should look like this, for example:

    import os
    import sys

    walk_dir = sys.argv[1]

    print('walk_dir = ' + walk_dir)

    # If your current working directory may change during script execution, it's recommended to
    # immediately convert program arguments to an absolute path. Then the variable root below will
    # be an absolute path as well. Example:
    # walk_dir = os.path.abspath(walk_dir)
    print('walk_dir (absolute) = ' + os.path.abspath(walk_dir))

    for root, subdirs, files in os.walk(walk_dir):
        print('--\nroot = ' + root)
        list_file_path = os.path.join(root, 'my-directory-list.txt')
        print('list_file_path = ' + list_file_path)

        with open(list_file_path, 'wb') as list_file:
            for subdir in subdirs:
                print('\t- subdirectory ' + subdir)

            for filename in files:
                file_path = os.path.join(root, filename)

                print('\t- file %s (full path: %s)' % (filename, file_path))

                with open(file_path, 'rb') as f:
                    f_content = f.read()
                    list_file.write(('The file %s contains:\n' % filename).encode('utf-8'))
                    list_file.write(f_content)
                    list_file.write(b'\n')

If you didn't know, the with statement for files is a shorthand:

    with open('filename', 'rb') as f:
        dosomething()

    # is effectively the same as

    f = open('filename', 'rb')
    try:
        dosomething()
    finally:
        f.close()
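
If you want to convince yourself that the file really gets closed even when the body raises, a small Python 3 sketch along these lines should do. The function name read_then_fail and the simulated error are made up for the demonstration.

```python
import os
import tempfile

def read_then_fail(path):
    """Open path in a with block, raise inside it, and return the file object."""
    try:
        with open(path, 'rb') as f:
            handle = f  # keep a reference so we can inspect it afterwards
            raise ValueError('simulated failure')
    except ValueError:
        pass
    return handle

# Create an empty temporary file to open
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    handle = read_then_fail(path)
    print(handle.closed)  # True: the with block closed it despite the exception
finally:
    os.remove(path)
```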