
How to recursively go through all subdirectories and read files?

Tags: python, file

I have a root-ish directory containing multiple subdirectories, all of which contain a file named data.txt. I would like to write a script that takes the "root" directory, walks through all of its subdirectories, reads every data.txt it finds, and writes content from each one to an output file.

Here's a snippet of my code:

    import os
    import sys
    rootdir = sys.argv[1]

    with open('output.txt','w') as fout:
        for root, subFolders, files in os.walk(rootdir):
            for file in files:
                if (file == 'data.txt'):
                    #print file
                    with open(file,'r') as fin:
                        for lines in fin:
                            dosomething()

I've tested my dosomething() part and confirmed that it works when I run it on just one file. I've also confirmed that if I tell it to print the file instead (the commented-out line), the script prints 'data.txt'.

Right now if I run it Python gives me this error:

      File "recursive.py", line 11, in <module>
        with open(file,'r') as fin:
    IOError: [Errno 2] No such file or directory: 'data.txt'

I'm not sure why it can't find it -- after all, it prints out data.txt if I uncomment the 'print file' line. What am I doing incorrectly?

Joe asked Nov 26 '12 18:11



1 Answer

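You need to open each file by its full path. Your file variable is just a bare filename without a directory path, so open() looks for it in the current working directory. The root variable holds the directory that os.walk is currently visiting, so join the two: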

    with open('output.txt','w') as fout:
        for root, subFolders, files in os.walk(rootdir):
            if 'data.txt' in files:
                with open(os.path.join(root, 'data.txt'), 'r') as fin:
                    for lines in fin:
                        dosomething()
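For completeness, here is a self-contained sketch of the same idea that you can run as-is. It assumes Python 3, and the function names collect_data_files and merge_data_files are made up for illustration. Since dosomething() isn't shown in the question, the sketch simply copies each line to the output file as a stand-in.

```python
import os


def collect_data_files(rootdir, target_name="data.txt"):
    """Yield the full path of every file named target_name under rootdir."""
    for root, _subdirs, files in os.walk(rootdir):
        if target_name in files:
            # root is the directory currently being visited; joining it with
            # the bare filename gives a path that open() can resolve.
            yield os.path.join(root, target_name)


def merge_data_files(rootdir, output_path="output.txt"):
    """Copy every line of every data.txt under rootdir into output_path.

    Returns the number of data.txt files that were read.
    """
    count = 0
    with open(output_path, "w") as fout:
        # sorted() only makes the output order predictable; it is optional.
        for path in sorted(collect_data_files(rootdir)):
            with open(path, "r") as fin:
                for line in fin:
                    fout.write(line)  # stand-in for the asker's dosomething()
            count += 1
    return count
```

To use it the way the question does, call merge_data_files(sys.argv[1]) after importing sys. Keeping the path discovery in its own generator makes it easy to print the paths first and check that they are what you expect before reading anything.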
Martijn Pieters answered Oct 13 '22 21:10