
Loop through folders in Python and find files containing a string

Tags:

python

I am very new to Python. I need to iterate through the subdirectories of a given directory and return all files that contain a certain string.

for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith((".sql")):
            if 'gen_dts' in open(name).read():
                print name

This was the closest I got.

The error I get is

Traceback (most recent call last):
  File "<pyshell#77>", line 4, in <module>
    if 'gen_dts' in open(name).read():
IOError: [Errno 2] No such file or directory: 'dq_offer_desc_bad_pkey_vw.sql'

The 'dq_offer_desc_bad_pkey_vw.sql' file does not contain 'gen_dts' in it.

I appreciate the help in advance.

asked Aug 06 '15 22:08 by user3264602




1 Answer

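You're getting that error because you're trying to open name, which is just the file's name, not its path relative to your current working directory. The error has nothing to do with the file's contents; open() fails before anything is read. What you need is open(os.path.join(root, name), 'r') (I added the mode since it's good practice).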

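import os

# path is the top-level directory to search, as in the question.
for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith('.sql'):
            # Join with root so the file is found from any working directory.
            filepath = os.path.join(root, name)
            with open(filepath, 'r') as f:
                if 'gen_dts' in f.read():
                    print(filepath)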
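
If you want to reuse the search, the same loop can be wrapped in a function that returns the matching paths. This is only a sketch, assuming Python 3: the name find_files_containing and its parameters are made up for illustration, it reads each file fully into memory, and it ignores bytes it cannot decode so that one odd file does not abort the search.

```python
import os


def find_files_containing(top, needle, suffix='.sql'):
    """Return paths under `top` ending in `suffix` whose text contains `needle`.

    Hypothetical helper for illustration; not part of the standard library.
    """
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if not name.endswith(suffix):
                continue
            # `name` alone is only meaningful relative to `root`, so join them.
            filepath = os.path.join(root, name)
            # errors='ignore' is an assumption: skip undecodable bytes.
            with open(filepath, 'r', errors='ignore') as f:
                if needle in f.read():
                    matches.append(filepath)
    return matches
```

Calling find_files_containing(path, 'gen_dts') would then give a list of paths that can be printed or processed further.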

os.walk() returns a generator that yields tuples of the form (root, dirs, files), where root is the path of the directory currently being visited, and dirs and files are lists of the names of the subdirectories and files directly inside root. Note that these are names, not paths; more precisely, each one is a path relative to root, which amounts to the same thing. As a consequence, the entries in dirs and files never contain slashes.
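
To see those tuples for yourself, here is a small sketch that builds a throwaway directory tree and walks it. The file and directory names (a.sql, sub, b.sql) are made up for the example, and tempfile is used only so nothing real is touched.

```python
import os
import tempfile

# Build a throwaway tree: <top>/a.sql and <top>/sub/b.sql (illustrative names).
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, 'sub'))
for relpath in ('a.sql', os.path.join('sub', 'b.sql')):
    with open(os.path.join(top, relpath), 'w') as f:
        f.write('-- placeholder\n')

walked = list(os.walk(top))
for root, dirs, files in walked:
    # dirs and files hold bare names; only root carries the directory path.
    print(root, dirs, files)
```

The first tuple should have root equal to top with 'sub' in dirs and 'a.sql' in files; the second should have root equal to top joined with 'sub' and 'b.sql' in files.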

One final point: every root begins with the path you pass to os.walk(), whether that path is relative to your current working directory or absolute. So for os.walk('three'), the root in the first tuple will be 'three' (for os.walk('three/'), it'll be 'three/'). For os.walk('../two/three'), it'll be '../two/three'. For os.walk('/one/two/three/'), it'll be '/one/two/three/', and the root in the second tuple might be '/one/two/three/four'.
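
A quick way to check this is to walk the same directory with and without a trailing slash. The sketch below assumes a throwaway layout of three/four inside a temporary directory and changes into that directory temporarily so the relative paths resolve.

```python
import os
import tempfile

# Throwaway layout mirroring the example above: <tmp>/three/four.
tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, 'three', 'four'))

old_cwd = os.getcwd()
os.chdir(tmp)
try:
    # Collect only the root of each tuple that os.walk() yields.
    roots_plain = [root for root, dirs, files in os.walk('three')]
    roots_slash = [root for root, dirs, files in os.walk('three/')]
finally:
    # Restore the original working directory whatever happens.
    os.chdir(old_cwd)

print(roots_plain)
print(roots_slash)
```

In both lists the first root is exactly the string that was passed in, and every later root starts with it.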

answered Oct 29 '22 16:10 by Cyphase