
How can I traverse a file system with a generator?

I'm trying to create a utility class for traversing all the files in a directory, including those within subdirectories and sub-subdirectories. I tried to use a generator because generators are cool; however, I hit a snag.


def grab_files(directory):
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            yield grab_files(full_path)
        elif os.path.isfile(full_path):
            yield full_path
        else:
            print('Unidentified name %s. It could be a symbolic link' % full_path)

When the generator reaches a directory, it simply yields the new generator object (which prints as a memory location); it doesn't give me the contents of the directory.
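
To make the symptom concrete, here is a minimal sketch that runs the same logic against a throwaway tree. The temporary directory and the file names `top.txt` and `sub/inner.txt` are made up for illustration, and the `else` branch is left out for brevity.

```python
import os
import tempfile

def grab_files(directory):
    # Same logic as the function above, including the problematic yield.
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            yield grab_files(full_path)  # hands over a generator object, not its items
        elif os.path.isfile(full_path):
            yield full_path

# Hypothetical layout: <tmp>/top.txt and <tmp>/sub/inner.txt
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("top.txt", os.path.join("sub", "inner.txt")):
        open(os.path.join(tmp, rel), "w").close()

    results = list(grab_files(tmp))

# One item is a path string; the other is an unconsumed generator.
kinds = sorted(type(item).__name__ for item in results)
print(kinds)
```

The file inside `sub` never shows up in `results`, because nothing iterates over the nested generator.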

How can I make the generator yield the contents of the directory instead of a new generator?

If there's already a simple library function to recursively list all the files in a directory structure, tell me about it. I don't intend to replicate a library function.

Evan Kroske asked Nov 09 '09 01:11


3 Answers

Why reinvent the wheel when you can use os.walk?

import os

path = '.'  # directory to start from; replace with your own
for root, dirs, files in os.walk(path):
    for name in files:
        print(os.path.join(root, name))

os.walk is a generator that walks the directory tree either top-down or bottom-up, yielding a (dirpath, dirnames, filenames) tuple for each directory it visits.
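
Since the question asks for a generator of file paths, the loop can be wrapped in one. This is a minimal sketch; the name `iter_files` and the temporary test tree are assumptions for illustration, not part of any library.

```python
import os
import tempfile

def iter_files(top):
    """Lazily yield the full path of every file under `top`."""
    for root, dirs, files in os.walk(top):
        for name in files:
            yield os.path.join(root, name)

# Hypothetical tree used only to exercise the function.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    for rel in ("one.txt",
                os.path.join("a", "two.txt"),
                os.path.join("a", "b", "three.txt")):
        open(os.path.join(tmp, rel), "w").close()

    found = sorted(os.path.relpath(p, tmp) for p in iter_files(tmp))

print(found)
```

By default os.walk does not descend into directories reached through symbolic links; pass `followlinks=True` if that is what you want.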

Nadia Alramli answered Nov 08 '22 16:11


I agree with the os.walk solution.

Purely for the sake of pedantry, try iterating over the generator object instead of yielding it directly:


import os

def grab_files(directory):
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            # Iterate over the nested generator and re-yield each of its items
            for entry in grab_files(full_path):
                yield entry
        elif os.path.isfile(full_path):
            yield full_path
        else:
            # Neither a regular file nor a directory (e.g. a broken symlink)
            print('Unidentified name %s. It could be a symbolic link' % full_path)
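
On Python 3.3 and later, the inner loop can likely be replaced with `yield from`, which delegates to the sub-generator. A minimal sketch, with the `else` branch dropped for brevity and a made-up temporary tree to exercise it:

```python
import os
import tempfile

def grab_files(directory):
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            # Delegate to the sub-generator; each of its items is yielded in turn.
            yield from grab_files(full_path)
        elif os.path.isfile(full_path):
            yield full_path

# Hypothetical layout: <tmp>/top.txt and <tmp>/sub/inner.txt
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("top.txt", os.path.join("sub", "inner.txt")):
        open(os.path.join(tmp, rel), "w").close()

    found = sorted(os.path.relpath(p, tmp) for p in grab_files(tmp))

print(found)
```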

thebat answered Nov 08 '22 14:11


As of Python 3.4, you can use the glob() method from the built-in pathlib module:

import pathlib
p = pathlib.Path('.')
list(p.glob('**/*'))    # lists every entry recursively, directories included
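
Because the `**/*` pattern matches directories as well as files, a files-only generator needs a filter. A minimal sketch, assuming the hypothetical name `iter_file_paths` and a made-up temporary tree; `rglob("*")` is the pathlib shorthand for `glob("**/*")`:

```python
import pathlib
import tempfile

def iter_file_paths(top):
    """Lazily yield a Path for every regular file under `top`."""
    for entry in pathlib.Path(top).rglob("*"):
        if entry.is_file():
            yield entry

# Hypothetical layout: <tmp>/top.txt and <tmp>/sub/inner.txt
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "sub").mkdir()
    (root / "top.txt").touch()
    (root / "sub" / "inner.txt").touch()

    all_entries = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*"))
    files_only = sorted(p.relative_to(root).as_posix() for p in iter_file_paths(root))

print(all_entries)  # includes the "sub" directory itself
print(files_only)   # regular files only
```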

EinfachToll answered Nov 08 '22 14:11