
Do I understand os.walk right?

Tags: python, file

Does the loop for root, dir, file in os.walk(startdir) work through these steps?

    for root in os.walk(startdir)
        for dir in root
            for files in dir

  1. get root of start dir : C:\dir1\dir2\startdir

  2. get folders in C:\dir1\dir2\startdir and return list of folders "dirlist"

  3. get files in the first dirlist item and return the list of files "filelist" as the first item of a list of filelists.

  4. move to the second item in dirlist and return the list of files in this folder "filelist2" as the second item of a list of filelists. etc.

  5. move to the next root in the folder tree and start from 2. etc.

Right? Or does it just get all roots first, then all dirs second, and all files third?

Baf Avatar asked Jun 12 '12 00:06


1 Answer

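os.walk returns a generator that yields one tuple per directory: (current_path, directories in current_path, files in current_path).

Each time the generator is advanced it yields the tuple for the next directory, descending recursively (top-down by default) until every sub-directory beneath the directory that walk was called on has been visited. So it does not collect all roots first, then all dirs, then all files; each step delivers one directory together with its own sub-directory and file lists.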
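To see the order for yourself, here is a minimal sketch that builds a small throwaway tree and records what os.walk yields. The helper names (walk_order, build_demo_tree) and the file names are made up for illustration. The lists are sorted in place only to make the output repeatable, since the order os.walk reports entries in depends on the operating system.

```python
import os
import tempfile


def walk_order(top):
    """Return the tuples os.walk yields for top, with paths relative to top."""
    visited = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Sorting in place fixes the otherwise OS-dependent order; with the
        # default topdown=True, os.walk descends into dirnames in this order.
        dirnames.sort()
        filenames.sort()
        visited.append((os.path.relpath(dirpath, top), list(dirnames), list(filenames)))
    return visited


def build_demo_tree(top):
    """Create a hypothetical tree: top/a/a1 and top/b, each holding one file."""
    os.makedirs(os.path.join(top, "a", "a1"))
    os.makedirs(os.path.join(top, "b"))
    for rel in ("top.txt",
                os.path.join("a", "a.txt"),
                os.path.join("a", "a1", "deep.txt"),
                os.path.join("b", "b.txt")):
        open(os.path.join(top, rel), "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        build_demo_tree(top)
        for entry in walk_order(top):
            print(entry)
```

Each printed tuple describes exactly one directory: the start directory comes first, then a, then a/a1, and only after that b.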

As such,

    next(os.walk(r'C:\dir1\dir2\startdir'))[0]  # returns 'C:\dir1\dir2\startdir'
    next(os.walk(r'C:\dir1\dir2\startdir'))[1]  # returns all the dirs in 'C:\dir1\dir2\startdir'
    next(os.walk(r'C:\dir1\dir2\startdir'))[2]  # returns all the files in 'C:\dir1\dir2\startdir'
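Note that each os.walk(...) call above creates a fresh generator, so the three lines restart the walk three times. A minimal sketch of taking the first tuple once with the built-in next() and unpacking it; first_level is a hypothetical helper name, and the sorting is only there to make the result repeatable:

```python
import os
import tempfile


def first_level(top):
    """Return (dirpath, dirnames, filenames) for top itself, without descending."""
    # next() advances the generator exactly once, giving the tuple for top.
    dirpath, dirnames, filenames = next(os.walk(top))
    return dirpath, sorted(dirnames), sorted(filenames)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "sub"))
        open(os.path.join(top, "note.txt"), "w").close()
        print(first_level(top))
```

If top does not exist, os.walk yields nothing and next() raises StopIteration, so a real caller may want to check os.path.isdir(top) first.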

So

    import os.path
    # .... (file holds the name of the file to look for)
    for path, directories, files in os.walk(r'C:\dir1\dir2\startdir'):
        if file in files:
            print('found %s' % os.path.join(path, file))

or this

    import os

    def search_file(directory=None, file=None):
        assert os.path.isdir(directory)
        for cur_path, directories, files in os.walk(directory):
            if file in files:
                # cur_path already starts with directory, so join only these two
                return os.path.join(cur_path, file)
        return None

or, if you want to look for the file by doing the recursion yourself, you can do this:

    import os

    def search_file(directory=None, file=None):
        assert os.path.isdir(directory)
        # take only the first tuple: the contents of directory itself
        current_path, directories, files = next(os.walk(directory))
        if file in files:
            return os.path.join(directory, file)
        elif not directories:  # directories is a list; empty means nothing left to search
            return None
        else:
            for new_directory in directories:
                result = search_file(directory=os.path.join(directory, new_directory), file=file)
                if result:
                    return result
            return None
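
A search like this can be tried on a throwaway tree. The following is a minimal sketch, not the code above verbatim: find_first and the file names are hypothetical, and it uses the plain os.walk loop rather than manual recursion.

```python
import os
import tempfile


def find_first(directory, filename):
    """Return the path of the first file called filename under directory, or None."""
    for cur_path, _dirnames, filenames in os.walk(directory):
        if filename in filenames:
            # cur_path already includes directory, so it is joined only with the name.
            return os.path.join(cur_path, filename)
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        nested = os.path.join(top, "sub", "deeper")
        os.makedirs(nested)
        open(os.path.join(nested, "needle.txt"), "w").close()
        print(find_first(top, "needle.txt"))   # path ending in sub/deeper/needle.txt
        print(find_first(top, "missing.txt"))  # None
```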
Samy Vilar Avatar answered Sep 25 '22 18:09
