
Making os.walk work in a non-standard way

Tags: python, os.walk

I'm trying to do the following, in this order:

Use os.walk() to go down each directory.
Each directory has subfolders, but I'm only interested in the first level of subfolders. For example, a directory looks like:

/home/RawData/SubFolder1/SubFolder2

In RawData2, I want folders that stop at the SubFolder1 level.

The thing is, it seems like os.walk() goes down through ALL of the RawData folder, and I'm not certain how to make it stop.

Below is what I have so far. I've tried a number of other combinations, substituting root or files for the dirs variable, but none of them gets me what I want.

import os 

for root, dirs, files in os.walk("/home/RawData"): 

    os.chdir("/home/RawData2/")
    make_path("/home/RawData2/"+str(dirs))

Z R asked Oct 17 '15 16:10


2 Answers

I suggest you use glob instead.

As the help on glob describes:

glob(pathname)
    Return a list of paths matching a pathname pattern.

    The pattern may contain simple shell-style wildcards a la
    fnmatch. However, unlike fnmatch, filenames starting with a
    dot are special cases that are not matched by '*' and '?'
    patterns.

So, your pattern is every first level directory, which I think would be something like this:

/root_path/*/sub_folder1/sub_folder2

So, you start at your root, get everything in that first level, and then look for sub_folder1/sub_folder2. I think that works.

To put it all together:

from glob import glob

dirs = glob('/root_path/*/sub_folder1/sub_folder2')

# Then iterate for each path
for i in dirs:
    print(i)
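
If the goal is only the names of the first-level folders, a pattern ending in a path separator is one way to restrict glob to directories. The following is a minimal sketch, not a drop-in solution: first_level_dirs is a hypothetical helper name, and a temporary directory stands in for /home/RawData so the example can run anywhere.

```python
import os
import tempfile
from glob import glob


def first_level_dirs(root):
    """Return the names of the directories directly under root (hypothetical helper)."""
    # A pattern that ends in a separator ("root/*/") matches directories only.
    pattern = os.path.join(root, "*", "")
    return sorted(os.path.basename(os.path.normpath(p)) for p in glob(pattern))


# Throwaway tree standing in for /home/RawData (assumption for the demo).
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "SubFolder1", "SubFolder2"))
os.makedirs(os.path.join(demo_root, "OtherFolder", "Deeper"))
open(os.path.join(demo_root, "notes.txt"), "w").close()

print(first_level_dirs(demo_root))
```

As the help text quoted above says, '*' does not match names starting with a dot, so hidden directories would be skipped by this pattern.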

idjaw answered Oct 20 '22 00:10


Beware: Documentation for os.walk says:

don’t change the current working directory between resumptions of walk(). walk() never changes the current directory, and assumes that its caller doesn’t either

so you should avoid os.chdir("/home/RawData2/") in the walk loop.
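
One way to do without os.chdir is to build each destination path with os.path.join and hand the full path to os.makedirs. This is only a sketch under assumptions: make_dirs is a hypothetical helper, the folder names are made up, and a temporary directory stands in for /home/RawData2.

```python
import os
import tempfile


def make_dirs(dst_root, names):
    """Create dst_root/<name> for each name without changing the working directory."""
    created = []
    for name in names:
        path = os.path.join(dst_root, name)
        os.makedirs(path, exist_ok=True)  # exist_ok avoids an error on reruns
        created.append(path)
    return created


dst_root = tempfile.mkdtemp()  # stands in for /home/RawData2 (assumption)
cwd_before = os.getcwd()
created = make_dirs(dst_root, ["SubFolder1", "OtherFolder"])

print(created)
print(os.getcwd() == cwd_before)  # the working directory was never touched
```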

You can easily ask walk not to recurse by using topdown=True (the default) and clearing dirs in place:

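import os

for root, dirs, files in os.walk("/home/RawData", topdown=True):
    for rep in dirs:
        # os.makedirs stands in for the question's undefined make_path helper
        os.makedirs(os.path.join("/home/RawData2/", rep), exist_ok=True)
        # add processing here
    del dirs[:]  # clear in place to tell walk not to recurse into any subdirectory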
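
To check that clearing dirs really stops the descent, here is a small self-contained sketch. first_level_dirs is a hypothetical helper, and the tree is built in a temporary directory rather than /home/RawData; it counts how many directories walk visits.

```python
import os
import tempfile


def first_level_dirs(top):
    """Return (sorted first-level directory names, number of directories visited)."""
    names = []
    visited = 0
    for root, dirs, files in os.walk(top, topdown=True):
        visited += 1
        names.extend(dirs)
        del dirs[:]  # in-place clear: walk() has nothing left to descend into
    return sorted(names), visited


# Throwaway tree standing in for /home/RawData (assumption for the demo).
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "SubFolder1", "SubFolder2"))
os.makedirs(os.path.join(top, "OtherFolder", "Deeper"))

names, visited = first_level_dirs(top)
print(names, visited)
```

Note that the clear has to happen in place (del dirs[:] or dirs.clear()). Rebinding the name with dirs = [] leaves walk's own list untouched, so the recursion would continue.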

Serge Ballesta answered Oct 20 '22 01:10