
Avoiding infinite recursion with os.walk

I'm using os.walk with followlinks=True, but I hit a place where a symbolic link refers to its own directory, causing an infinite loop. The culprit in this case is /usr/bin/X11, which is listed as follows:

lrwxrwxrwx 1 root root           1 Apr 24  2015 X11 -> .

Is there any way to avoid following links to either . or .., which I assume would cause similar problems? I think I could check this with os.readlink and then compare the result against the current path. Is there any other solution?
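The check described in the question could look roughly like the sketch below. It uses os.path.realpath rather than a raw os.readlink comparison, so relative targets such as . and .. are resolved before comparing. The helper names `is_self_or_ancestor_link` and `walk_skipping_back_links` are made up for illustration. Note the limitation: this only catches links that point at the current directory or one of its ancestors, so two separate directories that link to each other would still loop.

```python
import os

def is_self_or_ancestor_link(dirpath, name):
    """True if dirpath/name is a symlink resolving to dirpath or an ancestor of it."""
    full = os.path.join(dirpath, name)
    if not os.path.islink(full):
        return False
    target = os.path.realpath(full)
    here = os.path.realpath(dirpath)
    # target is "here" or an ancestor of it exactly when their common path is target.
    return os.path.commonpath([target, here]) == target

def walk_skipping_back_links(top):
    """os.walk with followlinks=True, pruning links back to the current directory or above."""
    for dirpath, dirnames, filenames in os.walk(top, followlinks=True):
        # Assign in place so os.walk sees the pruned list.
        dirnames[:] = [d for d in dirnames
                       if not is_self_or_ancestor_link(dirpath, d)]
        yield dirpath, dirnames, filenames
```

With this, a link such as X11 -> . is dropped from dirnames before os.walk descends into it.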

Eric asked May 02 '16 07:05



2 Answers

If you want to avoid infinite recursion, there is no way around storing a set of all the directories visited. You do not need to use readlink, however; you can just store each directory's device and inode numbers. This avoids the problem of path canonicalization altogether.

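import os

top = '.'
# (st_dev, st_ino) pairs of directories already seen. Seed the set with the
# starting directory so a link back to it is not followed.
st = os.stat(top)
dirs = {(st.st_dev, st.st_ino)}
for dirpath, dirnames, filenames in os.walk(top, followlinks=True):
    scandirs = []
    for dirname in dirnames:
        # os.stat follows symlinks, so a link and its target get the same key.
        st = os.stat(os.path.join(dirpath, dirname))
        dirkey = st.st_dev, st.st_ino
        if dirkey not in dirs:
            dirs.add(dirkey)
            scandirs.append(dirname)
    # Prune in place so os.walk only descends into unvisited directories.
    dirnames[:] = scandirs
    print(dirpath)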
Dietrich Epp answered Oct 12 '22 08:10


To completely avoid the problem of infinite recursion (with links pointing to wherever), you need to store the files and/or directories you have already visited.

The developers of the pynotify module had the same issue and used the method described here. The patch is in the link ;)
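As a rough illustration of keeping track of visited directories (this is not the patch mentioned above, just a minimal sketch), the version below records the resolved path of every directory and prunes any entry whose resolved path has already been seen. The function name `walk_no_revisit` is hypothetical, and using os.path.realpath as the key is an assumption; device and inode numbers would work as well.

```python
import os

def walk_no_revisit(top):
    """Like os.walk(top, followlinks=True), but never enters a directory twice."""
    # Resolved paths of directories already scheduled for a visit.
    visited = {os.path.realpath(top)}
    for dirpath, dirnames, filenames in os.walk(top, followlinks=True):
        keep = []
        for name in dirnames:
            real = os.path.realpath(os.path.join(dirpath, name))
            if real not in visited:
                visited.add(real)
                keep.append(name)
        # Assign in place so os.walk only descends into the kept entries.
        dirnames[:] = keep
        yield dirpath, dirnames, filenames
```

Because every directory is keyed by where it really lives, this also stops cycles between two directories that link to each other, not only links to . or ...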

salomonderossi answered Oct 12 '22 07:10