
Directory recursion and symlinks

If you recursively traverse a directory tree by the obvious method, you'll run into trouble with infinite recursion when a symlink points to a parent directory.

An obvious solution would be to just check for symlinks and not follow them at all. But that might be an unpleasant surprise for a user: something that behaves like a perfectly normal directory for every other purpose would be silently ignored.

An alternative solution might be to keep a hash table of all directories visited so far, and use this to check for loops. But this would require some canonical representation of the directory you are currently looking at: a way to get its identity regardless of the path by which you reached it.

Would Unix users typically regard the second solution as less surprising?

If so, is there a way to obtain such a canonical representation/identity of a directory, that's portable across Unix systems? (I'd like it to work across Linux, BSD, Mac OS, Solaris etc. I expect to have to write separate code for Windows.)

rwallace asked Sep 11 '11 10:09


3 Answers

The most frequently overlooked API in this field is nftw.

nftw can be told not to follow symlinks (the FTW_PHYS flag), and it has much more advanced capabilities than that. Here is a simple sample from the man page itself:

#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

static int
display_info(const char *fpath, const struct stat *sb,
             int tflag, struct FTW *ftwbuf)
{
    printf("%-3s %2d %7jd   %-40s %d %s\n",
           (tflag == FTW_D) ?   "d"   : (tflag == FTW_DNR) ? "dnr" :
           (tflag == FTW_DP) ?  "dp"  : (tflag == FTW_F) ?   "f" :
           (tflag == FTW_NS) ?  "ns"  : (tflag == FTW_SL) ?  "sl" :
           (tflag == FTW_SLN) ? "sln" : "???",
           ftwbuf->level, (intmax_t) sb->st_size,
           fpath, ftwbuf->base, fpath + ftwbuf->base);
    return 0;           /* To tell nftw() to continue */
}

int
main(int argc, char *argv[])
{
    int flags = 0;

    if (argc > 2 && strchr(argv[2], 'd') != NULL)
        flags |= FTW_DEPTH;
    if (argc > 2 && strchr(argv[2], 'p') != NULL)
        flags |= FTW_PHYS;

    if (nftw((argc < 2) ? "." : argv[1], display_info, 20, flags)
            == -1)
    {
        perror("nftw");
        exit(EXIT_FAILURE);
    }
    exit(EXIT_SUCCESS);
}

See also

  • Directory recursion
  • http://rosettacode.org/wiki/Walk_a_directory/Recursively

sehe answered Oct 17 '22 15:10


The canonical absolute path of the directory, with all symlinks resolved, is such a representation. You can get it with the realpath function, which is defined in the POSIX standard, so it will work on any POSIX-compliant system. See man 3 realpath.
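
As a rough illustration of the realpath approach, here is a minimal sketch that compares two paths by their resolved form. The function name same_directory is hypothetical, and the sketch assumes a POSIX.1-2008 system where passing NULL as the second argument makes realpath allocate the result buffer.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if both paths resolve to the same
 * canonical absolute path, 0 if they differ, -1 if either path
 * cannot be resolved (errno is set by realpath()).
 */
int
same_directory(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* With a NULL buffer, realpath() malloc()s the result (POSIX.1-2008). */
    char *ra = realpath(a, NULL);
    char *rb = realpath(b, NULL);
    int result = -1;

    if (ra != NULL && rb != NULL)
        result = (strcmp(ra, rb) == 0);

    free(ra);   /* free(NULL) is a no-op */
    free(rb);
    return result;
}
```

In a traversal, the resolved string (rather than the path you arrived by) would be the key stored in the table of visited directories.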

Michał Wojciechowski answered Oct 17 '22 16:10


Symlinks are not the only problem: hard links to directories can create loops as well. They are not very common, but not forbidden (only root can hard-link directories). The only thing that is canonical is the pair {device_number, inode_number}, which stat reports as st_dev and st_ino. But network filesystems can misbehave.
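
A minimal sketch of loop detection keyed on that pair might look like the following. The names dir_id, visited and visited_check_and_add are hypothetical, and a linear scan over a growable array stands in for a real hash table to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical types: one directory identity, and a growable list of them. */
struct dir_id { dev_t dev; ino_t ino; };
struct visited { struct dir_id *ids; size_t len, cap; };

/*
 * Returns 1 if the directory behind path was already recorded,
 * 0 if it was newly recorded, -1 on stat() or allocation failure.
 */
int
visited_check_and_add(struct visited *v, const char *path)
{
    struct stat sb;

    assert(v != NULL && path != NULL);

    /* stat() follows symlinks, so sb describes the link's target. */
    if (stat(path, &sb) == -1)
        return -1;

    /* Linear scan; a hash table keyed on (dev, ino) would scale better. */
    for (size_t i = 0; i < v->len; i++)
        if (v->ids[i].dev == sb.st_dev && v->ids[i].ino == sb.st_ino)
            return 1;

    if (v->len == v->cap) {
        size_t ncap = v->cap ? v->cap * 2 : 16;
        struct dir_id *p = realloc(v->ids, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        v->ids = p;
        v->cap = ncap;
    }
    v->ids[v->len].dev = sb.st_dev;
    v->ids[v->len].ino = sb.st_ino;
    v->len++;
    return 0;
}
```

A traversal would start from a zero-initialized struct visited, call visited_check_and_add before descending into each directory, skip the directory when the result is 1, and free the ids array at the end.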

wildplasser answered Oct 17 '22 16:10