
Getting glob to follow symlinks in Python

Suppose I have a subdirectory of symlinks that looks like the following:

subdir/
    folder/
        readme.txt
    symlink/ => ../hidden/
hidden/
    readme.txt

If I run the following code:

>>> from pathlib import Path
>>> list(Path('./subdir/').glob('**/readme.txt'))

I would expect the outcome to be:

subdir/folder/readme.txt
subdir/symlink/readme.txt

But the actual result is:

subdir/folder/readme.txt

I found out that this is because, for some undocumented reason, the ** pattern doesn't follow symlinks.

Is there a way to change this behavior programmatically?

asked Oct 02 '17 16:10 by bashaus



1 Answer

pathlib's Path.glob doesn't follow symlinks with ** for me either. I found a related issue: https://bugs.python.org/issue33428

As an alternative in Python 3, you can use glob.glob with ** and the recursive=True option (see https://docs.python.org/3/library/glob.html for details):

In [67]: from glob import glob
In [71]: list(glob("./**/readme.txt", recursive=True))
Out[71]:
['./hidden/readme.txt',
 './subdir/folder/readme.txt',
 './subdir/symlink/readme.txt']

In [73]: list(glob("./**/readme.txt", recursive=False))
Out[73]: ['./hidden/readme.txt']

Compare to:

In [72]: list(Path('.').glob('**/readme.txt'))
Out[72]: [PosixPath('hidden/readme.txt'), PosixPath('subdir/folder/readme.txt')]
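
If you still want Path objects back, the glob.glob results can be wrapped again. The following is only a minimal sketch of that idea, not an official pathlib feature: the helper names glob_following_symlinks and build_example_tree are made up for illustration, the tree is recreated in a temporary directory, and symlink cycles are not specially handled.

```python
import glob
import os
import tempfile
from pathlib import Path


def glob_following_symlinks(root, pattern):
    """Hypothetical helper: glob under root with glob.glob, return sorted Paths.

    glob.glob(..., recursive=True) descends into symlinked directories
    when expanding **, which is the behavior the question asks for.
    """
    # Escape the root so special characters in it are not treated as wildcards.
    full_pattern = os.path.join(glob.escape(str(root)), pattern)
    return sorted(Path(p) for p in glob.glob(full_pattern, recursive=True))


def build_example_tree(base):
    """Recreate the layout from the question under base (illustration only)."""
    base = Path(base)
    (base / "subdir" / "folder").mkdir(parents=True)
    (base / "hidden").mkdir()
    (base / "subdir" / "folder" / "readme.txt").write_text("folder\n")
    (base / "hidden" / "readme.txt").write_text("hidden\n")
    # subdir/symlink -> ../hidden (relative to the directory holding the link)
    os.symlink(os.path.join("..", "hidden"), base / "subdir" / "symlink",
               target_is_directory=True)
    return base


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = build_example_tree(tmp)
        for path in glob_following_symlinks(base / "subdir", "**/readme.txt"):
            print(path.relative_to(base))
```

As far as I know, Python 3.13 and later also accept a recurse_symlinks argument on Path.glob; it is left out here so the sketch runs on older versions too.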

answered Oct 27 '22 12:10 by Roman Chernyatchik