Find a file in Python

Tags:

python

People also ask

How do I find a file in Python?

The os.walk() function can be used to find a file. A findfile() function built on it takes the file's name and the root path as input parameters and returns the path of the specified file. The result is an absolute path as long as the root path passed in is itself absolute.

How do I search for a filename in Python?

Python can search for file names under a given path using the os module's walk() function. It takes a path as input and, for each directory it visits, yields a 3-tuple of dirpath, dirnames, and filenames.
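To make the 3-tuple concrete, here is a minimal sketch that walks a small throwaway tree and records what os.walk() yields for each directory. The helper name list_tree and the file names are made up for illustration; sorting is added only so the output is stable across filesystems.

```python
import os
import tempfile

def list_tree(top):
    """Return one (dirpath, dirnames, filenames) entry per directory under top."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Sorting dirnames in place also fixes the order os.walk descends in.
        dirnames.sort()
        entries.append((dirpath, list(dirnames), sorted(filenames)))
    return entries

# Build a tiny temporary tree (hypothetical names) and print each tuple.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    open(os.path.join(top, "a.txt"), "w").close()
    open(os.path.join(top, "sub", "b.txt"), "w").close()
    for dirpath, dirnames, filenames in list_tree(top):
        print(os.path.relpath(dirpath, top), dirnames, filenames)
# prints:
# . ['sub'] ['a.txt']
# sub [] ['b.txt']
```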

How do I find the location of a file in Python?

To open a file in Python, you need to know its exact path. On Windows, you can view a file's path by right-clicking the file and choosing Properties -> General -> Location. Relative paths are resolved against the current working directory, so a script that uses them needs the working directory set to the directory those paths are relative to, often the one containing the script.

What is __ file __ in Python?

The __file__ variable: __file__ is a module attribute that holds the path of the file the module was loaded from. Python sets it when it imports the module.
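One common use of __file__ is locating a file that sits next to the current module, regardless of the working directory. A minimal sketch, assuming a hypothetical helper called sibling_path and a made-up file name:

```python
from pathlib import Path

def sibling_path(module_file, name):
    """Return the absolute path of `name` in the same directory as module_file."""
    return Path(module_file).resolve().parent / name

# Inside a module you would typically pass __file__, for example:
#     config = sibling_path(__file__, "settings.ini")  # hypothetical file name
```

The helper takes the module path as a parameter instead of reading __file__ directly, because __file__ is not defined in every context (an interactive session, for instance).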


os.walk is the answer. This will find the first match:

import os

def find(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)
    return None  # no file with that name under path

And this will find all matches:

def find_all(name, path):
    result = []
    for root, dirs, files in os.walk(path):
        if name in files:
            result.append(os.path.join(root, name))
    return result
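The two functions above differ only in when they stop. As a sketch of one way to share the walking logic (the names iter_matches and find_first are mine, not from the answer), a generator can yield matches lazily and let the caller decide how many to take:

```python
import os

def iter_matches(name, path):
    """Yield the full path of every file called `name` under `path`."""
    for root, dirs, files in os.walk(path):
        if name in files:
            yield os.path.join(root, name)

def find_first(name, path):
    # next() with a default stops the walk at the first hit and
    # returns None when nothing matches.
    return next(iter_matches(name, path), None)
```

Calling list(iter_matches(name, path)) gives the same list as find_all, while find_first avoids walking the rest of the tree once a match turns up.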

And this will match a pattern:

import fnmatch
import os

def find(pattern, path):
    result = []
    for root, dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                result.append(os.path.join(root, name))
    return result

find('*.txt', '/path/to/dir')
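The inner loop can also be written with fnmatch.filter(), which applies the pattern to a whole list of names at once. This is only a sketch of an equivalent form; the function name find_pattern is mine:

```python
import fnmatch
import os

def find_pattern(pattern, path):
    """Return the paths of all files under `path` whose name matches `pattern`."""
    result = []
    for root, dirs, files in os.walk(path):
        # fnmatch.filter returns the names in `files` that match the pattern.
        for name in fnmatch.filter(files, pattern):
            result.append(os.path.join(root, name))
    return result
```

Like fnmatch.fnmatch(), fnmatch.filter() normalises case according to the operating system, so matching is case-sensitive on POSIX and case-insensitive on Windows.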

In Python 3.4 or newer you can use pathlib to do recursive globbing:

>>> import pathlib
>>> sorted(pathlib.Path('.').glob('**/*.py'))
[PosixPath('build/lib/pathlib.py'),
 PosixPath('docs/conf.py'),
 PosixPath('pathlib.py'),
 PosixPath('setup.py'),
 PosixPath('test_pathlib.py')]

Reference: https://docs.python.org/3/library/pathlib.html#pathlib.Path.glob
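For the original question (find a file by name rather than by extension), Path.rglob() does the same recursive search without writing '**/' yourself. A minimal sketch, with find_all_pathlib as a made-up name:

```python
from pathlib import Path

def find_all_pathlib(name, path):
    """Return every regular file named `name` below `path`, sorted."""
    # rglob(name) behaves like glob('**/' + name); is_file() skips directories.
    return sorted(p for p in Path(path).rglob(name) if p.is_file())
```

Note that rglob() treats its argument as a glob pattern, so a name containing characters such as * or [ would be interpreted as a pattern rather than matched literally.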

In Python 3.5 or newer you can also do recursive globbing like this:

>>> import glob
>>> glob.glob('**/*.txt', recursive=True)
['2.txt', 'sub/3.txt']

Reference: https://docs.python.org/3/library/glob.html#glob.glob


I used a version of os.walk and on a larger directory got times of around 3.5 sec. I tried two other solutions with no great improvement, then just did:

import subprocess
paths = [line[2:] for line in subprocess.check_output("find . -iname '*.txt'", shell=True, text=True).splitlines()]

Although this is POSIX-only, it took 0.25 sec.

From this, I believe it's entirely possible to optimise the whole search considerably in a platform-independent way, but this is where I stopped the research.
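As one possible starting point for a platform-independent version, here is a sketch built directly on os.scandir() that mimics the case-insensitive match of find -iname. The name scan_for is hypothetical, and I have not timed it, so no speedup is claimed; os.walk itself has been implemented on top of os.scandir() since Python 3.5.

```python
import fnmatch
import os

def scan_for(pattern, path):
    """Return paths under `path` whose name matches `pattern`, ignoring case."""
    lowered = pattern.lower()
    matches = []
    stack = [path]  # directories still to be scanned
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif fnmatch.fnmatchcase(entry.name.lower(), lowered):
                        matches.append(entry.path)
        except OSError:
            # Skip directories that cannot be read (permissions, races).
            continue
    return matches
```

Unlike the find command above, this returns paths prefixed with whatever `path` was passed in, and the order of results is not defined.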


If you are using Python on Ubuntu and only need it to work on Ubuntu, a substantially faster way is to use the terminal's locate program, like this:

import subprocess

def find_files(file_name):
    command = ['locate', file_name]

    output = subprocess.Popen(command, stdout=subprocess.PIPE).communicate()[0]
    output = output.decode()

    search_results = output.splitlines()  # avoids a trailing empty string

    return search_results

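search_results is a list of the absolute file paths. This is tens of thousands of times faster than the methods above; for one search I did, it was ~72,000 times faster. The speed comes from locate reading a prebuilt index instead of walking the disk, so files created since the index was last updated may be missing from the results.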
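locate matches its argument against the whole path, so the results can include files whose path merely contains the search string. A sketch of one way to keep only exact file-name matches follows; the names exact_matches and locate_exact are made up, and the code assumes a locate binary is available on PATH.

```python
import os
import subprocess

def exact_matches(paths, file_name):
    """Keep only the paths whose final component equals file_name."""
    return [p for p in paths if os.path.basename(p) == file_name]

def locate_exact(file_name):
    # Assumes `locate` is installed. check is left at its default (False),
    # so a non-zero exit status does not raise an exception.
    proc = subprocess.run(['locate', file_name], stdout=subprocess.PIPE)
    return exact_matches(proc.stdout.decode().splitlines(), file_name)
```

For example, exact_matches(['/etc/hosts', '/etc/hosts.allow'], 'hosts') keeps only '/etc/hosts'.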