
How to check whether a file (from a list of file names) exists or not in Python? [duplicate]

How do I check whether a file exists or not, without using the try statement?

spence91 asked Sep 17 '08 12:09


9 Answers

If the reason you're checking is so you can do something like if file_exists: open_it(), it's safer to use a try around the attempt to open it. Checking and then opening risks the file being deleted or moved or something between when you check and when you try to open it.
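
A minimal sketch of the "just try to open it" approach described above. The helper name read_text_or_none is hypothetical (it is not part of the standard library), and UTF-8 text is assumed:

```python
def read_text_or_none(path):
    """Return the file's text, or None if the file does not exist.

    Hypothetical helper, for illustration only.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        # The file was missing at the moment of opening. There was no
        # separate existence check, so nothing could change in between.
        return None
```

Because the existence test and the open are the same operation, the file cannot vanish between the two steps.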

If you're not planning to open the file immediately, you can use os.path.isfile if you need to be sure it's a file:

Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.

import os.path
os.path.isfile(fname)
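
Since the question title mentions a list of file names, here is a rough sketch of applying os.path.isfile to several names at once. The function name split_existing is made up for this example, and the result is only a snapshot of the filesystem at the time of the call:

```python
import os.path


def split_existing(file_names):
    """Partition file names into (existing, missing) lists.

    Hypothetical helper; files can still appear or disappear after
    this function returns.
    """
    existing, missing = [], []
    for name in file_names:
        # isfile() is False for directories and for paths that do not exist.
        if os.path.isfile(name):
            existing.append(name)
        else:
            missing.append(name)
    return existing, missing
```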

Starting with Python 3.4, the pathlib module offers an object-oriented approach (backported to pathlib2 in Python 2.7):

from pathlib import Path

my_file = Path("/path/to/file")
if my_file.is_file():
    pass  # file exists

To check a directory, do:

if my_file.is_dir():
    pass  # directory exists

To check whether a Path object exists independently of whether it is a file or directory, use exists():

if my_file.exists():
    pass  # path exists

You can also use resolve(strict=True) in a try block:

try:
    my_abs_path = my_file.resolve(strict=True)
except FileNotFoundError:
    pass  # doesn't exist
else:
    pass  # exists
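
The pathlib checks above can be combined into one small classifier. This is only a sketch: describe is a hypothetical helper name, and the temporary directory is there so the example can run anywhere:

```python
import tempfile
from pathlib import Path


def describe(path):
    """Classify a path as 'file', 'directory', 'other' or 'missing'."""
    path = Path(path)
    if path.is_file():
        return "file"
    if path.is_dir():
        return "directory"
    if path.exists():
        return "other"  # e.g. a socket or a device node
    return "missing"


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    sample = tmp_dir / "sample.txt"
    sample.write_text("hello")
    print(describe(sample))            # file
    print(describe(tmp_dir))           # directory
    print(describe(tmp_dir / "nope"))  # missing
```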
rslite answered Sep 17 '22 15:09


Use os.path.exists to check both files and directories:

import os.path
os.path.exists(file_path)

Use os.path.isfile to check only files (note: follows symlinks):

os.path.isfile(file_path)
PierreBdR answered Sep 19 '22 15:09


Unlike isfile(), exists() will return True for directories. So, depending on whether you want only plain files or also directories, you'll use isfile() or exists(). Here is some simple REPL output:

>>> os.path.isfile("/etc/password.txt")
True
>>> os.path.isfile("/etc")
False
>>> os.path.isfile("/does/not/exist")
False
>>> os.path.exists("/etc/password.txt")
True
>>> os.path.exists("/etc")
True
>>> os.path.exists("/does/not/exist")
False
bortzmeyer answered Sep 20 '22 15:09


import os

if os.path.isfile(filepath):
    print("File exists")
Paul answered Sep 17 '22 15:09


Use os.path.isfile() with os.access():

import os

PATH = './file.txt'
if os.path.isfile(PATH) and os.access(PATH, os.R_OK):
    print("File exists and is readable")
else:
    print("Either the file is missing or not readable")
Yugal Jindle answered Sep 19 '22 15:09


import os
os.path.exists(path) # Returns whether the path (directory or file) exists or not
os.path.isfile(path) # Returns whether the file exists or not
benefactual answered Sep 20 '22 15:09


Although almost every possible way has been listed in (at least one of) the existing answers (e.g. Python 3.4 specific stuff was added), I'll try to group everything together.

Note: every piece of Python standard library code that I'm going to post, belongs to version 3.5.3.

Problem statement:

  1. Check whether a file exists (arguably also a folder, which is a "special" file)
  2. Don't use try / except / else / finally blocks

Possible solutions:

  1. [Python 3]: os.path.exists(path) (also check other function family members like os.path.isfile, os.path.isdir, os.path.lexists for slightly different behaviors)

    os.path.exists(path)
    

    Return True if path refers to an existing path or an open file descriptor. Returns False for broken symbolic links. On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.

    All good, but if following the import tree:

    • os.path - posixpath.py (ntpath.py)

      • genericpath.py, line ~#20+

        def exists(path):
            """Test whether a path exists.  Returns False for broken symbolic links"""
            try:
                st = os.stat(path)
            except os.error:
                return False
            return True
        

    it's just a try / except block around [Python 3]: os.stat(path, *, dir_fd=None, follow_symlinks=True). So, your code is try / except free, but lower in the call stack there's (at least) one such block. This also applies to other functions (including os.path.isfile).
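
To illustrate the "slightly different behaviors" within this function family, here is a small sketch comparing exists and lexists on a broken symbolic link. It assumes a platform where os.symlink works without special privileges (on Windows it may need elevated rights); probe_broken_symlink is a made-up name:

```python
import os
import tempfile


def probe_broken_symlink():
    """Return (exists, lexists, islink) for a link with a missing target."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")  # never created
        link = os.path.join(tmp, "link")
        os.symlink(target, link)
        return (
            os.path.exists(link),   # stat() follows the link and fails
            os.path.lexists(link),  # lstat() looks at the link itself
            os.path.islink(link),
        )


print(probe_broken_symlink())  # (False, True, True)
```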

    1.1. [Python 3]: Path.is_file()

    • It's a fancier (and more pythonic) way of handling paths, but
    • Under the hood, it does exactly the same thing (pathlib.py, line ~#1330):

      def is_file(self):
          """
          Whether this path is a regular file (also True for symlinks pointing
          to regular files).
          """
          try:
              return S_ISREG(self.stat().st_mode)
          except OSError as e:
              if e.errno not in (ENOENT, ENOTDIR):
                  raise
              # Path doesn't exist or is a broken symlink
              # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
              return False
      
  2. [Python 3]: With Statement Context Managers. Either:

    • Create one:

      class Swallow:  # Dummy example
          swallowed_exceptions = (FileNotFoundError,)
      
          def __enter__(self):
              print("Entering...")
      
          def __exit__(self, exc_type, exc_value, exc_traceback):
              print("Exiting:", exc_type, exc_value, exc_traceback)
              return exc_type in Swallow.swallowed_exceptions  # only swallow FileNotFoundError (not e.g. TypeError - if the user passes a wrong argument like None or float or ...)
      
      • And its usage - I'll replicate the os.path.isfile behavior (note that this is just for demonstration purposes; do not attempt to write such code for production):

        import os
        import stat
        
        
        def isfile_seaman(path):  # Dummy func
            result = False
            with Swallow():
                result = stat.S_ISREG(os.stat(path).st_mode)
            return result
        
    • Use [Python 3]: contextlib.suppress(*exceptions) - which was specifically designed for selectively suppressing exceptions


    But, they seem to be wrappers over try / except / else / finally blocks, as [Python 3]: The with statement states:

    This allows common try...except...finally usage patterns to be encapsulated for convenient reuse.
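
For the contextlib.suppress variant mentioned above, a rough equivalent of the hand-written Swallow example might look like this. isfile_suppress is a hypothetical name, and, like the original, it is meant for demonstration rather than production:

```python
import os
import stat
from contextlib import suppress


def isfile_suppress(path):
    """Mimic os.path.isfile without an explicit try/except in this code."""
    with suppress(FileNotFoundError, NotADirectoryError):
        return stat.S_ISREG(os.stat(path).st_mode)
    # Reached only when one of the listed exceptions was suppressed.
    return False
```

Any other exception (for example a TypeError caused by passing a float) still propagates, which matches the intent of swallowing only "not found" errors.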

  3. Filesystem traversal functions (and search the results for matching item(s))

    • [Python 3]: os.listdir(path='.') (or [Python 3]: os.scandir(path='.') on Python v3.5+, backport: [PyPI]: scandir)

      • Under the hood, both use:

        • Nix: [man7]: OPENDIR(3) / [man7]: READDIR(3) / [man7]: CLOSEDIR(3)
        • Win: [MS.Docs]: FindFirstFileW function / [MS.Docs]: FindNextFileW function / [MS.Docs]: FindClose function

        via [GitHub]: python/cpython - (master) cpython/Modules/posixmodule.c

      Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.

    • [Python 3]: os.walk(top, topdown=True, onerror=None, followlinks=False)
      • It uses os.listdir (os.scandir when available)
    • [Python 3]: glob.iglob(pathname, *, recursive=False) (or its predecessor: glob.glob)
      • Doesn't seem a traversing function per se (at least in some cases), but it still uses os.listdir


    Since these iterate over folders, they are (in most cases) inefficient for our problem (there are exceptions, like non-wildcard globbing, as @ShadowRanger pointed out), so I'm not going to dwell on them. Not to mention that in some cases, filename processing might be required.
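
For completeness, the traversal approach can be sketched with os.scandir as below. exists_in_directory is a hypothetical helper, the context-manager form of scandir assumes Python 3.6 or later, and the function raises FileNotFoundError if the directory itself is missing:

```python
import os


def exists_in_directory(directory, name):
    """Return True if `directory` contains an entry called `name`.

    Illustration only: this reads the directory listing instead of
    asking about a single path, so it does more work than
    os.path.exists().
    """
    with os.scandir(directory) as entries:
        return any(entry.name == name for entry in entries)
```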

  4. [Python 3]: os.access(path, mode, *, dir_fd=None, effective_ids=False, follow_symlinks=True) whose behavior is close to os.path.exists (actually it's wider, mainly because of the 2nd argument)

    • user permissions might restrict the file "visibility" as the doc states:

      ...test if the invoking user has the specified access to path. mode should be F_OK to test the existence of path...

    os.access("/tmp", os.F_OK)

    Since I also work in C, I use this method as well because under the hood, it calls native APIs (again, via "${PYTHON_SRC_DIR}/Modules/posixmodule.c"), but it also opens a gate for possible user errors, and it's not as Pythonic as other variants. So, as @AaronHall rightly pointed out, don't use it unless you know what you're doing:

    • Nix: [man7]: ACCESS(2) (!!! pay attention to the note about the security hole its usage might introduce !!!)
    • Win: [MS.Docs]: GetFileAttributesW function

    Note: calling native APIs is also possible via [Python 3]: ctypes - A foreign function library for Python, but in most cases it's more complicated.

    (Win specific): Since vcruntime* (msvcr*) .dll exports a [MS.Docs]: _access, _waccess function family as well, here's an example:

    Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os, ctypes
    >>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe", os.F_OK)
    0
    >>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe.notexist", os.F_OK)
    -1
    

    Notes:

    • Although it's not a good practice, I'm using os.F_OK in the call, but that's just for clarity (its value is 0)
    • I'm using _waccess so that the same code works on Python3 and Python2 (in spite of unicode related differences between them)
    • Although this targets a very specific area, it was not mentioned in any of the previous answers


    The Linux (Ubuntu 16 x64) counterpart as well:

    Python 3.5.2 (default, Nov 17 2016, 17:05:23)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os, ctypes
    >>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp", os.F_OK)
    0
    >>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp.notexist", os.F_OK)
    -1
    

    Notes:

    • Instead of hardcoding libc's path ("/lib/x86_64-linux-gnu/libc.so.6"), which may (and most likely will) vary across systems, None (or the empty string) can be passed to the CDLL constructor (ctypes.CDLL(None).access(b"/tmp", os.F_OK)). According to [man7]: DLOPEN(3):

      If filename is NULL, then the returned handle is for the main program. When given to dlsym(), this handle causes a search for a symbol in the main program, followed by all shared objects loaded at program startup, and then all shared objects loaded by dlopen() with the flag RTLD_GLOBAL.

      • Main (current) program (python) is linked against libc, so its symbols (including access) will be loaded
      • This has to be handled with care, since functions like main, Py_Main and (all the) others are available; calling them could have disastrous effects (on the current program)
      • This does not apply to Win (but that's not such a big deal, since msvcrt.dll is located in "%SystemRoot%\System32", which is in %PATH% by default). I wanted to take things further and replicate this behavior on Win (and submit a patch), but as it turns out, [MS.Docs]: GetProcAddress function only "sees" exported symbols, so unless someone declares the functions in the main executable as __declspec(dllexport) (why on Earth would a regular person do that?), the main program is loadable but pretty much unusable.
  5. Install some third-party module with filesystem capabilities

    Most likely, will rely on one of the ways above (maybe with slight customizations).
    One example would be (again, Win specific) [GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions, which is a Python wrapper over WINAPIs.

    But, since this is more like a workaround, I'm stopping here.

  6. Another (lame) workaround (gainarie) is what I like to call the sysadmin approach: use Python as a wrapper to execute shell commands

    • Win:

      (py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe\" > nul 2>&1'))"
      0
      
      (py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe.notexist\" > nul 2>&1'))"
      1
      
    • Nix (Linux (Ubuntu)):

      [cfati@cfati-ubtu16x64-0:~]> python3 -c "import os; print(os.system('ls \"/tmp\" > /dev/null 2>&1'))"
      0
      [cfati@cfati-ubtu16x64-0:~]> python3 -c "import os; print(os.system('ls \"/tmp.notexist\" > /dev/null 2>&1'))"
      512
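
A note on the 512 above: on Unix, os.system returns the wait status rather than the plain exit code, so 512 corresponds to an exit code of 2 (512 >> 8). A sketch of the same idea with subprocess.run, which reports the exit code directly, could look like this. It assumes a POSIX system with ls on the PATH, and exists_via_ls is a made-up name:

```python
import subprocess


def exists_via_ls(path):
    """Return True if `ls -d -- path` succeeds (POSIX systems only)."""
    completed = subprocess.run(
        ["ls", "-d", "--", path],  # -d: do not list directory contents
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode == 0
```

Passing the arguments as a list avoids the shell quoting needed in the os.system one-liners.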
      

Bottom line:

  • Do use try / except / else / finally blocks, because they can prevent you from running into a series of nasty problems. A counter-example that I can think of is performance: such blocks are costly, so try not to place them in code that is supposed to run hundreds of thousands of times per second (but since, in most cases, it involves disk access, that won't be the case).

Final note(s):

  • I will try to keep this answer up to date. Any suggestions are welcome; I will incorporate anything useful that comes up.
CristiFati answered Sep 21 '22 15:09


Python 3.4+ has an object-oriented path module: pathlib. Using this new module, you can check whether a file exists like this:

import pathlib
p = pathlib.Path('path/to/file')
if p.is_file():  # or p.is_dir() to see if it is a directory
    pass  # do stuff

You can (and usually should) still use a try/except block when opening files:

try:
    with p.open() as f:
        pass  # do awesome stuff
except OSError:
    print('Well darn.')
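
If the reason for the failure matters, the more specific OSError subclasses can be caught before the general one. The following is only a sketch: try_read is a hypothetical helper and UTF-8 text is assumed:

```python
from pathlib import Path


def try_read(path):
    """Return (text, None) on success or (None, reason) on failure."""
    try:
        with Path(path).open(encoding="utf-8") as f:
            return f.read(), None
    except FileNotFoundError:
        return None, "missing"
    except PermissionError:
        return None, "permission denied"
    except OSError as exc:
        # Any other I/O failure, e.g. the path is a directory.
        return None, "other error: {}".format(exc.strerror)
```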

The pathlib module has lots of cool stuff in it: convenient globbing, checking a file's owner, easier path joining, etc. It's worth checking out. If you're on an older Python (version 2.6 or later), you can still install pathlib with pip:

# installs pathlib2 on older Python versions
# the original third-party module, pathlib, is no longer maintained.
pip install pathlib2

Then import it as follows:

# Older Python versions
import pathlib2 as pathlib
Cody Piersall answered Sep 17 '22 15:09


This is the simplest way to check if a file exists. Keep in mind that just because the file existed when you checked doesn't guarantee that it will be there when you need to open it.

import os
fname = "foo.txt"
if os.path.isfile(fname):
    print("file does exist at this time")
else:
    print("no such file exists at this time")
un33k answered Sep 21 '22 15:09