 

Is this the approved way to access data adjacent to/packaged with a Python script?

I have a Python script that needs some data that's stored in a file that will always be in the same location as the script. I have a setup.py for the script, and I want to make sure it's pip installable in a wide variety of environments, and can be turned into a standalone executable if necessary.

Currently the script runs with Python 2.7 and Python 3.3 or higher (though I don't have a test environment for 3.3 so I can't be sure about that).

I came up with this method to get the data. This script isn't part of a module directory with __init__.py or anything; it's just a standalone file that works when run with python directly, but it also has an entry point defined in the setup.py file. It's all one file. Is this the correct way?

def fetch_wordlist():
    wordlist = 'wordlist.txt'
    try:
        import importlib.resources as res
        return res.read_binary(__file__, wordlist)
    except ImportError:
        pass
    try:
        import pkg_resources as resources
        req = resources.Requirement.parse('makepw')
        wordlist = resources.resource_filename(req, wordlist)
    except ImportError:
        import os.path
        wordlist = os.path.join(os.path.dirname(__file__), wordlist)
    with open(wordlist, 'rb') as f:
        return f.read()

This seems ridiculously complex. Also, it seems to rely on the package management system in ways I'm uncomfortable with. The script no longer works unless it's been pip-installed, and that also doesn't seem desirable.

Asked by Omnifarious, Apr 07 '19 23:04



1 Answer

Resources living on the filesystem

The standard way to read a file adjacent to your python script would be:

a) If you've got python>=3.4 I'd suggest you use the pathlib module, like this:

from pathlib import Path


def fetch_wordlist(filename="wordlist.txt"):
    return (Path(__file__).parent / filename).read_text()


if __name__ == '__main__':
    print(fetch_wordlist())

b) And if you're still using a python version <3.4 or you still want to use the good old os.path module you should do something like this:

import os


def fetch_wordlist(filename="wordlist.txt"):
    with open(os.path.join(os.path.dirname(__file__), filename)) as f:
        return f.read()


if __name__ == '__main__':
    print(fetch_wordlist())

Also, I'd suggest you catch exceptions in the outer callers. The approaches above are the standard way to read files in Python, so you don't really need to wrap them in a function like fetch_wordlist; put another way, reading a file in Python is already an "atomic", single-call operation.
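As a rough sketch of that advice, the caller below reads the file directly and handles the failure itself. The function name main, the wordlist_path parameter and the messages are hypothetical, chosen only for illustration.

```python
import sys
from pathlib import Path


def main(wordlist_path):
    # Read the file directly; the error is handled here, in the caller.
    try:
        words = Path(wordlist_path).read_text().split()
    except OSError as exc:  # covers FileNotFoundError, PermissionError, ...
        print("cannot read word list: {}".format(exc), file=sys.stderr)
        return 1
    print("{} words loaded".format(len(words)))
    return 0
```

Returning an exit status rather than raising is just one possible design; the point is only that the decision about what to do on failure sits with the caller.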

Now, your program may have been frozen with a freezer such as cx_Freeze, PyInstaller or similar. In that case you need to detect it; here's a simple way to check:

a) using os.path:

import os
import sys

# sys.frozen is set by freezers such as cx_Freeze and PyInstaller
if getattr(sys, 'frozen', False):
    app_path = os.path.dirname(sys.executable)
elif __file__:
    app_path = os.path.dirname(__file__)

b) using pathlib:

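import sys
from pathlib import Path

# sys.frozen is set by freezers such as cx_Freeze and PyInstaller
if getattr(sys, 'frozen', False):
    app_path = Path(sys.executable).parent
elif __file__:
    app_path = Path(__file__).parent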

Resources living inside a zip file

The solutions above work if the code lives on the file system, but they won't work if the package lives inside a zip file. When that happens you could use importlib.resources (new in version 3.7) or pkg_resources, as you've already shown in the question (perhaps wrapped in some helpers), or you could use a nice 3rd-party library called importlib_resources, which should work with both old and modern Python versions:

  • pypi: https://pypi.org/project/importlib_resources/
  • documentation: https://importlib-resources.readthedocs.io/en/latest/

Specifically for your particular problem, I'd suggest you take a look at https://importlib-resources.readthedocs.io/en/latest/using.html#file-system-or-zip-file.

If you want to know what that library does behind the curtains because you're not willing to install a 3rd-party library, you can find the code for py2 here and py3 here, and extract the relevant bits for your particular problem.
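For the zip-file case, a minimal sketch using the files() API could look like this. It assumes the wordlist is shipped inside an importable package rather than next to a single-file script as in the question; the package name "makepw" is borrowed from the question's requirement string and is only an assumption here. files() is available in importlib.resources from Python 3.9 and in the importlib_resources backport.

```python
try:
    from importlib.resources import files  # standard library, Python 3.9+
except ImportError:
    from importlib_resources import files  # third-party backport


def fetch_wordlist(package="makepw", filename="wordlist.txt"):
    # Works whether the package is a directory on disk or lives in a zip file.
    return files(package).joinpath(filename).read_bytes()
```

Because the lookup goes through the import system rather than __file__, the same call should work for a pip-installed package and for one imported from a zip archive.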

Answered by BPL, Oct 08 '22 11:10
