I'm cleaning up packaging for a python project I didn't create. Currently, it does some explicitly unsupported magic to get its dependencies from a requirements.txt file. The file looks like it may have been generated by pip freeze; there are fixed versions for everything, and many apparently-extraneous packages listed. I am pretty sure some of these aren't real dependencies, but I don't know which ones.
Given just the source tree, how would I figure out, from scratch, what dependencies ought to be included in install_requires?
As a first stab, I'm grepping for non-stdlib import statements. I hope there's a better way.
install_requires is a setuptools setup.py keyword that should be used to specify what a project minimally needs to run correctly. When the project is installed by pip, this is the specification that is used to install its dependencies.
The short answer is that requirements. txt is for listing package requirements only. setup.py on the other hand is more like an installation script. If you don't plan on installing the python code, typically you would only need requirements.
We use setup.py and pip to manage development dependencies for our packages, though you need a newer version of pip (we're using 1.4. 1 currently). That command will install everything, gevent , flask , Fabric , and nose . Pip internally uses setuptools for their build system pip.pypa.io/en/latest/reference/pip/….
Typically this file "requirement. txt" is stored (or resides) in the root directory of your projects.
There's no way to do this perfectly, because Python is too flexible.
But it's usually possible to do it well enough.
You can use start with the stdlib's modulefinder
.
Beyond that, a number of projects—mostly projects designed for building binary executables, installers, etc. for Python apps—have come up with heuristics that go even farther.
These usually work. And, when they fail, you usually immediately spot it on your first test. Even if they aren't sufficient, they're at the very least good sample code. Here are a few off the top of my head:
In case you're wondering why it's impossible:
Even forgetting about the program of dependencies in C extension modules, Python is just too flexible to catch all the ways you could import a module via static analysis.
Sure, you'd have to be dealing with code written by someone crazy enough to use explicitly unsupported magic for no good reason… but if you were, there's nothing to stop someone from writing this instead of import lxml
:1
with open('picture.jpg', encoding='cp500') as f:
getattr(sys.modules[11], codecs.encode('vzcbeg_zbqhyr', 'rot13'))(f.read().strip())
In reality, things aren't going to be that bad. But they could easily be too bad for rg import
to be sufficient.
You could try to detect all the imports dynamically with a simple import hook, but that's only guaranteed to work if you can exercise 100% of the code paths.
1. Of course this only works if importlib
was the 12th module loaded, and if picture.jpg
is not a JPEG image but a textfile whose contents are, in EBCDIC, lxml\n
I've had great results with pipreqs
that will automatically generate a requirements.txt file from your source code.
pipreqs /home/project/location
Successfully saved requirements file in /home/project/location/requirements.txt
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With