
How do I determine which requirements are actually needed in setup.py?

Tags: python, pip

I'm cleaning up packaging for a Python project I didn't create. Currently, it does some explicitly unsupported magic to get its dependencies from a requirements.txt file. The file looks like it may have been generated by pip freeze: there are pinned versions for everything, and many apparently extraneous packages listed. I am pretty sure some of these aren't real dependencies, but I don't know which ones.

Given just the source tree, how would I figure out, from scratch, what dependencies ought to be included in install_requires?

As a first stab, I'm grepping for non-stdlib import statements. I hope there's a better way.

Andrew asked Jun 21 '18 21:06


2 Answers

There's no way to do this perfectly, because Python is too flexible.

But it's usually possible to do it well enough.

You can start with the stdlib's modulefinder.
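As a rough sketch of what starting with modulefinder could look like: the helper below runs ModuleFinder over one entry-point script and splits the result into imports that resolved and imports that did not. The function name scan_script is made up for illustration, and the stdlib filter assumes Python 3.10+ (sys.stdlib_module_names); treat it as a starting point, not a complete tool.

```python
import sys
from modulefinder import ModuleFinder


def scan_script(script_path):
    """Return (third_party, unresolved) top-level names reachable from script_path."""
    finder = ModuleFinder()
    finder.run_script(script_path)  # analyses imports without executing the script

    # sys.stdlib_module_names exists on Python 3.10+; on older versions this
    # fallback filters nothing, so stdlib modules show up in the results too.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())

    def top_level(names):
        roots = {name.split(".")[0] for name in names}
        return sorted(roots - set(stdlib) - {"__main__"})

    # finder.modules: imports that were resolved (this includes transitive ones).
    # finder.badmodules: imports that could not be resolved in this environment.
    return top_level(finder.modules), top_level(finder.badmodules)
```

Because modulefinder follows imports recursively, the first list also contains dependencies of dependencies, so it is an upper bound on install_requires rather than the answer. The second list is worth reading too: a name there may be a real dependency that simply isn't installed in the environment you ran the scan in.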

Beyond that, a number of projects—mostly projects designed for building binary executables, installers, etc. for Python apps—have come up with heuristics that go even farther.

These usually work. And, when they fail, you usually immediately spot it on your first test. Even if they aren't sufficient, they're at the very least good sample code. Here are a few off the top of my head:

  • cx_Freeze
  • py2exe
  • py2app
  • PyInstaller

In case you're wondering why it's impossible:

Even forgetting about the problem of dependencies in C extension modules, Python is just too flexible for static analysis to catch all the ways you could import a module.

Sure, you'd have to be dealing with code written by someone crazy enough to use explicitly unsupported magic for no good reason… but if you were, there's nothing to stop someone from writing this instead of import lxml (see footnote 1):

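import codecs, sys  # importlib itself is deliberately never named
with open('picture.jpg', encoding='cp500') as f:
    # rot13('vzcbeg_zbqhyr') == 'import_module'; sys.modules is a dict, so index its values
    getattr(list(sys.modules.values())[11], codecs.encode('vzcbeg_zbqhyr', 'rot13'))(f.read().strip())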

In reality, things aren't going to be that bad. But they could easily be bad enough that a plain text search for import statements (e.g. rg import) isn't sufficient.
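To make the limits of static scanning concrete, here is a small sketch that parses source with the stdlib ast module instead of searching it as text. The function name and the sample source are invented for illustration. It handles multi-name and dotted imports cleanly, but it still cannot see a module whose name only exists as a string at runtime.

```python
import ast


def imported_top_level_names(source):
    """Collect top-level package names from absolute import statements."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level > 0 means a relative import, i.e. the project's own code
            names.add(node.module.split(".")[0])
    return names


# Hypothetical source text; it is only parsed, never executed.
sample = '''
import os, lxml.etree
from requests import get
from . import sibling
import importlib
yaml = importlib.import_module("yaml")
'''
print(sorted(imported_top_level_names(sample)))
```

The scan reports importlib, lxml, os and requests for this sample, and misses yaml entirely, which is exactly the kind of gap described above.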

You could try to detect all the imports dynamically with a simple import hook, but that's only guaranteed to work if you can exercise 100% of the code paths.
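A minimal sketch of such an import hook, assuming Python 3: a finder placed at the front of sys.meta_path that records every requested name and then declines, so the normal import machinery does the real work. ImportRecorder and record_imports are names invented here, and the missing package in the demo is a placeholder.

```python
import sys
from contextlib import contextmanager
from importlib.abc import MetaPathFinder


class ImportRecorder(MetaPathFinder):
    """Meta path finder that records requested names and resolves nothing itself."""

    def __init__(self):
        self.seen = set()

    def find_spec(self, fullname, path=None, target=None):
        self.seen.add(fullname.split(".")[0])
        return None  # defer to the normal finders further down sys.meta_path


@contextmanager
def record_imports():
    recorder = ImportRecorder()
    sys.meta_path.insert(0, recorder)
    try:
        yield recorder.seen
    finally:
        sys.meta_path.remove(recorder)


with record_imports() as seen:
    try:
        import hypothetical_missing_dependency  # placeholder, not a real package
    except ImportError:
        pass
print(sorted(seen))
```

Modules already present in sys.modules never reach sys.meta_path, so the hook has to be installed before the application imports anything, for example around the test suite's entry point. It also records stdlib and transitive imports, so the result still needs filtering.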


1. Of course, this only works if importlib was the 12th module loaded, and if picture.jpg is not a JPEG image but a text file whose contents are, in EBCDIC, lxml\n.

abarnert answered Nov 02 '22 08:11


I've had great results with pipreqs, which automatically generates a requirements.txt file from your source code.

pipreqs /home/project/location
Successfully saved requirements file in /home/project/location/requirements.txt

Ereli answered Nov 02 '22 09:11