How do I determine which requirements are actually needed in setup.py?

I'm cleaning up packaging for a Python project I didn't create. Currently, it does some explicitly unsupported magic to get its dependencies from a requirements.txt file. The file looks like it was generated by pip freeze: every package is pinned to a fixed version, and many apparently extraneous packages are listed. I am pretty sure some of these aren't real dependencies, but I don't know which ones.

Given just the source tree, how would I figure out, from scratch, what dependencies ought to be included in install_requires?

As a first stab, I'm grepping for non-stdlib import statements. I hope there's a better way.

asked Jun 21 '18 21:06

Andrew

2 Answers

There's no way to do this perfectly, because Python is too flexible.

But it's usually possible to do it well enough.

You can start with the stdlib's modulefinder.

Beyond that, a number of projects—mostly projects designed for building binary executables, installers, etc. for Python apps—have come up with heuristics that go even farther.

These usually work. And, when they fail, you usually immediately spot it on your first test. Even if they aren't sufficient, they're at the very least good sample code. Here are a few off the top of my head:

cx_Freeze
py2exe
py2app
PyInstaller

In case you're wondering why it's impossible:

Even forgetting about the problem of dependencies in C extension modules, Python is just too flexible for static analysis to catch all the ways you could import a module.

Sure, you'd have to be dealing with code written by someone crazy enough to use explicitly unsupported magic for no good reason… but if you were, there's nothing to stop someone from writing this instead of import lxml:¹

import codecs, sys

with open('picture.jpg', encoding='cp500') as f:
    getattr(list(sys.modules.values())[11], codecs.encode('vzcbeg_zbqhyr', 'rot13'))(f.read().strip())

In reality, things aren't going to be that bad. But they could easily be too bad for rg import to be sufficient.

You could try to detect all the imports dynamically with a simple import hook, but that's only guaranteed to work if you can exercise 100% of the code paths.
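A minimal sketch of such a hook, assuming you can run the code under test inside a context manager (ImportRecorder and record_imports are names invented for this example): a finder placed at the front of sys.meta_path gets asked about every module that isn't already cached in sys.modules, so it can note the name and then decline, letting the normal finders do the real work.

```python
import contextlib
import sys
from importlib.abc import MetaPathFinder


class ImportRecorder(MetaPathFinder):
    """Records the top-level name of every import that reaches the finders."""

    def __init__(self):
        self.seen = set()

    def find_spec(self, fullname, path=None, target=None):
        self.seen.add(fullname.partition(".")[0])
        return None  # decline, so the regular finders handle the import


@contextlib.contextmanager
def record_imports():
    """Install the recorder for the duration of the with-block."""
    recorder = ImportRecorder()
    sys.meta_path.insert(0, recorder)
    try:
        yield recorder.seen
    finally:
        sys.meta_path.remove(recorder)
```

Modules that were imported before the hook was installed never reach it, so install it as early as possible, for instance at the top of your test runner.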

¹ Of course, this only works if importlib was the 12th module loaded, and if picture.jpg is not a JPEG image but a text file whose contents are, in EBCDIC, lxml\n.

answered Nov 02 '22 08:11

abarnert

I've had great results with pipreqs, which automatically generates a requirements.txt file from the imports in your source code.

pipreqs /home/project/location
Successfully saved requirements file in /home/project/location/requirements.txt

answered Nov 02 '22 09:11

Ereli
