pip's requirements.txt best practice

I am trying to generate requirements.txt for someone to replicate my environment. As you may know, the standard way is

pip freeze > requirements.txt

I noticed that this lists all installed packages, including the dependencies of the packages I installed directly, which makes the list unnecessarily large. I then browsed around and came across pip-chill, which lists only the top-level packages, i.e. those that no other installed package depends on.

Now, from my understanding, when someone replicates the environment with pip install -r requirements.txt, pip will automatically install the dependencies of the listed packages.

If this is true, it should be safe to use pip-chill instead of pip freeze to generate requirements.txt. My question is: is there any other risk in omitting the dependencies of installed packages with pip-chill that I am missing here?
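
For example, in a fresh virtual environment where I installed only requests directly (the versions here are just illustrative), the two tools differ like this:

$ pip freeze
certifi==2020.12.5
chardet==4.0.0
idna==2.10
requests==2.25.1
urllib3==1.26.2

$ pip-chill
requests==2.25.1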

asked May 01 '20 by Darren Christopher

2 Answers

I believe using pip-compile from pip-tools is a good practice when constructing your requirements.txt. This will make sure that builds are predictable and deterministic.

The pip-compile command lets you compile a requirements.txt file from your dependencies, specified in either setup.py or requirements.in

Here are my recommended steps for constructing your requirements.txt (if using requirements.in):

  1. Create a virtual env and install pip-tools there:

$ source /path/to/venv/bin/activate
(venv)$ python -m pip install pip-tools

  2. Specify your application/project's direct dependencies in your requirements.in file:

# requirements.in
requests
boto3==1.16.51

  3. Use pip-compile to generate requirements.txt:

$ pip-compile --output-file=- > requirements.txt

Your requirements.txt file will then contain:

#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file=-
#
boto3==1.16.51
    # via -r requirements.in
botocore==1.19.51
    # via
    #   boto3
    #   s3transfer
certifi==2020.12.5
    # via requests
chardet==4.0.0
    # via requests
idna==2.10
    # via requests
jmespath==0.10.0
    # via
    #   boto3
    #   botocore
python-dateutil==2.8.1
    # via botocore
requests==2.25.1
    # via -r requirements.in
s3transfer==0.3.3
    # via boto3
six==1.15.0
    # via python-dateutil
urllib3==1.26.2
    # via
    #   botocore
    #   requests

Your application should always work with the dependencies installed from this generated requirements.txt. If you have to update a dependency, you just need to update the requirements.in file and re-run pip-compile. I believe this is a much better approach than doing pip freeze > requirements.txt, which I see some people do.
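
As a sketch of the update workflow (using the requirements.in from above; --upgrade-package and pip-sync are standard pip-tools commands, and boto3 is just the example package):

$ pip-compile --upgrade-package boto3 requirements.in   # bump one pin, keep the others as-is
$ pip-sync requirements.txt                             # make the venv match the compiled file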

I guess the main advantage of using this is that you can keep track of your project's actual direct dependencies in a separate requirements.in file.

I find this very similar to how node modules/dependencies are managed in a Node app project, with package.json (requirements.in) and package-lock.json (requirements.txt).

answered Sep 23 '22 by alegria


From my point of view, requirements.txt files should list all dependencies: the direct dependencies as well as their own (indirect, transitive) dependencies. If for some reason only direct dependencies are wanted, there are tools that can help with that. From a cursory look, pip-chill seems inadequate, since it does not actually look at the code to figure out which packages are directly imported. Better to look at projects such as pipreqs or pigar; they seem more accurate at figuring out what the actual direct dependencies are (based on the imports in your code). A minimal pipreqs run is sketched below.
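
For instance, a minimal pipreqs session might look like this (the project path is a placeholder):

$ python -m pip install pipreqs
$ pipreqs /path/to/project    # scans the code for imports, writes requirements.txt in that directory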

But at the end of the day, you should curate such lists by hand. When writing the code you choose carefully which packages you import; with the same care you should curate a list of the projects (and their versions) containing those packages. Tools can help, but the developer knows better. A hand-curated list might look like the sketch below.
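
As a sketch, with illustrative packages and pins taken from the example earlier in this thread, a hand-curated requirements.txt can record why each entry is there:

# requirements.txt -- curated by hand
requests==2.25.1    # HTTP client used by the API wrapper
boto3==1.16.51      # S3 uploads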

answered Sep 24 '22 by sinoroc