How to download a distribution, possibly an sdist, without potentially executing a setup.py file (which may contain malicious code)?
I don't want to fetch the dependencies recursively; I only want to download one file for the specified distribution. An attempt that doesn't work:
pip download --no-deps mydist
Here is a reproducible example demonstrating that setup.py is still executed in the above case:
$ docker run --rm -it python:3.8-alpine sh
/ # pip --version
pip 20.0.2 from /usr/local/lib/python3.8/site-packages/pip (python 3.8)
/ # pip download --no-deps suds
Collecting suds
Downloading suds-0.4.tar.gz (104 kB)
|████████████████████████████████| 104 kB 13.4 MB/s
ERROR: Command errored out with exit status 1:
command: /usr/local/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-download-yqfdz35d/suds/setup.py'"'"'; __file__='"'"'/tmp/pip-download-yqfdz35d/suds/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-download-yqfdz35d/suds/pip-egg-info
cwd: /tmp/pip-download-yqfdz35d/suds/
Complete output (7 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-download-yqfdz35d/suds/setup.py", line 20, in <module>
import suds
File "/tmp/pip-download-yqfdz35d/suds/suds/__init__.py", line 154, in <module>
import client
ModuleNotFoundError: No module named 'client'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
I cannot use the --only-binary option, because I don't want to exclude source distributions. I just want to avoid executing their source code.
I've been digging into pip's code, and sadly it is pretty convoluted. It seems that there is currently no way to do this, and according to the link provided by @doctaphred there are no plans to make progress in that direction.
The next step depends on your situation. If, for example, you need this "package downloader" for production, I'd suggest writing your own PyPI client. It would be very simple to write, and you could make it much faster and simpler than pip by optimizing it for your needs. You could try to reuse some of the existing code in pip, but after seeing that code I think that would probably be pretty hard.
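As a rough illustration of the "write your own PyPI client" idea, here is a minimal sketch that uses PyPI's JSON API (https://pypi.org/pypi/<name>/json) to download exactly one file of the latest release without unpacking or executing anything. The function names pick_file and download_one and the prefer_wheel parameter are made up for this example. The sketch does no wheel platform-compatibility checks, so treat it as a starting point rather than a finished tool.

```python
import hashlib
import json
import os
import urllib.request

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def pick_file(files, prefer_wheel=True):
    """Pick one entry from the "urls" list of a PyPI JSON response.

    Hypothetical helper: no wheel tag/platform matching is attempted.
    """
    wheels = [f for f in files if f["packagetype"] == "bdist_wheel"]
    sdists = [f for f in files if f["packagetype"] == "sdist"]
    ordered = wheels + sdists if prefer_wheel else sdists + wheels
    if not ordered:
        raise LookupError("no wheel or sdist published for this release")
    return ordered[0]


def download_one(name, dest_dir=".", prefer_wheel=True):
    """Download a single file for the latest release of `name`.

    The archive is only written to disk; it is never unpacked or run.
    """
    with urllib.request.urlopen(PYPI_JSON_URL.format(name=name)) as resp:
        meta = json.load(resp)
    chosen = pick_file(meta["urls"], prefer_wheel)
    with urllib.request.urlopen(chosen["url"]) as resp:
        data = resp.read()
    # Compare against the sha256 digest that PyPI reports for this file.
    if hashlib.sha256(data).hexdigest() != chosen["digests"]["sha256"]:
        raise ValueError("sha256 mismatch for " + chosen["filename"])
    # basename() guards against a filename containing path separators.
    path = os.path.join(dest_dir, os.path.basename(chosen["filename"]))
    with open(path, "wb") as fh:
        fh.write(data)
    return path
```

Calling download_one("suds") would then save the chosen archive into the current directory and return its path, assuming network access to pypi.org.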
Otherwise, I'd consider quicker, hackier methods to get the job done. The first solution that comes to mind is simply stopping pip whenever it tries to run the egg_info command. To do that, you can patch pip's code at runtime using various methods. My favorite is using a usercustomize file.
For example, create a patch file with the following content and place it in a directory of your choosing, e.g. /pypatches/pip_pure_download/usercustomize.py:
from pip._internal.req.req_install import InstallRequirement

print('Applying pure download patch!')

def override_run_egg_info(*args, **kwargs):
    # Abort pip as soon as it tries to run the egg_info command.
    raise KeyboardInterrupt  # Joke's on you, evil hackers! :P

InstallRequirement.run_egg_info = override_run_egg_info
Now, to apply the patch to a Python execution, just add the patch's directory to PYTHONPATH, for example:
PYTHONPATH=/pypatches/pip_pure_download:$PYTHONPATH pip download --no-deps suds
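Before relying on the patch, it may be worth checking that the interpreter actually imports usercustomize. Python's site module only does so when user site-packages are enabled, which is not the case with the -s flag, with PYTHONNOUSERSITE set, or typically inside a virtual environment. The sketch below starts a child interpreter with the patch directory prepended to PYTHONPATH, as in the command above, and reports whether the module was loaded. The helper name usercustomize_is_loaded is made up for this example.

```python
import os
import subprocess
import sys


def usercustomize_is_loaded(patch_dir):
    """Return True if a child interpreter imported usercustomize at startup."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    # Prepend the patch directory, mirroring the PYTHONPATH=... command above.
    env["PYTHONPATH"] = patch_dir + (os.pathsep + existing if existing else "")
    probe = "import sys; print('usercustomize' in sys.modules)"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env, capture_output=True, text=True, check=True,
    )
    # The patch file may print its own output first, so read the last line.
    lines = result.stdout.strip().splitlines()
    return bool(lines) and lines[-1] == "True"
```

If usercustomize_is_loaded("/pypatches/pip_pure_download") returns False, the patch is not being applied in that environment and pip would still run setup.py.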
This doesn't seem to be possible as of pip 19.3.1 :(
See https://github.com/pypa/pip/issues/1884