I'm perplexed about how best to use pip in the face of security concerns about malicious packages or install scripts. I'm not much of a security expert, so I may just be confused (bear with me), but it seems that there are 4, possibly overlapping, approaches:
(1) Use sudo pip for everything
This is how I do things now. I generally do not need virtualenvs and like the convenience of having all my packages work for all my tools. I also don't install a lot of experimental packages, sticking pretty much to the well-known and widely used ones (matplotlib, six, etc).
I gather this can be a risky approach, though, because the installation process has su privileges and could potentially do anything; however, it has the advantage of protecting the site-packages directory from subsequent mischief by anything (not just packages) running as non-su after an install.
This approach also can't be completely avoided, as some packages (pip itself) need it to bootstrap any Python installation.
(2) Create a pip user and give it ownership of site-packages
This would seem to have the advantage of restricting what pip can do: all it can do is install to site-packages. But I'm not sure about side effects, or if it would even work (when, for example, pip needs to put things in other locations). A more realistic variant of this is to set things up this way, and use pip as "pip-user" when it works, and as su when it doesn't.
(3) Give myself ownership of site-packages
I gather this is a very bad idea, but I'm not sure quite why. It would mean that any code I run would be able to tamper with site-packages; but it would also mean that malicious install scripts could only damage things I can damage myself anyway.
(4) "Use a virtualenv"
This suggestion comes up a lot, but I don't see how it helps. It seems no different from 3 to me since it creates a site-packages that I own.
Which, if any, of these approaches, or combinations of approaches, is best for ensuring that pip does not result in exposing my system? My concern is mostly with my system as a whole, and only secondarily with my Python installation in site-packages (which I can always rebuild if need be).
Part of the problem I have is that I don't know how to weigh the risks. An example approach that seems to make sense to my limited understanding is simply to do (1) for the most part, and use a virtualenv (4) for any package that I worry might damage my site-packages. Anything I've installed will still be able to damage anything I have access to, but that seems unavoidable, and at least things I don't have access to will be safe (except during the installation process itself). But I have trouble evaluating whether the protection this affords is worth the risk it creates.
You probably want to look at using a virtualenv. To quote the docs:
Virtualenv is a tool to create isolated Python environments. The basic problem being addressed is one of dependencies and versions, and indirectly permissions.
Virtualenv will create a folder with an isolated copy of Python, an isolated pip and an isolated site-packages. You're thinking that this is the same as option 3 because you're taking the advice you linked at face value and not reading into it:
If you give yourself write privilege to the system site-packages, you're risking that any program that runs under you (not necessarily python program) can inject malicious code into the system site-packages and obtain root privilege.
The problem is not with having access to site-packages (you have to have privileges for site-packages to be able to do anything). The problem is with having access to the system site-packages. A virtual environment's site-packages does not expose root privileges to malicious code in the same way as the one that your entire system is using.
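To see which situation you are in, a small check like the one below can help. It is only a sketch: it lists the site-packages directories the running interpreter reports and says whether the current user can write to them. The function name site_packages_write_access is made up for this example, and site.getsitepackages() may be missing inside environments created by very old virtualenv releases.

```python
import os
import site


def site_packages_write_access():
    """Return (path, writable) pairs for this interpreter's site-packages dirs."""
    # site.getsitepackages() lists the global (or virtualenv) site-packages paths.
    paths = list(site.getsitepackages())
    # os.access reports whether the *current* user may write to each directory.
    return [(p, os.access(p, os.W_OK)) for p in paths if os.path.isdir(p)]


if __name__ == "__main__":
    for path, writable in site_packages_write_access():
        state = "writable" if writable else "not writable"
        print(f"{path}: {state} by the current user")
```

If a system-wide directory shows up as writable for your ordinary account, that is the option 3 situation the quote warns about. Inside a virtual environment a writable result is expected and harmless to the rest of the system.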
However, I see nothing wrong with using sudo pip for well-known and familiar packages. At the end of the day, it's like installing any other program, even non-Python. If you go to its website and it looks honest and you trust it, there's no reason not to sudo.
Moreover, pip is pretty safe - it uses HTTPS for PyPI, and if you pass --allow-external it will download packages from third-party hosts, but will still keep checksums on PyPI and compare them. For third-party packages with no checksum you need to explicitly pass --allow-unverified, which is the only option considered unsafe.
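The checksum comparison is easy to picture with a few lines of Python. This is a sketch of the general idea, not pip's actual implementation: it hashes a downloaded file with SHA-256 and compares the result to a digest you already trust. The helper names sha256_of and matches_expected, and the file name in the usage line, are invented for this example.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's digest equals the trusted digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Usage would look like matches_expected("some_package-1.0.tar.gz", digest_from_a_trusted_source), where both arguments are placeholders. Newer pip releases also have a hash-checking mode (pip install --require-hashes -r requirements.txt) that does this kind of comparison for you.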
As a personal note, I can add that I use sudo pip most of the time, but as a web developer virtualenv is kind of a day-to-day thing, and I can recommend using it as well (especially if you see anything sketchy but you still want to try it out).
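For trying out something sketchy, a throwaway environment can be created without sudo. The sketch below uses the standard library's venv module rather than the third-party virtualenv tool quoted above; the two are similar in spirit but not identical. The function name create_isolated_env and the directory name scratch-env are assumptions made for this example.

```python
import os
import tempfile
import venv


def create_isolated_env(env_dir):
    """Create a virtual environment at env_dir and return its config file path."""
    # with_pip=False keeps the sketch quick; pass with_pip=True to have pip
    # bootstrapped into the new environment as well.
    venv.create(env_dir, with_pip=False)
    return os.path.join(env_dir, "pyvenv.cfg")


if __name__ == "__main__":
    env_dir = os.path.join(tempfile.mkdtemp(), "scratch-env")
    cfg = create_isolated_env(env_dir)
    print("created", env_dir, "config present:", os.path.isfile(cfg))
```

Everything lives under a directory you own, so nothing in the system site-packages is touched, and the whole environment can be deleted when you are done. Note that code installed there still runs with your own user's permissions.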