I frequently have to re-create virtual environments from a requirements.txt, and I am already using $PIP_DOWNLOAD_CACHE. It still takes a lot of time, and I noticed the following:
Pip spends a lot of time between the following two lines:
Downloading/unpacking SomePackage==1.4 (from -r requirements.txt (line 2))
Using download cache from $HOME/.pip_download_cache/cached_package.tar.gz
It takes about 20 seconds on average for pip to decide that it is going to use the cached package; after that, the install is fast. That adds up when you have to install dozens of packages (enough, in fact, to make me write this question).
What is going on in the background? Are there some sort of integrity checks against the online package?
Is there a way to speed this up?
edit: Looking at:
time pip install -v Django==1.4
I get:
real 1m16.120s
user 0m4.312s
sys 0m1.280s
The full output is at http://pastebin.com/e4Q2B5BA. It looks like pip spends its time looking for a valid download link even though it already has a valid cached copy of http://pypi.python.org/packages/source/D/Django/Django-1.4.tar.gz.
Is there a way to look for the cache first and stop there if versions match?
After spending some time studying the pip internals and profiling some package installations, I came to the conclusion that even with a download cache, pip does the following for each package:
Once pip has a download URL, it checks the download cache folder (if one is configured) and, if a local file named after the URL is present, uses that file instead of downloading from the URL.
My guess is that we could save a lot of time by checking the cache up front, but I do not understand the pip code base well enough to start the required modifications. Of course, this would only apply to exact version requirements (==), because with other constraints, like >= or >, we still want to crawl the web looking for the latest version.
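To make the idea concrete, here is a rough sketch of what an up-front cache check could look like. It is not pip's code. It assumes that each cache entry is a file named after the URL-quoted download URL, and that source archives follow the usual Name-Version.tar.gz naming; the helper names parse_pinned and find_cached_archive are made up for illustration.

```python
import os
import re
from urllib.parse import unquote

# Archive extensions we are willing to recognise (an assumption, not pip's list).
ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip")

# Matches only exact pins such as "Django==1.4".
PINNED = re.compile(r"^\s*([A-Za-z0-9_.-]+)\s*==\s*([A-Za-z0-9_.!+-]+)\s*$")


def parse_pinned(requirement):
    """Return (name, version) for an exact '==' pin, otherwise None."""
    match = PINNED.match(requirement)
    return (match.group(1), match.group(2)) if match else None


def find_cached_archive(requirement, cache_dir):
    """Look for a cached archive matching an exact pin, without touching the network.

    Assumes cache file names are URL-quoted download URLs (hypothetical layout).
    Returns the path of the cached file, or None if the index must be crawled.
    """
    pinned = parse_pinned(requirement)
    if pinned is None or not os.path.isdir(cache_dir):
        # Constraints like >= or > still need the index to find the latest version.
        return None
    name, version = pinned
    wanted = {f"{name}-{version}{suffix}".lower() for suffix in ARCHIVE_SUFFIXES}
    for entry in sorted(os.listdir(cache_dir)):
        # Recover the original URL, then keep only its last path component.
        archive_name = unquote(entry).rsplit("/", 1)[-1].lower()
        if archive_name in wanted:
            return os.path.join(cache_dir, entry)
    return None
```

With a layout like that, find_cached_archive("Django==1.4", cache_dir) would return the cached file directly, while a requirement such as Django>=1.4 returns None and falls back to the normal crawl.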
Nevertheless, I was able to make a small pull request which will save us some time if merged.
One alternative may be to avoid rebuilding the virtualenv and to instead take a copy of a master virtual environment that you can update and copy as required.
virtualenvwrapper provides some support for doing this with the cpvirtualenv command.