I've noticed that when Python packages are installed with various package managers, they normally end up in /home/user/anaconda3/envs/env_name/ when using conda, and in /home/user/anaconda3/envs/env_name/lib/python3.6/site-packages/ when using pip inside a conda environment.
But conda caches all the recently downloaded packages too.
So, my question is: why doesn't conda install all packages in a central location and, when a package is installed into a specific environment, create a link to that location rather than installing a separate copy there?
I've noticed that environments grow quite big and that this method would probably be able to save a bit of space.
Conda already does this. However, because it leverages hardlinks, it is easy to overestimate the space really being used, especially if one only looks at the size of a single env at a time.
To illustrate the case, let's use du to inspect the real disk usage. First, if I count each environment directory individually, I get the uncorrected per-env usage

    $ for d in envs/*; do du -sh "$d"; done
    2.4G    envs/pymc36
    1.7G    envs/pymc3_27
    1.4G    envs/r-keras
    1.7G    envs/stan
    1.2G    envs/velocyto

which is what it might look like from a GUI.
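Instead, if I let a single du invocation count them together, each hardlinked file is counted only once, under the first directory where du encounters it (i.e., the totals are corrected for the hardlinks), and I get

    $ du -sh envs/*
    2.4G    envs/pymc36
    326M    envs/pymc3_27
    820M    envs/r-keras
    927M    envs/stan
    548M    envs/velocyto
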
One can see that a significant amount of space is already being saved here.
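The same double counting can be reproduced in miniature. The following is only a sketch under stated assumptions: the scratch directory, the pkgs/ and envs/ layout inside it, and the file name lib.bin are all made up for illustration, and the filesystem is assumed to support hardlinks.

```shell
# Hypothetical scratch layout that mimics a shared cache plus two environments.
demo=$(mktemp -d)
mkdir -p "$demo/pkgs" "$demo/envs/a" "$demo/envs/b"

# One 1 MiB "package file", stored once in the shared cache.
dd if=/dev/urandom of="$demo/pkgs/lib.bin" bs=1024 count=1024 2>/dev/null

# "Install" it into both environments as hardlinks: no data is copied.
ln "$demo/pkgs/lib.bin" "$demo/envs/a/lib.bin"
ln "$demo/pkgs/lib.bin" "$demo/envs/b/lib.bin"

# Measured in separate du calls, each env appears to own the whole file (KiB).
size_a=$(du -sk "$demo/envs/a" | cut -f1)
size_b=$(du -sk "$demo/envs/b" | cut -f1)

# Measured in a single du call, the shared data is counted only once.
size_total=$(du -sk "$demo" | cut -f1)

echo "a=${size_a}K b=${size_b}K everything=${size_total}K"

# Remove the scratch directory again.
rm -rf "$demo"
```

Summing the two per-env numbers gives roughly twice the space that is actually allocated, which is the overestimate described above.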
Most of the hardlinks go back to the pkgs directory, so if we include that as well:

    $ du -sh pkgs envs/*
    8.2G    pkgs
    400M    envs/pymc36
    116M    envs/pymc3_27
    92M     envs/r-keras
    62M     envs/stan
    162M    envs/velocyto

one can see that outside of the shared packages, the envs are fairly light. If you're concerned about the size of my pkgs, note that I have never run conda clean on this system, so my pkgs directory is full of tarballs and superseded packages, plus some infrastructure I keep in base (e.g., Jupyter, Git, etc.).
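To get a rough idea of how much of a given env is shared, one can count the files whose link count is greater than one. This is a minimal sketch: count_shared_files is a hypothetical helper, not part of conda, and a link count above one only shows that a file has another name somewhere on the same filesystem, not that the other name lives in pkgs.

```shell
# Hypothetical helper: report how many regular files under a directory
# have more than one hardlink (i.e., are shared with some other path).
count_shared_files() {
    env_dir=$1
    # -links +1 matches files whose link count is greater than 1.
    shared=$(find "$env_dir" -type f -links +1 | wc -l | tr -d ' ')
    total=$(find "$env_dir" -type f | wc -l | tr -d ' ')
    echo "$shared of $total files are hardlinked"
}
```

It could be called as `count_shared_files envs/stan` from the Anaconda installation directory, using one of the env names from the listings above.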