After searching and not finding, I must ask here:
How does conda env work under the hood, meaning, how does anaconda handle environments?
To clarify, I would like an answer or a reference to questions like:
What is kept in the envs/myenv folder?
What happens upon activate myenv?
What happens upon conda install ...?
Where can I find such information?
A conda environment is a directory that contains a specific collection of conda packages that you have installed. For example, you may have one environment with NumPy 1.7 and its dependencies, and another environment with NumPy 1.6 for legacy testing.
Environments created by Conda are located by default in /Users/.../anaconda3/envs/. You can change the default location, but this is not encouraged.
Conda is better at dependency management: pip may allow incompatible dependencies to be installed, depending on the order in which you install packages. Conda instead uses what it calls a “satisfiability solver”, which checks that all dependencies are met at all times.
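To make the "satisfiability" idea concrete, here is a toy sketch in Python. It is not conda's actual solver: it brute-forces every combination of versions and keeps the first one where all constraints hold. The package names and version constraints in INDEX are made up for illustration, and it assumes every listed package must be installed.

```python
from itertools import product

# Hypothetical package index: name -> version -> {dependency: allowed versions}
INDEX = {
    "app": {"1.0": {"numpy": {"1.6", "1.7"}, "plotlib": {"2.0"}}},
    "plotlib": {"2.0": {"numpy": {"1.7"}}},
    "numpy": {"1.6": {}, "1.7": {}},
}


def solve(index):
    """Return one version per package such that every constraint holds, or None."""
    names = sorted(index)
    # Try every combination of candidate versions (fine for a toy, far too slow for real use)
    for versions in product(*(sorted(index[name]) for name in names)):
        chosen = dict(zip(names, versions))
        satisfied = all(
            chosen[dep] in allowed
            for name, version in chosen.items()
            for dep, allowed in index[name][version].items()
        )
        if satisfied:
            return chosen
    return None


solution = solve(INDEX)
print(solution)
```

Here numpy 1.6 is rejected because the made-up plotlib 2.0 only accepts numpy 1.7, regardless of the order in which the packages are considered.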
Avoid installing packages into your base Conda environment: Conda has a default environment called base that includes a Python installation and some core system libraries and dependencies of Conda. It is a “best practice” to avoid installing additional packages into your base software environment.
Basically, conda environments replicate the structure of your system, meaning it will store /bin, /lib, /etc, /var, among other directories. This is more obvious for unix systems, but the same concept is true under windows (DLLs, libs, Scripts, ...).
More details in the official documentation.
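If you want to see this layout for yourself, a small Python sketch like the following lists the directories directly under an environment root. The helper name list_top_level_dirs is my own; sys.prefix is simply the root of whichever environment the running interpreter belongs to, so the output depends on your install.

```python
import sys
from pathlib import Path


def list_top_level_dirs(prefix):
    """Return the sorted names of the directories directly under an environment prefix."""
    return sorted(p.name for p in Path(prefix).iterdir() if p.is_dir())


# Root of the environment that the running Python interpreter belongs to
print(sys.prefix)
print(list_top_level_dirs(sys.prefix))
```

Run it with the Python of an activated environment and compare the result with what you see under your base install.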
The idea is that conda install PACKAGE will fetch a precompiled package from a channel (a conda packages repository), and install it under this system-like structure. Instead of relying on system dependencies, conda will install all dependencies of this package under the environment structure, using only conda packages.
Thus installing the same package at a given time point under different systems should result in reliably identical installs.
This is a way to standardize binaries, and it is only achieved by precompiling every package against given versions of libraries, which are shipped as dependencies of the conda environment. For instance, conda-forge and bioconda channels rely on cloud-based CI/CD pipelines to compile all packages on identical and completely clean system images.
Conda also stores metadata about these packages (version, build number, dependencies, license, ...) so it is able to solve pretty complex dependency trees and avoid package and library incompatibilities. This is the Solving... step you see each time you execute conda install.
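You can look at this metadata yourself. As far as I know, each environment has a conda-meta folder holding one JSON record per installed package; the field names read below (name, version, build, depends) are the ones I have seen in those files, so treat them as assumptions and check your own install. The helper installed_packages is just an illustrative name.

```python
import json
import os
from pathlib import Path


def installed_packages(prefix):
    """Read the per-package JSON records found under <prefix>/conda-meta."""
    records = []
    for path in sorted(Path(prefix, "conda-meta").glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumed field names; missing ones come back as None or an empty list
        records.append(
            (data.get("name"), data.get("version"), data.get("build"), data.get("depends", []))
        )
    return records


# CONDA_PREFIX is only set when an environment is active, so guard against its absence
prefix = os.environ.get("CONDA_PREFIX")
if prefix:
    for name, version, build, depends in installed_packages(prefix):
        print(name, version, build, depends)
```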
Then when you conda activate ENV, conda prepends the environment's bin directory, $CONDA_PREFIX/bin, to PATH, so that all executables installed in the environment will be found by the system first (and will take precedence over any system-wide install of the same executable).
You can imagine it like temporarily replacing the system executables with those from the environment.
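The effect of prepending to PATH can be simulated without conda at all. This Python sketch (for unix-like systems) creates two fake executables with the same name, one in a pretend system directory and one in a pretend environment, and shows which one gets resolved. The names mytool, make_fake_tool and activate are made up for the demo, and real activation does more than this (for example it sets CONDA_PREFIX and runs any activation scripts).

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def make_fake_tool(directory, name):
    """Create a small executable file called `name` inside `directory`."""
    directory.mkdir(parents=True, exist_ok=True)
    tool = directory / name
    tool.write_text("#!/bin/sh\n")
    tool.chmod(tool.stat().st_mode | stat.S_IXUSR)
    return tool


def activate(path_value, env_prefix):
    """Return a PATH string with the environment's bin directory prepended."""
    return os.pathsep.join([str(Path(env_prefix) / "bin"), path_value])


root = Path(tempfile.mkdtemp())
system_bin = root / "system" / "bin"
env_prefix = root / "envs" / "myenv"
make_fake_tool(system_bin, "mytool")
make_fake_tool(env_prefix / "bin", "mytool")

system_path = str(system_bin)
active_path = activate(system_path, env_prefix)

# Before "activation" the system copy is found, afterwards the environment copy wins
print(shutil.which("mytool", path=system_path))
print(shutil.which("mytool", path=active_path))
```

Nothing is replaced or deleted: the system copy is still there, it is just found later in the search order.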
This is a very basic explanation, not 100% accurate, and certainly not complete. If you want to learn more, go read the documentation, experiment with conda, and maybe have an in-depth look at how Conda-forge and Bioconda build packages, as everything is hosted on github.