 

Should the Conda (base) environment be kept up to date?

I'm happily using Conda via the Miniconda install to manage Python environments.

After install, I leave the base environment alone and create new environments for new projects. Then I run conda env update on these environments as needed. However, I'm not sure this is the right approach.

Should the base environment be updated with conda env update before creating new environments?

I think this would keep disk usage lower because, in my possibly incorrect understanding, Conda links packages to the base environment when creating new environments if the package and its dependencies match exactly.

Although... that doesn't make much sense as they could easily get out of sync. Maybe it just saves on bandwidth as matching packages can be copied instead of downloaded?

If every project has its own environment, does it matter whether the base environment is kept up to date?

GollyJer asked May 21 '19 06:05



1 Answer

Conda links every env's packages from the pkgs folder (the package cache), which is shared by all envs and is not associated with base in any special way. Whenever any env installs or upgrades packages, they go there first. There is no explicit effort to source from existing packages, but if the dependency solver happens to resolve to a cached package, Conda will use it. Currently, there is no mechanism for keeping packages synchronized across envs, so one would have to design a workflow to achieve it.
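To illustrate why linked packages do not multiply disk usage, here is a minimal sketch of hard linking in general. It is not Conda's own code, the file names are hypothetical stand-ins, and it assumes a filesystem that supports hard links (where linking is not possible, a tool would have to fall back to copying).

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, hard-link it, and report whether both names share storage."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-ins: one name in the shared cache, one inside an env.
        cached = os.path.join(tmp, "cached_file.txt")
        in_env = os.path.join(tmp, "env_file.txt")
        with open(cached, "w") as fh:
            fh.write("package contents")
        # A hard link adds a second name for the same data; nothing is copied.
        os.link(cached, in_env)
        a, b = os.stat(cached), os.stat(in_env)
        return a.st_ino == b.st_ino, a.st_nlink


same_inode, link_count = hard_link_demo()
print(same_inode, link_count)
```

When both names report the same inode, the data exists once on disk no matter how many envs reference it.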

Potential Workflow

One could, in theory, use Conda's env cloning to maximize package version synchronization. To this end, you could conceptually organize your envs into three categories:

  • base env: only used for core infrastructure, e.g., conda, jupyter, git, etc. You would update this freely whenever you want new command-line software or need to run conda update conda. It should have little to no overlap with other envs.
  • template env: centralizes common sets of packages, typically grouped by version restrictions. For example, one might have a py27-tmpl, py36-tmpl, and py37-tmpl for different versions of Python that you might require for different projects. Here you would install the greatest common subset of packages you require across projects. The main purpose of a template env would be to make a...
  • project env: associated with a specific development project, and derived initially as a clone of a template env. Most of the core software in these would come from the template, and then additional software should be installed here. Once you start one for a project, you keep it relatively fixed, in order to maintain development stability.
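The three categories above could be sketched as the following command builders. This is only an illustration under assumptions: the project name my-project and the package list are hypothetical, py37-tmpl is taken from the example above, and the functions only assemble argument lists without running anything.

```python
import shlex


def update_base_cmd():
    """Command to update conda itself in the base env."""
    return ["conda", "update", "--name", "base", "conda"]


def create_template_cmd(name, python_version, packages):
    """Command to create a template env with a fixed Python and common packages."""
    return ["conda", "create", "--name", name, f"python={python_version}", *packages]


def clone_cmd(project, template):
    """Command to derive a project env as a clone of a template env."""
    return ["conda", "create", "--name", project, "--clone", template]


# Hypothetical names and packages, for illustration only.
commands = [
    update_base_cmd(),
    create_template_cmd("py37-tmpl", "3.7", ["numpy", "pandas"]),
    clone_cmd("my-project", "py37-tmpl"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once you are happy with the printed commands.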

Such a structure would maximize the reuse of existing package versions. Starting with Conda v4.7, the dependency solver defaults to a first-stage solve with an implicit --freeze-installed (also spelled --no-update-deps) flag, which attempts to install the requested packages without changing existing packages. If staying synchronized with the template env is your goal, then you may want to always pass --freeze-installed when installing. One could also use package pinning, which explicitly prevents specified packages from upgrading away from the template. However, this could block installation of the latest versions of some other packages.
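A rough sketch of both options follows: building an install command that passes --freeze-installed, and writing pins to the conda-meta/pinned file inside an env. The env name, package names, and version specs are hypothetical, and the demo writes into a throwaway directory instead of a real env prefix.

```python
import tempfile
from pathlib import Path


def install_frozen_cmd(env, packages):
    """Command to install packages without changing already-installed ones."""
    return ["conda", "install", "--name", env, "--freeze-installed", *packages]


def write_pins(env_prefix, pins):
    """Write one 'name version-spec' line per pin to <env_prefix>/conda-meta/pinned."""
    pinned = Path(env_prefix) / "conda-meta" / "pinned"
    pinned.parent.mkdir(parents=True, exist_ok=True)
    pinned.write_text("".join(f"{name} {spec}\n" for name, spec in pins.items()))
    return pinned


# Demo against a temporary directory standing in for an env prefix.
with tempfile.TemporaryDirectory() as fake_prefix:
    pinned_file = write_pins(fake_prefix, {"python": "3.7.*", "numpy": "1.16.*"})
    pinned_text = pinned_file.read_text()

print(pinned_text)
print(install_frozen_cmd("my-project", ["scipy"]))
```

For a real env, env_prefix would be the env's directory as listed by conda info --envs.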

Unfortunately, you'd still run into a similar synchronization problem as you intuited: while you could update these template envs before making new clones, that won't update the ones previously derived from them. But for project envs, I think best practice would be not to manipulate them once you start working. If you're concerned with space, there's no substitute for getting your modular projects completed and then archiving and deleting project envs after use. That, and occasionally running conda clean.
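One possible way to script the clean-up is sketched below. It assumes that "archiving" means exporting the env specification to a YAML file with conda env export, which is my interpretation rather than something stated above; the env name and file name are hypothetical, and the function only builds the commands.

```python
import shlex


def archive_and_remove_cmds(env, export_path):
    """Commands to export an env's spec, delete the env, then clear the package cache."""
    return [
        ["conda", "env", "export", "--name", env, "--file", export_path],
        ["conda", "env", "remove", "--name", env],
        ["conda", "clean", "--all"],
    ]


# Hypothetical project env and archive file.
cleanup = archive_and_remove_cmds("my-project", "my-project.yml")
for cmd in cleanup:
    print(shlex.join(cmd))
```

Running conda clean last matters: cached packages that are no longer linked from any env can only be reclaimed after the env that used them is gone.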

merv answered Oct 13 '22 00:10