Should the Conda (base) environment be kept up to date?

I'm happily using Conda via the Miniconda install to manage Python environments.

After install, I leave the base environment alone and create new environments for new projects. Then I conda env update these environments as needed. However, I'm not sure this is the right approach.

Should the base environment be updated with conda env update before creating new environments?

I think this would keep disk usage lower because, if my possibly incorrect understanding is right, Conda links packages to the base environment when creating new environments, provided the package and its dependencies match exactly.

Although... that doesn't make much sense as they could easily get out of sync. Maybe it just saves on bandwidth as matching packages can be copied instead of downloaded?

If every project has its own environment, does it matter if the base environment is kept up to date?

823

asked May 21 '19 06:05

GollyJer

1 Answer

Conda links the packages in every env to the pkgs folder, a package cache that is shared by all envs and is not associated with base in any special way. Whenever any env installs or upgrades a package, the package goes into this cache first. There isn't any explicit effort to source from existing packages, but if the dependency solver happens to resolve to a cached package, Conda will use it. Currently, there is no mechanism for keeping packages synchronized across envs, so one would have to design a workflow to achieve it.
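
One rough way to see this sharing on disk is to count the files in an env that have more than one hard link. Conda typically hard-links files from the package cache when the filesystem allows it, and can fall back to soft links or copies otherwise, so treat the number as an indication only. A minimal sketch; the helper name count_shared_files and the example path are hypothetical:

```shell
# count_shared_files DIR
# Print how many regular files under DIR have more than one hard link.
# (Hypothetical helper name, for illustration only.)
count_shared_files() {
    # -links +1 matches files whose link count is greater than 1;
    # tr strips the padding that some wc implementations add.
    find "$1" -type f -links +1 | wc -l | tr -d ' '
}

# Example (the path is an assumption; adjust it to your install):
#   count_shared_files "$HOME/miniconda3/envs/myproject"
# The package cache location(s) can be listed with:
#   conda config --show pkgs_dirs
```

A count near zero in a large env would suggest the files were copied rather than linked, for example because the env and the cache live on different filesystems.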

Potential Workflow

One could, in theory, use Conda's env cloning to maximize package version synchronization. To this end, you could conceptually organize your envs into three categories:

base env: only used for core infrastructure, e.g., conda, jupyter, git, etc. You would freely update this whenever you want new command-line software or need to conda update conda. It should have little to no overlap with other envs.
template env: centralizes common sets of packages, typically grouped by version restrictions. For example, one might have a py27-tmpl, py36-tmpl, and py37-tmpl for different versions of Python that you might require for different projects. Here you would install the greatest common subset of packages you require across projects. The main purpose of a template env would be to make a...
project env: associated with a specific development project, and derived initially as a clone of a template env. Most of the core software in these would come from the template, and then additional software should be installed here. Once you start one for a project, you keep it relatively fixed, in order to maintain development stability.
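
The template and project tiers described above could be set up roughly as follows. This is a sketch, not a prescribed procedure: the function names make_template and new_project, the env names, and the package list (numpy, pandas) are hypothetical placeholders.

```shell
# make_template NAME PYVER
# Create a template env holding the packages common to several projects.
# numpy and pandas are placeholder packages; substitute your own set.
make_template() {
    conda create --yes --name "$1" "python=$2" numpy pandas
}

# new_project TEMPLATE PROJECT
# Derive a project env as a clone of an existing template env.
new_project() {
    conda create --yes --name "$2" --clone "$1"
}

# Example:
#   make_template py37-tmpl 3.7
#   new_project py37-tmpl myproject
#   conda install --name myproject <project-specific packages>
```

Cloning starts the project env with the same package versions the template has at that moment; later changes to the template do not propagate to existing clones.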

Such a structure would maximize the reuse of existing package versions. Starting with Conda v4.7, the dependency solver defaults to a first-stage solve with an implicit --freeze-installed|--no-update-deps flag, which attempts to install the requested packages without having to change existing packages. If keeping synchronized with the template env is your goal, then you may want to always use the --freeze-installed flag when installing. One could also use package pinning, which explicitly prevents specified packages from upgrading away from the template. However, this could prevent some other packages from being installed at their latest versions.
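
As an illustration of pinning, Conda reads version pins from a file named pinned inside an env's conda-meta directory, one spec per line. The sketch below appends a spec to that file; the helper name pin_package, the example prefix, and the numpy version are hypothetical.

```shell
# pin_package ENV_PREFIX SPEC
# Append a pin spec (e.g. "numpy 1.16.*") to ENV_PREFIX/conda-meta/pinned.
# Refuses to run if ENV_PREFIX does not look like a Conda env.
pin_package() {
    if [ ! -d "$1/conda-meta" ]; then
        echo "not a conda env: $1" >&2
        return 1
    fi
    printf '%s\n' "$2" >> "$1/conda-meta/pinned"
}

# Example (prefix and version are assumptions):
#   pin_package "$HOME/miniconda3/envs/myproject" "numpy 1.16.*"
# Installing without changing already-installed packages:
#   conda install --name myproject --freeze-installed <package>
```

Pins are stricter than --freeze-installed: the flag applies to a single install command, while a pin constrains every later solve in that env until the line is removed.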

Unfortunately, you'd still run into a similar synchronization problem as you intuited: while you could update these template envs before making new clones, that won't update the ones previously derived from them. But for project envs, I think best practice would be not to manipulate them once you start working. If you're concerned with space, there's no substitute for getting your modular projects completed and then archiving and deleting project envs after use. That, and occasionally running conda clean.
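
Archiving and deleting a finished project env might look like the sketch below. The function name archive_env and the archive directory are hypothetical; the exported YAML file is what lets you recreate the env later with conda env create.

```shell
# archive_env NAME DIR
# Export the env's package list to DIR/NAME.yml, then remove the env.
# The env is removed only if the export succeeded; Conda may ask for
# confirmation before removing.
archive_env() {
    conda env export --name "$1" > "$2/$1.yml" &&
        conda env remove --name "$1"
}

# Example (the directory is an assumption):
#   archive_env myproject "$HOME/env-archive"
# Afterwards, drop cached packages and tarballs no env uses any more:
#   conda clean --all
```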

120

answered Oct 13 '22 00:10

merv

Tags: python, conda, anaconda, package-management
